Subject: character encoding


A new week and a new term from the XMLvoc PSI list!

character encoding


1. A character encoding form plus byte serialization. There are seven 
character encoding schemes in Unicode: UTF-8, UTF-16, UTF-16BE, 
UTF-16LE, UTF-32, UTF-32BE and UTF-32LE. (From the Unicode Consortium 
definition of "character encoding scheme"; the same glossary defines 
"character encoding form" as: Mapping from a character set definition 
to the actual code units used to represent the data.)
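
To make definition #1 concrete, here is a minimal Python sketch (not 
part of the Unicode text) that serializes one character under each of 
the seven schemes. The codec names are Python's spellings of the scheme 
names; SCHEMES and serialize are names made up for this illustration. 
Python's plain "utf-16" and "utf-32" codecs write a byte order mark and 
use the platform's byte order, so those two results can differ between 
machines.

```python
# Illustrative sketch: one character, seven byte serializations.
# SCHEMES and serialize() are hypothetical names, not from any glossary.
SCHEMES = [
    "utf-8",
    "utf-16", "utf-16-be", "utf-16-le",
    "utf-32", "utf-32-be", "utf-32-le",
]


def serialize(text):
    """Return a dict mapping each scheme name to the encoded bytes."""
    return {name: text.encode(name) for name in SCHEMES}


if __name__ == "__main__":
    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    for name, data in serialize("\u00e9").items():
        print(name, data.hex())
```

The BE and LE variants fix the byte order and carry no byte order mark, 
which is why they are counted as separate schemes from plain UTF-16 and 
UTF-32.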

FOLDOC reports (in part):

2. <character 
<http://wombat.doc.ic.ac.uk/foldoc/contents/character.html>> (Or 
"character encoding scheme") A mapping of binary 
<http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?binary> values to code 
positions <http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?code+positions> 
and back; generally a 1:1 (bijective 
<http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?bijective>) mapping.
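
As a rough illustration of the "and back" part of definition #2, the 
Python sketch below maps bytes to code positions and then maps the code 
positions back to bytes. It assumes UTF-8 and well-formed input; the 
function names to_code_points and from_code_points are invented for 
this example.

```python
# Illustrative sketch of the two directions of the mapping, assuming UTF-8.
# Both function names are hypothetical.

def to_code_points(data, scheme="utf-8"):
    """Decode bytes and return the code positions as a list of ints."""
    return [ord(ch) for ch in data.decode(scheme)]


def from_code_points(points, scheme="utf-8"):
    """Encode a list of code positions back into bytes."""
    return "".join(chr(p) for p in points).encode(scheme)


if __name__ == "__main__":
    data = b"\xc3\xa9\xe2\x82\xac"      # U+00E9 followed by U+20AC
    points = to_code_points(data)
    print([hex(p) for p in points])
    print(from_code_points(points) == data)
```

For well-formed UTF-8 the round trip returns the original bytes, which 
is the 1:1 behavior the definition describes.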

Wilde's glossary reports (for character encoding scheme):

3. A CES defines a mapping from a given set of characters (the CCS 
<http://wildesweb.com/glossary/ccs>) to encoded forms of the characters. 
Thus, there can be multiple CESs for one CCS.

(CCS = Coded Character Set: A CCS identifies a set of characters that 
are relevant and should be identifiable for some Character Set 
<http://wildesweb.com/glossary/characterset>. It does not, however, 
specify the actual encoding of these characters, which is done by one or 
more CESs <http://wildesweb.com/glossary/ces>.)

TechEncyclopedia: no results

Personally I like #1: we are using Unicode, and it seems presumptuous 
at best to redefine terms that another standards body has already 
defined. Perhaps we should rename the entry to "character encoding 
scheme" so that it matches the Unicode glossary, and simply use their 
definition.

Hope everyone is having a great day!


Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Co-Editor, ISO 13250, Topic Maps -- Reference Model

