bdxr message



Subject: [OASIS Issue Tracker] (BDXR-22) Case sensitivity of string "UTF-8"


Pim van der Eijk created BDXR-22:
------------------------------------

             Summary: Case sensitivity of string "UTF-8" 
                 Key: BDXR-22
                 URL: https://issues.oasis-open.org/browse/BDXR-22
             Project: OASIS Business Document Exchange (BDXR) TC
          Issue Type: Bug
          Components: Documentation
    Affects Versions: SMP 1.0
            Reporter: Pim van der Eijk
            Priority: Minor


Section 3.3 of SMP states:

XML documents returned by HTTP GET MUST be well-formed according to [XML 1.0] and MUST be UTF-8 encoded ([Unicode]). They MUST contain an XML declaration starting with “<?xml” which includes the «encoding» attribute set to “UTF-8”.

This can be interpreted as implying that using the lower case string "utf-8" for the encoding would be incorrect.  There are a number of problems with this:

1)  All examples in the spec use "utf-8".  While it is true that the examples are marked as non-normative,  one would expect them to be consistent with the spec.

2) XML 1.0 states that XML processors SHOULD match character encoding names in a case-insensitive way.  

3) The IANA character set repository states that "character set names may be up to 40 characters taken from the printable characters of US-ASCII.  However, no distinction is made between use of upper and lower case letters."
https://www.iana.org/assignments/character-sets/character-sets.xhtml 

4) If no encoding is specified, XML 1.0 assumes UTF-8 encoding. The attribute is only relevant if some other encoding (like UTF-16) is used.

5)  XML has been around for two decades.  I doubt that any of the current versions of commonly used XML libraries would break if the non-all-uppercase variant is used.
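
Points 2, 3 and 5 can be spot-checked against one widely used library. The following is a minimal sketch using only the Python standard library; the helper names parse_with_declared_encoding and same_charset are made up for this illustration, and the result says nothing about other XML stacks.

```python
import codecs
import xml.etree.ElementTree as ET


def parse_with_declared_encoding(label: str) -> str:
    """Parse a UTF-8 encoded document whose XML declaration uses `label`."""
    # The bytes on the wire are always UTF-8; only the spelling of the
    # encoding name in the declaration varies.
    doc = f'<?xml version="1.0" encoding="{label}"?><root>\u00e9</root>'
    return ET.fromstring(doc.encode("utf-8")).text


def same_charset(a: str, b: str) -> bool:
    """Compare two charset labels the way Python's codec registry does."""
    # codecs.lookup() normalises the label before looking it up, so the
    # comparison ignores upper/lower case.
    return codecs.lookup(a).name == codecs.lookup(b).name


if __name__ == "__main__":
    for label in ("UTF-8", "utf-8"):
        print(label, "->", parse_with_declared_encoding(label))
    print(same_charset("UTF-8", "utf-8"))
```

Both spellings parse to the same text content here, which is consistent with the expectation in point 5 for this particular parser.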

Conventional wisdom on the Internet suggests that the uppercase variant is preferred, because XML 1.0 only says that processors SHOULD (not MUST) match encoding names case-insensitively, but that both variants are allowed.
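
To make the two possible readings of Section 3.3 concrete, here is a rough sketch of how a conformance check might differ under each. The function names (declared_encoding, conforms_strict, conforms_lenient) and the simplified regular expression are assumptions made for illustration only; they are not taken from the SMP specification or from any existing SMP implementation.

```python
import re

# Simplified match for an XML declaration that carries an encoding
# pseudo-attribute. The encoding name follows the EncName production
# of XML 1.0: a letter followed by letters, digits, '.', '_' or '-'.
_XML_DECL = re.compile(
    r"""^<\?xml\s+version\s*=\s*(["'])1\.[0-9]+\1"""
    r"""\s+encoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._-]*)\2"""
)


def declared_encoding(document: str):
    """Return the declared encoding name, or None if there is none."""
    match = _XML_DECL.match(document)
    return match.group(3) if match else None


def conforms_strict(document: str) -> bool:
    """Strict reading: the name must be exactly the string "UTF-8"."""
    return declared_encoding(document) == "UTF-8"


def conforms_lenient(document: str) -> bool:
    """Lenient reading: any upper/lower case spelling of UTF-8 is accepted."""
    name = declared_encoding(document)
    return name is not None and name.upper() == "UTF-8"


if __name__ == "__main__":
    example = '<?xml version="1.0" encoding="utf-8"?><ServiceGroup/>'
    print(conforms_strict(example), conforms_lenient(example))
```

Under the strict reading, a document declared the way the spec's own examples are would be rejected; under the lenient reading it would be accepted.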






