OASIS Mailing List Archives

sarif message


Subject: SARIF must be UTF-8

We’ve previously discussed whether SARIF has to be encoded in UTF-8. The answer, it turns out, is “yes.”


As Stefan has correctly pointed out, the ECMA 404 JSON spec does not specify a text encoding for JSON. However, it does say the following:


It is expected that other standards will refer to this one, strictly adhering to the JSON syntax, while imposing semantics interpretation and restrictions on various encoding details. Such standards may require specific behaviours. JSON itself specifies no behaviour.


But RFC 8259, “The JavaScript Object Notation (JSON) Data Interchange Format,” says this:


   JSON text exchanged between systems that are not part of a closed
   ecosystem MUST be encoded using UTF-8 [RFC3629].

   Previous specifications of JSON have not required the use of UTF-8
   when transmitting JSON text.  However, the vast majority of JSON-
   based software implementations have chosen to use the UTF-8 encoding,
   to the extent that it is the only encoding that achieves
   interoperability.



Since SARIF documents are JSON text that is exchanged between systems that are not part of a closed ecosystem, RFC 8259 requires them to be encoded in UTF-8.
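
To make the requirement concrete, here is a minimal Python sketch of how a producer and a consumer might honor it. The function names parse_sarif_bytes and serialize_sarif are hypothetical, and the sample object in the usage note is only an illustration, not a valid SARIF log. This is one possible approach under those assumptions, not something the spec prescribes.

```python
import json


def parse_sarif_bytes(raw: bytes) -> dict:
    """Decode raw bytes strictly as UTF-8, then parse the text as JSON.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8, and
    json.JSONDecodeError if the decoded text is not valid JSON.
    """
    # errors="strict" rejects invalid byte sequences instead of
    # silently replacing them.
    text = raw.decode("utf-8", errors="strict")
    return json.loads(text)


def serialize_sarif(log: dict) -> bytes:
    """Serialize a log object to UTF-8 encoded JSON bytes."""
    # ensure_ascii=False keeps non-ASCII characters literal; the
    # explicit encode step is what makes the output UTF-8.
    return json.dumps(log, ensure_ascii=False).encode("utf-8")
```

A consumer written this way would accept the bytes produced by serialize_sarif({"note": "café"}) and would reject the same text encoded as Latin-1 or UTF-16 with a UnicodeDecodeError rather than guessing at the encoding.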


Even if RFC 8259 did not exist, the language in ECMA 404 means that SARIF, as a standard that refers to ECMA 404, would be entitled to require a specific encoding. But the existence of RFC 8259 makes it clear that we should do that.


This afternoon I’m writing a change draft to address a cluster of encoding-related issues (#76, #97, and #98). I will take the opportunity to add this requirement to the spec.



