OASIS Mailing List Archives



csaf message


Subject: Re: [csaf] [CSAF JSON Schema] Follow-on discussion about language support

Several further notes on localization of CSAF JSON documents.

There is a fairly long discussion over at JSON Schema about how to allow for localization of JSON Schema itself.

That conversation references JSON-LD localization:

It seems that the JSON community as a whole does not have one generalized answer to this question.

While I agree with Allan's point that translation could happen simultaneously with the release of a CSAF document, in most workflows it is likely to come later. For example, if we get buy-in from open source projects to share vulnerability information via CSAF, then many of those projects will have the translations for individual languages come in one by one, after the fact. At least where I work, we would not delay release of vulnerability information just to have translations.

If a company publishes a CSAF document, my guess is that we as a community don't want that company to re-release that file unless it is an update to information about the vulnerability, not just a new translation.

My take-away: even if we allow for multiple translations in the CSAF JSON format directly, it feels to me like we should also define how the translations can be accomplished in a separate file.


On Mon, May 7, 2018 at 8:22 PM, Allan Thomson <athomson@lookingglasscyber.com> wrote:

Eric –


  1. I disagree that a translation will take place after the original vulnerability was defined. It's entirely possible that orgs publishing vulnerabilities are capable of producing the text in multiple languages simultaneously.
  2. The exact requirements you are listing for localization for CSAF were (pretty much) the same set of requirements that STIXv2 had/has.


I suggest not inventing a new mechanism and instead considering the approach already shared in a prior email regarding how it was done in STIXv2.


Vendors and orgs that will support both STIXv2 and CSAF will appreciate this consistency.




From: <csaf@lists.oasis-open.org> on behalf of Eric Johnson <eric@tibco.com>
Date: Monday, May 7, 2018 at 11:36 AM
To: "csaf@lists.oasis-open.org" <csaf@lists.oasis-open.org>
Subject: [csaf] [CSAF JSON Schema] Follow-on discussion about language support


Several further comments about language support in the schema.


I believe we should state the language in the document in some form. In my original email, I gave option (A), which presumes the question is simply out-of-scope. Nobody else has come forward in furtherance of that approach, so I suspect we should discard that option. However, I do think it is reasonable to state a default value for a language choice - so the schema may relay the default value, even if the value does not appear in the instance.


For completeness, here's where I see translation actually being a question:

  • /vulnerabilities[]/acknowledgments[]/description
  • /vulnerabilities[]/involvements[]/description
  • /vulnerabilities[]/notes[]
  • /vulnerabilities[]/remediations[]/description
  • /vulnerabilities[]/threats[]/description
  • /vulnerabilities[]/title
  • /document_title
  • /document_distribution
  • /document/notes[]
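
To make the list above concrete, here is a minimal Python sketch of how a tool might collect the translatable values from a parsed CSAF JSON document. It assumes that "[]" in a path means "every element of that array" and that missing members are simply skipped. The function names (expand, translatable_values) and the sample document are hypothetical, not part of any CSAF tooling.

```python
from __future__ import annotations

from typing import Any, Iterator

# Path patterns copied from the list above; "[]" means "every array element".
TRANSLATABLE_PATHS = [
    "/vulnerabilities[]/acknowledgments[]/description",
    "/vulnerabilities[]/involvements[]/description",
    "/vulnerabilities[]/notes[]",
    "/vulnerabilities[]/remediations[]/description",
    "/vulnerabilities[]/threats[]/description",
    "/vulnerabilities[]/title",
    "/document_title",
    "/document_distribution",
    "/document/notes[]",
]


def expand(node: Any, segments: list[str], prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (concrete_path, value) for every match of the pattern under node."""
    if not segments:
        yield prefix, node
        return
    head, rest = segments[0], segments[1:]
    is_array = head.endswith("[]")
    key = head[:-2] if is_array else head
    if not isinstance(node, dict) or key not in node:
        return  # members that are absent are skipped
    child = node[key]
    if is_array:
        if not isinstance(child, list):
            return
        for index, item in enumerate(child):
            yield from expand(item, rest, f"{prefix}/{key}/{index}")
    else:
        yield from expand(child, rest, f"{prefix}/{key}")


def translatable_values(doc: dict) -> Iterator[tuple[str, Any]]:
    """Yield every translatable value in doc with its concrete path."""
    for pattern in TRANSLATABLE_PATHS:
        yield from expand(doc, pattern.strip("/").split("/"))


# Hypothetical sample document, only for illustration.
sample = {
    "document_title": "Example advisory",
    "vulnerabilities": [
        {"title": "Buffer overflow", "threats": [{"description": "Remote attack"}]},
        {"title": "Weak default password"},
    ],
}
found = dict(translatable_values(sample))
```

Note that the concrete paths this produces contain array indices, which matters for the question of how translations refer back to the original text.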

Based on what I see and understand, I suspect that nothing in the "product_tree" should be localized. Anyone disagree?


As for performing localization, I come down on the side of localization data being outside of the CSAF document. I think this makes sense simply because it has a different life-cycle from the vulnerability data in the existing documents. That is, translations of vulnerability descriptions and notes about vulnerabilities will not be available at the same time as the vulnerabilities themselves. Translations are also likely to be incomplete, as not all information might be translated to all languages. However, it behooves us to specify how it works, so that we can guarantee localization is handled consistently.


The requirements for localization, that I can put my finger on:

  • Provide the ability to translate any of the relevant strings (enumerated above)
  • Allow the life-cycle of translations to be different from the vulnerability document itself
  • Allow for translation to arbitrary languages
  • Allow for third-parties to provide translations
  • Tool / process / data format / specification defined for validating the translation of a CSAF document in a new language. This may include generating a new version of the CSAF document in the translated form.
  • Work well with the current ecosystem of translation tooling.

Any other requirements? (I confess I've not been involved in localization in a long time, so I suspect I'm missing something.)



To externalize the translation implies one of two approaches:

  • Use a pointer into the document to identify a specific translation. Obvious candidate for this is JSON Pointer.
  • Labels in the document identifying a thing to be translated - those could be referenced outside the document. There are two options for this:
    • The text itself that needs to be localized (could be long!)
    • An extra property that establishes a label for the data to be localized.
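
One practical consideration with the pointer approach is that a JSON Pointer (RFC 6901) into an array member carries the array index. The following Python sketch illustrates this under stated assumptions: resolve_pointer is a small hand-written resolver rather than a library call, and the "text" member of a note is a hypothetical name chosen for the example.

```python
import copy


def resolve_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data."""
    if pointer == "":
        return doc
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = doc
    for token in pointer[1:].split("/"):
        # RFC 6901 unescaping: "~1" becomes "/", then "~0" becomes "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node


# Hypothetical document; "text" is an assumed member name.
original = {
    "vulnerabilities": [
        {"title": "Buffer overflow", "notes": [{"text": "Affects the parser."}]}
    ]
}
pointer = "/vulnerabilities/0/notes/0/text"
before = resolve_pointer(original, pointer)

# A later revision of the document inserts a new note ahead of the old one.
revised = copy.deepcopy(original)
revised["vulnerabilities"][0]["notes"].insert(0, {"text": "Workaround available."})
after = resolve_pointer(revised, pointer)

# The same pointer now identifies a different string.
print(before)
print(after)
```

A translation keyed by that pointer would silently attach to the wrong note after the revision.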

My recommendations:

  • Separate document for translation data (format and contents defined by TC).
  • Use the "label" approach to associate translation with the original text. (Using the text itself is prone to breakage. Using JSON pointer means accepting array indices in the pointer, which is also prone to breaking.)
  • Define a property in the schema that puts a label near what needs to be translated, so that translation can refer to those labels. To make that even better, define defaults - easy ones like "document_title", but also "vul-CVE-2018-XXXX-note-description", "vul-CVE-2018-XXXX-note-summary", where the latter can work if the defaults are unique. Where the labels are not unique, or would require defaulting to array indices, require the label property to have a unique value within the document.
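
As a rough illustration of these recommendations, here is a Python sketch of applying a separate translation document by label. Everything about the data layout is an assumption made for the example: the label property name "l10n_label", the "text" member, and the flat label-to-string translation mapping are all hypothetical, since the format would be defined by the TC. Strings without a translation are left in the original language, which accommodates incomplete translations.

```python
import copy
from collections import Counter

LABEL_KEY = "l10n_label"  # hypothetical name for the label property
TEXT_KEY = "text"         # hypothetical name for the member holding the string


def collect_labels(node):
    """Return every explicit label found in the document, in document order."""
    labels = []
    if isinstance(node, dict):
        if isinstance(node.get(LABEL_KEY), str):
            labels.append(node[LABEL_KEY])
        for value in node.values():
            labels.extend(collect_labels(value))
    elif isinstance(node, list):
        for item in node:
            labels.extend(collect_labels(item))
    return labels


def duplicate_labels(doc):
    """Labels must be unique within the document; return any that are not."""
    return sorted(label for label, n in Counter(collect_labels(doc)).items() if n > 1)


def apply_translations(doc, translations):
    """Return a translated copy of doc; untranslated strings stay unchanged."""
    result = copy.deepcopy(doc)
    # Default label: the top-level title is addressed as "document_title".
    if "document_title" in result and "document_title" in translations:
        result["document_title"] = translations["document_title"]

    def walk(node):
        if isinstance(node, dict):
            label = node.get(LABEL_KEY)
            if isinstance(label, str) and label in translations and TEXT_KEY in node:
                node[TEXT_KEY] = translations[label]
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(result)
    return result


# Hypothetical advisory and a partial German translation document.
advisory = {
    "document_title": "Example advisory",
    "vulnerabilities": [{
        "title": "Buffer overflow",
        "notes": [
            {LABEL_KEY: "vul-CVE-2018-XXXX-note-summary", TEXT_KEY: "Affects the parser."},
            {LABEL_KEY: "vul-CVE-2018-XXXX-note-description",
             TEXT_KEY: "Crafted input overflows a buffer."},
        ],
    }],
}
german = {
    "document_title": "Beispielhinweis",
    "vul-CVE-2018-XXXX-note-summary": "Betrifft den Parser.",
}

assert duplicate_labels(advisory) == []
translated = apply_translations(advisory, german)
```

Because the lookup is by label rather than by position, reordering or inserting notes in a later revision of the advisory does not change which string a translation applies to.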




