Re: [xliff-omos] Modules and extensions

Namespaces are pre-defined and reserved (in the proposed/draft standard), as you state, so there is no need for an “in-flight” JSON mapping table in the JLIFF doc if each namespace has a unique short id such as “gls” that can be used directly in a property name, e.g. gls_glossary. The prefix bindings are registered elsewhere (not as tables in the JLIFF doc).

As for extensions, while they are not predictable, we also have to remember that they may be registered too, so they should have some sort of predictability too.

Perhaps extensions can use FQNames (without a table).

JSON APIs can ignore unsupported extensions. Extensions of the form “URI name” are very easy to identify.
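
A minimal sketch of how a consumer might ignore unsupported extensions under this naming scheme, written in Python. The helper names (split_uri_name, strip_unsupported), the set of supported URIs, and the example.com extension URI are hypothetical and only serve the illustration; nothing here is part of JLIFF.

```python
from typing import Any, Optional, Tuple


def split_uri_name(key: str) -> Optional[Tuple[str, str]]:
    """Return (uri, name) if key has the form "URI name", else None.

    Assumption: a qualified key contains a space, and the part before
    the last space contains a colon (as any absolute URI does).
    """
    uri, sep, name = key.rpartition(" ")
    if sep and name and ":" in uri:
        return uri, name
    return None


def strip_unsupported(node: Any, supported_uris: set) -> Any:
    """Recursively drop "URI name" properties whose URI is not supported."""
    if isinstance(node, dict):
        kept = {}
        for key, value in node.items():
            qualified = split_uri_name(key)
            if qualified is not None and qualified[0] not in supported_uris:
                continue  # unknown extension: skip the whole subtree
            kept[key] = strip_unsupported(value, supported_uris)
        return kept
    if isinstance(node, list):
        return [strip_unsupported(item, supported_uris) for item in node]
    return node
```

Unqualified properties pass through untouched, so a consumer that knows no extensions at all can call strip_unsupported(doc, set()) and still process the core data.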

Summary

1. Modules use unique pre-defined prefixes for property names of the form “prefix_name”, with “prefix” registered. The benefit is that “prefix_name” is easy to parse and process by JSON APIs.

2. Non-registered extensions may use unique URIs for property names of the form “URI name”. “URI name” is extensible and the naming is unambiguous (assuming a non-URI separator such as spacing is used).
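
The two naming forms in the summary can be told apart mechanically. The following Python sketch shows one way to do it, under stated assumptions: the MODULE_PREFIXES registry is a hypothetical stand-in for wherever the prefixes end up being registered, the namespace URI in it is the one implied by the glossary examples in this thread, the separator is assumed to be "_", and the example.com URI is invented.

```python
# Hypothetical registry of reserved module prefixes (assumption: the real
# registry would live in the specification, not in the JLIFF document).
MODULE_PREFIXES = {"gls": "urn:oasis:names:tc:xliff:glossary:2.0"}


def classify(key):
    """Return (kind, namespace, local_name) for a JSON property name."""
    # Form 2: "URI name", a URI and a name separated by the last space.
    uri, sep, name = key.rpartition(" ")
    if sep and name and ":" in uri:
        return ("extension", uri, name)
    # Form 1: "prefix_name" with a registered prefix.
    prefix, sep, name = key.partition("_")
    if sep and prefix in MODULE_PREFIXES:
        return ("module", MODULE_PREFIXES[prefix], name)
    # Anything else is treated as an unqualified (core) property.
    return ("core", None, key)
```

For instance, classify("gls_glossary") yields the module namespace and the local name "glossary", while a key with an unregistered prefix falls through to "core" with its name unchanged.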

Dr. Robert van Engelen, CEO/CTO Genivia Inc.
voice: (850) 270 6179 ext 104
fax: (850) 270 6179
mobile: (850) 264 2676
engelen@genivia.com

On May 21, 2017, at 7:05 PM, Yves Savourel <ysavourel@enlaso.com> wrote:

> Perhaps the downside is that JSON rendition with fully-qualified names is more verbose than XML QNames. I suppose fully-qualified names will not be prevalent in JLIFF so JLIFF fragments do not blow up in size compared to XML, but could be wrong.

I’m afraid there are quite a few elements/attributes with namespaces in XLIFF.
Using the fully qualified names would probably be a hindrance in some cases (i.e. JavaScript).

Maybe the prefixes might not be so bad: the prefixes for the modules are pre-defined and reserved, so for a glossary gls_glossary (with _ or whatever separator decided upon) would not be confused with another element in another namespace. Essentially the mapping table would be to get the version of the module.
As for extensions, while they are not predictable, we also have to remember that they may be registered too, so they should have some sort of predictability too.

Cheers,
-yves

From: Robert van Engelen [mailto:engelen@genivia.com]
Sent: Friday, May 12, 2017 9:45 AM
To: Chase Tingley <chase@spartansoftwareinc.com>
Cc: Phil Ritchie <phil.ritchie@vistatec.com>; Yves Savourel <ysavourel@enlaso.com>; XLIFF OMOS TC <xliff-omos@lists.oasis-open.org>
Subject: Re: [xliff-omos] Modules and extensions

The more I think about it, the more worried I get about working with serialization frameworks if we do some sort of homegrown namespace resolution system.

I share your concerns. Also, I believe JSON-LD has disadvantages with respect to serialization frameworks. And introducing our own prefix-URI binding mechanism to define XML-like QNames is not a good way to support existing frameworks.

RFC3986 https://tools.ietf.org/html/rfc3986 section 2 says:

"A URI is composed from a limited set of characters consisting of
digits, letters, and a few graphic symbols. A reserved subset of
those characters may be used to delimit syntax components within a
URI while the remaining characters, including both the unreserved set
and those reserved characters not acting as delimiters, define each
component's identifying data."

Perhaps an ad-hoc name qualification mechanism can be used. Since JSON property names are strings, URIs can be easily embedded without breaking JSON parsers.

For example, the space character U+0020 suffices to separate a URI from an unqualified name in a string that contains both:

"urn:oasis:names:tc:xliff:glossary:2.0:glossary glossary"
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^
 URI                                            name

JSON property names are strings, so this is perfectly valid. Using the space character may also be more visually appealing than a graphically rendered character such as ^ (hat). I also suspect that spaces are forbidden in JLIFF names, which would prevent any ambiguity about the placement of the space.
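
To illustrate the point, here is a small Python sketch that builds such a property name, round-trips it through a standard JSON parser, and splits it again. The helper names qualify and split_qualified are hypothetical. The split relies on two assumptions: a literal space cannot appear in a URI (RFC 3986 requires it to be percent-encoded), and names contain no spaces either, as suspected above.

```python
import json


def qualify(uri, name):
    """Join a URI and an unqualified name with a single U+0020."""
    if " " in uri or " " in name:
        raise ValueError("neither the URI nor the name may contain a space")
    return f"{uri} {name}"


def split_qualified(key):
    """Split "URI name" at the space; reject keys without one."""
    uri, sep, name = key.partition(" ")
    if not sep:
        raise ValueError(f"not a qualified name: {key!r}")
    return uri, name


key = qualify("urn:oasis:names:tc:xliff:glossary:2.0:glossary", "glossary")

# A stock JSON library serializes and parses the key like any other string.
text = json.dumps({key: []})
restored = json.loads(text)
```

After the round trip, restored still contains the single key shown in the example above, and split_qualified(key) returns the URI and the name separately.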

Perhaps the downside is that JSON rendition with fully-qualified names is more verbose than XML QNames. I suppose fully-qualified names will not be prevalent in JLIFF so JLIFF fragments do not blow up in size compared to XML, but could be wrong.

For hard-coded JLIFF serializers, the property names will be a bit more cumbersome to use as a consequence, e.g. obj["URI name"] compared to obj.name (and obj.prefix$name if homegrown prefixes are introduced).

Dr. Robert van Engelen, CEO/CTO Genivia Inc.

On May 10, 2017, at 7:47 PM, Chase Tingley <chase@spartansoftwareinc.com> wrote:

The more I think about it, the more worried I get about working with serialization frameworks if we do some sort of homegrown namespace resolution system.

For example, say we take the approach of listing IRIs at the top of the document, so we have something vaguely like:

{
  "prefixes": { "gls" : "urn:oasis:names:tc:xliff:glossary:2.0:glossary" },
  "units": [
    {
      "subunits": [ .... ],
      "gls:glossary": { ... }
    }
  ]
}

But this means that there is no fixed name for the glossary data. It could be "gls:glossary", or "my:glossary", or "foo:glossary", depending on how the prefix is defined. How easily can serialization libraries handle this? The standard approach for most of them is to declare bindings between node names and implementation classes, but with no fixed name, this doesn't work. With enough effort, in some libraries it's probably possible to (for example) inject a custom evaluator class into the mapping process to make these decisions on the fly based on what prefixes have been declared, but that's a lot to require of a correct implementation. It may not even be possible in some serialization libraries.
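
For a sense of what that extra effort looks like, below is a rough Python sketch of a pre-processing pass that rewrites whatever prefix a document declares into one fixed prefix before any binding takes place. It is only an illustration of the problem, not a proposal: the CANONICAL table and the function names are hypothetical, and the "prefixes" property follows the vague example above.

```python
# Hypothetical table mapping a namespace URI to the one prefix that the
# consumer's bindings were written against.
CANONICAL = {"urn:oasis:names:tc:xliff:glossary:2.0:glossary": "gls"}


def normalize(node, declared):
    """Rewrite "prefix:name" keys so known namespaces use a fixed prefix."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            prefix, sep, name = key.partition(":")
            if sep and prefix in declared:
                uri = declared[prefix]
                if uri in CANONICAL:
                    key = f"{CANONICAL[uri]}:{name}"
            out[key] = normalize(value, declared)
        return out
    if isinstance(node, list):
        return [normalize(item, declared) for item in node]
    return node


def load(doc):
    """Resolve the document's own prefix table, then drop the table."""
    declared = doc.get("prefixes", {})
    body = {k: v for k, v in doc.items() if k != "prefixes"}
    return normalize(body, declared)
```

With this pass, a document that declares "my" or "foo" for the glossary namespace ends up with "gls:glossary" either way, so fixed bindings can work again. The cost is that every consumer has to walk the whole tree before deserializing it.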

On Wed, May 10, 2017 at 6:56 AM, Phil Ritchie <phil.ritchie@vistatec.com> wrote:

I really don’t like the idea, but for the sake of at least investigating the possibility, it is surprising to me that the JSON serialization library I am using (Newtonsoft) in .NET will read and write the full urn version below.

Phil

Phil Ritchie
Chief Technology Officer | Vistatec
Vistatec House, 700 South Circular Road, Kilmainham, Dublin 8, Ireland.
Tel: +353 1 416 8000 | Direct: +353 1 416 8024
Email: phil.ritchie@vistatec.com
www.vistatec.com | ISO 9001 | ISO 13485 | ISO 17100

From: xliff-omos@lists.oasis-open.org [mailto:xliff-omos@lists.oasis-open.org] On Behalf Of Yves Savourel
Sent: 04 May 2017 03:11
To: 'XLIFF OMOS TC' <xliff-omos@lists.oasis-open.org>
Subject: [xliff-omos] Modules and extensions

Hi all,

Implementing the Glossary module for JLIFF made me wonder about how we can have a common representation of modules (and extensions).
Some implementations will not support a module like Glossary, but they still have to read JLIFF input that will have glossary entries. How can we accommodate both?

Maybe we need to have the namespace identifier for the module/extension with the name, so we would have:

             "urn:oasis:names:tc:xliff:glossary:2.0:glossary": [
                {
                  "urn:oasis:names:tc:xliff:glossary:2.0:id": "ge1",
                  "urn:oasis:names:tc:xliff:glossary:2.0:term": {
                    "urn:oasis:names:tc:xliff:glossary:2.0:text": "Term text",
                    "urn:oasis:names:tc:xliff:glossary:2.0:source": "Term source"
                  }
                },
                {
                  "urn:oasis:names:tc:xliff:glossary:2.0:term": {
                    "urn:oasis:names:tc:xliff:glossary:2.0:text": "hot"
                  },
                  "urn:oasis:names:tc:xliff:glossary:2.0:translations": [
                    {
                      "urn:oasis:names:tc:xliff:glossary:2.0:text": "hyt",
                      "urn:oasis:names:tc:xliff:glossary:2.0:source": "Google"
                    }
                  ]
                }
              ]

Certainly not great.
Another way could be to use the prefixes:

              "gls:glossary": [
                {
                  "gls:id": "ge1",
                  "gls:term": {
                    "gls:text": "Term text",
                    "gls:source": "Term source"
                  }
                },
                {
                  "gls:term": {
                    "gls:text": "hot"
                  },
                  "gls:translations": [
                    {
                      "gls:text": "hyt",
                      "gls:source": "Google"
                    }
                  ]
                }
              ]

With somewhere in the JLIFF input a map of the namespace identifiers and their prefixes.
It’d be better, but still not great. For example I’m not sure what the ‘:’ in the names would do in JavaScript. And I’m not sure having to look up prefixes will be very easy.

Thoughts?

Thanks,
-yves

Yves Savourel
Localization Solutions Architect | ENLASO®
4888 Pearl East Circle | Suite 300E | Boulder | Colorado 80301
t: 303.945.3759 | f: 303.516.1701
An ISO 9001:2015 certified company
