xliff-omos message



Subject: Re: [xliff-omos] Modules and extensions


I’m afraid there are quite a few elements/attributes with namespaces in XLIFF.

Sure, but what I meant is a bit different: how often do JSON property names have to be fully-qualified names? 5%, 10% or 50% of the property names? Only property names associated with modules and extensions are FQnames, I suppose. In other words, how much JSON text would actually be saved by putting a mapping table in the JSON doc and using prefixes instead of “URI name”?

Maybe the prefixes might not be so bad: the prefixes for the modules are pre-defined and reserved, so for a glossary gls_glossary (with _ or whatever separator decided upon) would not be confused with another element in another namespace. Essentially the mapping table would be to get the version of the module.

IMHO this still adds the complexity of resolving prefixes to URIs using a mapping table. Implementing a non-standard namespace scope resolution algorithm is undesirable for various reasons, as was pointed out earlier.

Namespaces are pre-defined and reserved (in the proposed/draft standard), as you state, meaning there is no need for an “in-flight” JSON mapping table in the JLIFF doc if each namespace has a unique short id such as “gls” that can be used directly in a property name, e.g. gls_glossary. The prefix bindings are registered elsewhere (not as tables in the JLIFF doc).

As for extensions, while they are not predictable, we also have to remember that they may be registered too, so they should have some sort of predictability too.

Perhaps extensions can use FQNames (without a table).

JSON APIs can ignore unsupported extensions. Extensions of the form “URI name” are very easy to identify.

Summary

1. Modules use unique pre-defined prefixes for property names of the form “prefix_name”, with “prefix” registered. The benefit is that “prefix_name” is easy to parse and process by JSON APIs.

2. Non-registered extensions may use unique URIs for property names of the form “URI name”. “URI name” is extensible and the naming is unambiguous (assuming a separator that cannot occur in a URI, such as a space, is used).
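
A minimal sketch of how a consumer might handle the two forms in the summary, assuming “_” as the prefix separator and a single space as the URI separator. The registry contents, the function names and the example extension URI are hypothetical; only the “gls” prefix and the glossary namespace come from this thread.

```python
# Hypothetical registry of reserved module prefixes (assumption: only "gls"
# is discussed in this thread; a real registry would live in the spec).
REGISTERED_PREFIXES = {
    "gls": "urn:oasis:names:tc:xliff:glossary:2.0",
}


def classify(property_name):
    """Return (kind, namespace, local_name) for a JSON property name."""
    # Form 2: "URI name", where a single space separates URI and name.
    if " " in property_name:
        uri, _, local = property_name.partition(" ")
        return ("extension", uri, local)
    # Form 1: "prefix_name" with a registered prefix.
    prefix, sep, local = property_name.partition("_")
    if sep and prefix in REGISTERED_PREFIXES:
        return ("module", REGISTERED_PREFIXES[prefix], local)
    # Anything else is treated as an unqualified core name.
    return ("core", None, property_name)


def strip_unsupported(node, supported_namespaces):
    """Recursively drop module/extension properties in unsupported namespaces."""
    if isinstance(node, list):
        return [strip_unsupported(item, supported_namespaces) for item in node]
    if not isinstance(node, dict):
        return node
    kept = {}
    for key, value in node.items():
        kind, namespace, _ = classify(key)
        if kind != "core" and namespace not in supported_namespaces:
            continue  # unsupported module or extension: ignore it
        kept[key] = strip_unsupported(value, supported_namespaces)
    return kept


unit = {
    "id": "u1",
    "gls_glossary": [{"gls_id": "ge1"}],
    "http://example.com/ext myData": {"flag": True},  # made-up extension
}
result = strip_unsupported(unit, {"urn:oasis:names:tc:xliff:glossary:2.0"})
print(result)
```

No mapping table is read from the document itself here: the module prefixes are fixed ahead of time, and an unknown “URI name” property is recognizable from the space alone.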


  Dr. Robert van Engelen, CEO/CTO Genivia Inc.
  voice: (850) 270 6179 ext 104
  fax: (850) 270 6179
  mobile: (850) 264 2676
  engelen@genivia.com

On May 21, 2017, at 7:05 PM, Yves Savourel <ysavourel@enlaso.com> wrote:

 
I’m afraid there are quite a few elements/attributes with namespaces in XLIFF.
Using the fully qualified names would probably be a hindrance in some cases (e.g. JavaScript).
 
Maybe the prefixes might not be so bad: the prefixes for the modules are pre-defined and reserved, so for a glossary gls_glossary (with _ or whatever separator decided upon) would not be confused with another element in another namespace. Essentially the mapping table would be to get the version of the module.
As for extensions, while they are not predictable, we also have to remember that they may be registered too, so they should have some sort of predictability too.
 
Cheers,
-yves
 
From: Robert van Engelen [mailto:engelen@genivia.com] 
Sent: Friday, May 12, 2017 9:45 AM
To: Chase Tingley <chase@spartansoftwareinc.com>
Cc: Phil Ritchie <phil.ritchie@vistatec.com>; Yves Savourel <ysavourel@enlaso.com>; XLIFF OMOS TC <xliff-omos@lists.oasis-open.org>
Subject: Re: [xliff-omos] Modules and extensions
 
The more I think about it, the more worried I get about working with serialization frameworks if we do some sort of homegrown namespace resolution system.
 
I share your concerns. Also, I believe JSON-LD has disadvantages with respect to serialization frameworks. And introducing our own prefix-URI binding mechanism to define XML-like QNames is not a good way to support existing frameworks.
 
RFC 3986 (https://tools.ietf.org/html/rfc3986), section 2, says:
 
"A URI is composed from a limited set of characters consisting of
   digits, letters, and a few graphic symbols.  A reserved subset of
   those characters may be used to delimit syntax components within a
   URI while the remaining characters, including both the unreserved set
   and those reserved characters not acting as delimiters, define each
   component's identifying data."
 
Perhaps an ad-hoc name qualification mechanism can be used. Since JSON property names are strings, URIs can be easily embedded without breaking JSON parsers.
 
For example, the space character U+0020 suffices to separate a URI from an unqualified name in a string that contains both:
 
"urn:oasis:names:tc:xliff:glossary:2.0:glossary glossary"
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^
                      URI                         name
 
JSON property names are strings, so this is perfectly valid. Using the space character may also be more visually appealing than a graphically rendered character such as ^ (hat). I also suspect that spaces are forbidden in JLIFF names, which would prevent any ambiguity about where the separating space is.
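 
The split on U+0020 can be sketched as follows. This assumes, as suspected above, that neither the URI nor the JLIFF name may contain a space; the function name is made up for illustration.

```python
def split_qualified(property_name):
    """Split a "URI name" property name into (uri, name).

    Unqualified names come back as (None, name). Anything with more than
    one space, or an empty half, is rejected as ambiguous.
    """
    parts = property_name.split(" ")
    if len(parts) == 1:
        return (None, parts[0])
    if len(parts) == 2 and all(parts):
        return (parts[0], parts[1])
    raise ValueError(f"ambiguous qualified name: {property_name!r}")


qualified = split_qualified(
    "urn:oasis:names:tc:xliff:glossary:2.0:glossary glossary"
)
unqualified = split_qualified("subunits")
print(qualified)
print(unqualified)
```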
 
Perhaps the downside is that the JSON rendition with fully-qualified names is more verbose than XML QNames. I suppose fully-qualified names will not be prevalent in JLIFF, so JLIFF fragments should not blow up in size compared to XML, but I could be wrong.
 
For hard-coded JLIFF serializers, the property names will be a bit more cumbersome to use as a consequence, e.g. obj["URI name"] compared to obj.name (and obj.prefix$name if homegrown prefixes are introduced).
 
 
 
On May 10, 2017, at 7:47 PM, Chase Tingley <chase@spartansoftwareinc.com> wrote:
 
The more I think about it, the more worried I get about working with serialization frameworks if we do some sort of homegrown namespace resolution system.
 
For example, say we take the approach of listing IRIs at the top of the document, so we have something vaguely like:
 
{
   "prefixes": { "gls" : "urn:oasis:names:tc:xliff:glossary:2.0:glossary" },
   "units": [
      {
        "subunits": [ .... ],
        "gls:glossary": { ... }
      }
  ]
}
 
But this means that there is no fixed name for the glossary data.  It could be "gls:glossary", or "my:glossary", or "foo:glossary", depending on how the prefix is defined.  How easily can serialization libraries handle this?  The standard approach for most of them is to declare bindings between node names and implementation classes -- but with no fixed name, this doesn't work.  With enough effort, in some libraries it's probably possible to (for example) inject a custom evaluator class into the mapping process to make these decisions on the fly based on what prefixes have been declared, but that's a lot to ask as a requirement of a correct implementation.  It may not even be possible in some serialization libraries.
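 
To make the cost concrete, here is a rough sketch of the extra pass a reader would need before any fixed name-to-class binding could apply. It assumes the "prefixes" layout from the example above; the canonical key (the URI and the local name joined by a space) and the function names are arbitrary choices made for illustration.

```python
def expand_prefixes(node, prefixes):
    """Rewrite "prefix:name" keys to a fixed "URI name" key, recursively."""
    if isinstance(node, list):
        return [expand_prefixes(item, prefixes) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        prefix, sep, local = key.partition(":")
        if sep and prefix in prefixes:
            key = f"{prefixes[prefix]} {local}"
        out[key] = expand_prefixes(value, prefixes)
    return out


def normalize(doc):
    """Resolve a document's own prefix table, then drop the table."""
    prefixes = doc.get("prefixes", {})
    body = {k: v for k, v in doc.items() if k != "prefixes"}
    return expand_prefixes(body, prefixes)


GLS = "urn:oasis:names:tc:xliff:glossary:2.0:glossary"
doc_a = {"prefixes": {"gls": GLS}, "units": [{"gls:glossary": {}}]}
doc_b = {"prefixes": {"my": GLS}, "units": [{"my:glossary": {}}]}

# Both documents end up with the same fixed key only after this extra pass.
print(normalize(doc_a) == normalize(doc_b))
```

A library that binds property names to classes declaratively has no obvious place to run such a pass, which is the concern raised above.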
 
 
On Wed, May 10, 2017 at 6:56 AM, Phil Ritchie <phil.ritchie@vistatec.com> wrote:
 
I really don’t like the idea, but for the sake of at least investigating the possibility: it is surprising to me that the JSON serialization library I am using (Newtonsoft) in .NET will read and write the full urn version below.
 
Phil
 
 
Phil Ritchie
Chief Technology Officer | Vistatec
Vistatec House, 700 South Circular Road,
Kilmainham, Dublin 8, Ireland.
 
From: xliff-omos@lists.oasis-open.org [mailto:xliff-omos@lists.oasis-open.org] On Behalf Of Yves Savourel
Sent: 04 May 2017 03:11
To: 'XLIFF OMOS TC' <xliff-omos@lists.oasis-open.org>
Subject: [xliff-omos] Modules and extensions
 
Hi all,
 
Implementing the Glossary module for JLIFF made me wonder about how we can have a common representation of modules (and extensions).
Some implementations will not support a module like Glossary, but they still have to read JLIFF input that will have glossary entries. How can we accommodate both?
 
Maybe we need to have the namespace identifier for the module/extension with the name, so we would have:
 
             "urn:oasis:names:tc:xliff:glossary:2.0:glossary": [
                {
                  "urn:oasis:names:tc:xliff:glossary:2.0:id": "ge1",
                  "urn:oasis:names:tc:xliff:glossary:2.0:term": {
                    "urn:oasis:names:tc:xliff:glossary:2.0:text": "Term text",
                    "urn:oasis:names:tc:xliff:glossary:2.0:source": "Term source"
                  }
                },
                {
                  "urn:oasis:names:tc:xliff:glossary:2.0:term": {
                    "urn:oasis:names:tc:xliff:glossary:2.0:text": "hot"
                  },
                  "urn:oasis:names:tc:xliff:glossary:2.0:translations": [
                    {
                      "urn:oasis:names:tc:xliff:glossary:2.0:text": "hyt",
                      "urn:oasis:names:tc:xliff:glossary:2.0:source": "Google"
                    }
                  ]
                }
              ]
 
Certainly not great.
Another way could be to use the prefixes:
 
              "gls:glossary": [
                {
                  "gls:id": "ge1",
                  "gls:term": {
                    "gls:text": "Term text",
                    "gls:source": "Term source"
                  }
                },
                {
                  "gls:term": {
                    "gls:text": "hot"
                  },
                  "gls:translations": [
                    {
                      "gls:text": "hyt",
                      "gls:source": "Google"
                    }
                  ]
                }
              ]
 
This would come with a map of the namespace identifiers and their prefixes somewhere in the JLIFF input.
It’d be better, but still not great. For example, I’m not sure what the ‘:’ in the names would do in JavaScript. And I’m not sure having to look up prefixes will be very easy.
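 
For a rough sense of the size difference between the two renditions above, a small sketch like the following could be used. It rebuilds the same two-entry glossary with both naming schemes and compares the serialized lengths; the helper name and the shape of the prefix table are assumptions, and the numbers depend on the serializer's whitespace settings.

```python
import json

NS = "urn:oasis:names:tc:xliff:glossary:2.0"


def glossary(q):
    """Build the two-entry glossary from this message, naming keys via q()."""
    return {
        q("glossary"): [
            {
                q("id"): "ge1",
                q("term"): {q("text"): "Term text", q("source"): "Term source"},
            },
            {
                q("term"): {q("text"): "hot"},
                q("translations"): [{q("text"): "hyt", q("source"): "Google"}],
            },
        ]
    }


full = json.dumps(glossary(lambda name: f"{NS}:{name}"))
prefixed = json.dumps(glossary(lambda name: f"gls:{name}"))
# The prefixed form also needs a one-time table entry (hypothetical layout).
table = json.dumps({"prefixes": {"gls": NS}})

print("fully qualified:", len(full))
print("prefixed + table:", len(prefixed) + len(table))
```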
 
Thoughts?
 
Thanks,
-yves
 
Yves Savourel
Localization Solutions Architect | ENLASO®
4888 Pearl East Circle | Suite 300E | Boulder | Colorado 80301
t: 303.945.3759 | f: 303.516.1701
An ISO 9001:2015 certified company
 


