Hi,
To continue the discussion on root content meta-types in JLIFF JSON payloads (“fragment”, “units”, “groups”, etc.), I am growing more uncomfortable with an approach that requires explicit meta-type information in the payload, e.g. a “type” attribute, when the meta-type of the root content is already determined by the root-level property name that we choose for each of the possible roots.
For example, the “units” root is used in jliff-example4.json as follows:
{ "jliff": "2.1", "srcLang": "en", "trgLang": "de", "units": [ { "id": "u1", "subunits": [
Since “units” determines the root content of the payload, it is not clear why other meta-type information is necessary. Can someone clarify?
Note that the property names of the roots that we currently support in JLIFF schema 0.9.4 are:
- “files” contains an array of fragments
- “fragment” contains an object with id, original, skeleton, units and metadata
- “groups” contains an array of groups but this is not (yet) defined in JLIFF 0.9.4 (perhaps an oversight? Do we need this?)
- “units” contains an array of “unit” objects
- “subunits” contains an array of “subunit” objects
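To illustrate the point that the root property name alone identifies the root content, here is a minimal Python sketch. The function name `root_type` and the rule that exactly one root must be present are assumptions made for the illustration; they are not taken from the JLIFF schema.

```python
import json

# Root property names listed above for JLIFF schema 0.9.4.
ROOT_NAMES = ("files", "fragment", "groups", "units", "subunits")

def root_type(payload: dict) -> str:
    """Return the name of the single root property found in the payload.

    Hypothetical helper: raises ValueError when zero or several roots appear.
    """
    found = [name for name in ROOT_NAMES if name in payload]
    if len(found) != 1:
        raise ValueError(f"expected exactly one root property, found {found}")
    return found[0]

# A small payload shaped like the start of jliff-example4.json (contents shortened).
doc = json.loads(
    '{"jliff": "2.1", "srcLang": "en", "trgLang": "de",'
    ' "units": [{"id": "u1", "subunits": []}]}'
)
print(root_type(doc))
```

No separate “type” property is consulted; the presence of “units” at the root level is the only signal used.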
On a related note, since “files” is an array of fragments, we should perhaps remove the “fragment” root and use “files” with a singleton fragment instead. This improves the orthogonality of the model design: there would be only one way to include a single fragment instead of two.
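As a sketch of what this would mean for existing payloads, a “fragment” root could be rewritten mechanically as a one-element “files” array. The helper name `normalize_fragment` is hypothetical, and the example object is shortened.

```python
def normalize_fragment(payload: dict) -> dict:
    """Return a copy of the payload with a "fragment" root rewritten as "files".

    Payloads without a "fragment" root are returned unchanged.
    """
    if "fragment" not in payload:
        return payload
    out = {key: value for key, value in payload.items() if key != "fragment"}
    out["files"] = [payload["fragment"]]  # singleton array holding the fragment
    return out

single = {"jliff": "2.1", "fragment": {"id": "f1", "units": []}}
print(normalize_fragment(single))
```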
Cheers,
- Robert
Dr. Robert van Engelen, CEO/CTO Genivia Inc. voice: (850) 270 6179 ext 104 fax: (850) 270 6179 mobile: (850) 264 2676
OK, then it is now clear to me that I did not fully understand the nature/purpose/use case of different "root" types. :-(
Phil Ritchie | Chief Technology Officer | Vistatec
From: David Filip <david.filip@adaptcentre.ie>
Sent: 23 October 2017 02:00:33
To: Robert van Engelen
Cc: Phil Ritchie; XLIFF OMOS TC
Subject: Re: [xliff-omos] Changes to schema discussed in last meeting.
Hi Robert, Phil,
I agree with Phil that this is undesirable.
This type of repetition shouldn't be allowed.
This is very different from the OM and would not be interoperable with the XML pipeline. You'd need to expand it into multiple corresponding XLIFF files when going to XML. And I don't think we want to go there.
Each JLIFF should have just a single instance of root content.
The values governing what type this instance is should not repeat the names from the lower level, because the reason to introduce the root content characteristics was that the lower level content was ambiguous from the OM point of view.
So this is correct, except that the values governing the content selection should be, and should map, as follows:
liff -> content must be "files"
file/fragment -> content should be an unlimited choice group of "groups" and "units"
OR we could introduce "subfiles" (by analogy to "subunits", which can intermix segments and ignorables) that could intermix group and unit content objects.
group -> content should be an unlimited choice group of "groups" and "units"
OR we could introduce "subgroups" that could intermix group and unit content objects.
Now that I have written all of the above, it seems to me that we could and should get rid of the top-level content property by simply introducing more content types. This makes the content unambiguous from the OM and XLIFF points of view.
We'd have "files" which equals liff in OM, "subfiles" which equals file, "subgroups" which equals group. And "subunits" for unit.
"subfiles" and "subgroups" have exactly the same data model in JLIFF, which is fine because they have the same models in OM. We just name them differently to preserve the OM level, which is critical for switching pipelines.
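A rough Python sketch of how a consumer could recover the OM level from the content property name alone, assuming the proposed names were adopted. "subfiles" and "subgroups" are only proposals from this message, and the helper `om_level` is hypothetical.

```python
# Proposed content property names, mapped to the OM level of the object carrying them.
# "subfiles" and "subgroups" are proposals from this thread, not schema 0.9.4 names.
OM_LEVEL = {
    "files": "liff",
    "subfiles": "file",
    "subgroups": "group",
    "subunits": "unit",
}

def om_level(obj: dict) -> str:
    """Return the OM level implied by the object's single content property."""
    found = [name for name in OM_LEVEL if name in obj]
    if len(found) != 1:
        raise ValueError(f"expected exactly one content property, found {found}")
    return OM_LEVEL[found[0]]

# A file holding one group, which in turn holds one unit (contents shortened).
file_obj = {
    "id": "f1",
    "subfiles": [{"id": "g1", "subgroups": [{"id": "u1", "subunits": []}]}],
}
group_obj = file_obj["subfiles"][0]
unit_obj = group_obj["subgroups"][0]
print(om_level(file_obj), om_level(group_obj), om_level(unit_obj))
```

Although the "subfiles" and "subgroups" arrays hold the same kinds of children, the differing names let a converter tell a file from a group without any extra type property.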
Cheers dF