OASIS Mailing List Archives

 




Subject: Re: [xliff] Fragment Identification


Yves, inline

Dr. David Filip
=======================
LRC | CNGL | LT-Web | CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734


On Mon, Dec 16, 2013 at 11:22 PM, Yves Savourel <ysavourel@enlaso.com> wrote:
Hi David, Dave, all,

>> Mmm... the registration aspect is a bit problematic:
>> Aside from having to do the registration,
>
> The registration simply is that each module (eventually extension)
> that has ids must declare their 2-5 nmtoken that combined with some
> core syntactic device (such as / or ~ proposed by me or = proposed
> by you)

Declare where?
In their spec: each 2.0 module now has a section (as part of the solution that I have proposed) that contains its prefix.
I assume extensions will have their documentation shared at least among those who support them; it would be a good idea for them to have a section up front, similar to our module definitions, where they declare their namespace and prefix.

I'm Tool ABC performing a simple task of checking that all <mrk> elements with a reference actually point to something that exists,
That task is not simple, and you can actually only perform it for modules or extensions that you support.
or I'm Tool XYZ doing some clean-up and removing all annotations with references pointing to nothing.

I have this ref to look at: #f=f1/foo=23

- I can guess foo refers to a module or an extension,
- but how do I know the namespace of the elements where I need to search for id="23"?

You know that it is a module or extension on f1. You cannot be sure what the proper namespace is without supporting the foo module or extension.
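
To make the problem concrete, here is a minimal sketch of what a tool can and cannot decide from the reference alone. It assumes the "prefix=id" step syntax used in the example above, treats "f" and "u" as core selectors, and uses a made-up table of known module prefixes with placeholder namespace URIs; none of these names come from the spec.

```python
# Assumed core selectors (file, unit); illustrative only.
CORE_SELECTORS = {"f", "u"}

# Hypothetical table of module prefixes this tool supports.
# The namespace URIs are placeholders, not values from the spec.
KNOWN_MODULES = {
    "gls": "urn:example:glossary",
    "mtc": "urn:example:matches",
}


def split_ref(ref):
    """Split '#f=f1/foo=23' into [('f', 'f1'), ('foo', '23')]."""
    steps = []
    for part in ref.lstrip("#").split("/"):
        if part:  # skip the empty piece produced by a leading '/'
            prefix, _, value = part.partition("=")
            steps.append((prefix, value))
    return steps


def unresolved_prefixes(ref, known=KNOWN_MODULES):
    """Return the prefixes whose namespace this tool cannot determine."""
    return [
        prefix
        for prefix, _ in split_ref(ref)
        if prefix not in CORE_SELECTORS and prefix not in known
    ]
```

With this table, unresolved_prefixes("#f=f1/foo=23") reports "foo": the tool can tell that something on f1 is referenced, but not in which namespace to look for id="23".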


> For all published modules these prefixes are part of the spec, so
> no problem at all

- Tool XYZ is core only and doesn't know anything about modules (aside from the fact that they have a namespace URI that starts with a specific pattern).
- Tool ABC is based on specification 2.0 and doesn't know anything about the new 2.2 foo module.

Both have a problem.


If you know it is an extension, you can, if need be, kill it when it causes you trouble; that is the meaning of SHOULD preserve, i.e. preserve unless it's causing trouble.
We could add a PR saying that references to modules MUST be kept and references to extensions SHOULD be kept.
 

BTW both mtc and gls are now designed to reference using their own ref pointing to core only, so they are not likely to cause trouble 

>> there is the issue of how does a tool know which prefix corresponds
>> to which extension?
>
> Only people who support the extension will know that.
> Others will know that it is an extension prefix, because it
> complies with the extension prefix syntax, e.g.
> /foo or foo=
> And they will happily ignore it

As shown above Tool ABC and Tool XYZ cannot happily ignore it and do not know about the corresponding module/extension.
If the tool wants to perform the cleaning task on a 2.2 document, it should be aware of all 2.2 module prefixes. Otherwise it can only clean 2.0 files.
It can, if need be, kill extensions that are not registered by 2.2.



>> It has to somehow keep up with the registry. It's doable but it
>> starts to get very complicated.
>
> No need to keep registry of extensions, only module extensions
> are guaranteed to resolve 

I'm probably not understanding what you mean by "only module extensions are guaranteed to resolve".

Sorry, that is a typo. I mean module prefixes: module prefixes are guaranteed to resolve because they were published by the XLIFF TC authority.
If there is no registry for extensions,
I am not opposed to having a registry; such a registry would probably be part of maintaining the MIME type registration.
how do tools know which prefixes are used by other tools?
If there is no registry, you obviously need to know the extension specification.
How do they know which prefixes are used by newer modules that didn't exist in the spec that existed when they were built?
You obviously cannot support a 2.2 module if you are a 2.0 tool, core only or not.
If you want to be a Modifier capable of cleaning 2.2 files, you will need to update to 2.2 to be aware of all protected prefixes. 
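
A minimal sketch of that rule, assuming the suggested requirement (module references MUST be kept, extension references SHOULD be kept) were adopted. The prefix lists per version are hypothetical: only mtc and gls are mentioned in this thread, and "foo" stands for the made-up 2.2 module from the example.

```python
# Hypothetical module prefix lists per spec version, for illustration only.
MODULE_PREFIXES = {
    "2.0": {"mtc", "gls"},
    "2.2": {"mtc", "gls", "foo"},
}


def policy(prefix, doc_version, known=MODULE_PREFIXES):
    """Say what a cleaning tool may do with a reference it cannot resolve.

    `known` holds the prefix lists the tool was built with; a 2.0 tool
    would only have the "2.0" entry.
    """
    if doc_version not in known:
        # The tool cannot tell modules from extensions in this document.
        return "keep"
    if prefix in known[doc_version]:
        # Module reference: MUST be kept.
        return "keep"
    # Extension reference: SHOULD be kept, may be removed if it causes trouble.
    return "may-remove"
```

A 2.0 tool, modeled here as known={"2.0": {"mtc", "gls"}}, gets "keep" for every prefix in a 2.2 document, which is the conservative reading of "it can only clean 2.0 files".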



>> I was thinking about using the namespace prefix used for
>> the module/extension in that document.
>
> In my proposal I use the default module prefix as the nmtoken part
> of the module prefix

That was a good first step.
But the prefix used in the specification is not necessarily the prefix used in real XLIFF documents. Actually, nothing even prevents the same namespace from using different prefixes within the same document.
Yes, and that is why the fragment identification prefix is specified separately. 


> The scope is what the module extension id attribute defines
> the gls id scope is each <glossary> element 
> so locally you can reference #/gls~1
> This points to the <glossEntry> or <translation> with
> gls:id="1" in the same <unit>
> If you wanted to point to a <glossEntry> or <translation>
> in another unit, which I strongly believe should be forbidden
> (I only allowed the option in my proposal because everyone
> seemed to be eager to have it)* you would need to go
> #1~2~/gls~1
> This is pointing to <glossEntry> or <translation> with
> gls:id="1" within <unit> with  id="2", within <file>
> with id="1"

So #/f=1/u=2/gls=1, I guess that would work as long as we can come up with a solution for the prefix for the modules/extensions.
I think that we have the solution; the only point of contention is whether we register them somewhere apart from the module definition itself.
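
For illustration only, a minimal sketch of how a reference such as #/f=1/u=2/gls=1 could be resolved step by step. The namespace URIs, the sample document, and the resolve function are assumptions made up for this example, and it assumes an unqualified id attribute on the module elements. Note that the sample document binds the glossary namespace to the XML prefix "g" while the fragment uses "gls", which is why the fragment prefix has to be specified separately from the namespace prefix.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, not the real ones.
CORE_NS = "urn:example:core"
CORE_STEPS = {"f": "file", "u": "unit"}       # assumed core selectors
MODULE_NS = {"gls": "urn:example:glossary"}   # fragment prefix -> namespace

DOC = """<xliff xmlns="urn:example:core" xmlns:g="urn:example:glossary">
  <file id="1">
    <unit id="2">
      <g:glossary><g:glossEntry id="1"/></g:glossary>
    </unit>
  </file>
</xliff>"""


def resolve(root, fragment):
    """Walk the fragment one step at a time; return the element or None."""
    node = root
    for step in fragment.lstrip("#").strip("/").split("/"):
        prefix, _, ident = step.partition("=")
        if prefix in CORE_STEPS:
            tag = "{%s}%s" % (CORE_NS, CORE_STEPS[prefix])
            candidates = node.iter(tag)
        elif prefix in MODULE_NS:
            ns = "{%s}" % MODULE_NS[prefix]
            candidates = (e for e in node.iter() if e.tag.startswith(ns))
        else:
            return None  # unknown prefix: cannot resolve
        # Narrow the search to the first descendant with the requested id.
        node = next((e for e in candidates if e.get("id") == ident), None)
        if node is None:
            return None
    return node
```

Calling resolve(ET.fromstring(DOC), "#/f=1/u=2/gls=1") finds the glossEntry inside unit 2 of file 1; a prefix the tool does not know, such as foo, yields None.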

We talked before about publishing extensions through the TC, and in fact some 1.2 extensions have been published like this. I am not against registering 2.x extensions, just worried about the overhead of maintaining the registry.


> If you wanted to point to a <glossEntry> or <translation>
> in another unit, which I strongly believe should be forbidden

I assume you are talking here about the general case of an annotation pointing outside the unit where it exists.

I think there will be cases when people will come up with modules/extensions that require such pointing.
They won't if such pointing is prohibited.
They will if we allow it, and that will be the big pain in the neck that forces you to keep the whole XLIFF file in memory.
The TBX case of ITS Terminology annotations created by Tilde (here: http://taws.tilde.com/xliff) is an example of that.

I mean having a TBX extension per file and pointing to it within the same file is kind of OK. You can after all ignore it, if you do not support the TBX extension

What I am worried about is pointing to original data or modules in other units or, worse, in other units in other files.

In my original proposal I suggested forbidding this, as it is the only way to stay streaming-friendly and ensure that internal references stay within the same unit, or at least the same file.

Also it is OK to point to external resources, like terminology servers etc.

Cheers,
-yves






