Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
Ryan’s examples demonstrate some of the many theoretical uses of the Notes element. Until now, I haven’t seen any guidance/caution regarding how the elements
should be used, and Ryan’s suggestions were intended to provide better organization and structure when including multiple notes. I think the
usage of Notes is a different matter; whether non-translatable information (e.g. instructions) is included seems to be a matter for the XLIFF creator. I don’t think it would be possible for us to restrict what data is included in a Note.
As we identified previously, there is a risk that Notes could indeed become overloaded with information, given their similarity to metadata, but realistically,
it’s difficult to mitigate this.
Thanks,
Kevin.
From: xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Helena S Chapman
Sorry I missed this discussion quite a bit. Remind me again why we are putting "non-translatable" or "non-localizable" information in the XLIFF file for translators in your
examples?
From:
xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Dr. David Filip
Ryan, I support adding the core attributes as proposed, plus perhaps the priority [1-10] from Fredrik's example.
But I also reiterate the request for note to be extensible. This seemed to have Fredrik's support on this thread.
I think that note and inline markers should be extensible as part of the generic annotations design that will allow development of annotation modules, such as ITS mapping.
Cheers
dF
Dr. David Filip
=======================
LRC | CNGL | LT-Web | CSIS
University of Limerick, Ireland
telephone: +353-6120-2781
cellphone: +353-86-0222-158
facsimile: +353-6120-2734
mailto: david.filip@ul.ie
On Tue, Dec 11, 2012 at 6:24 PM, Ryan King <ryanki@microsoft.com> wrote:
Do we have consensus on this proposal? E.g. adding category, origin, and datetime (or timestamp) attributes to <note>?
Thanks,
ryan
From:
xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Ryan King
Subject: RE: [xliff] 1.2 to 2.0 Gaps and Proposals (notes)
>> On the other hand having a minimum set for interoperability for ITS unaware tools sounds good.
Agreed. And as stated on another thread…we suggest the list of additional and optional attributes to be origin, category, datetime.
<notes>
<note category="instruction" origin="developer" datetime="2012-11-30T07:43:05Z">Don’t localize Windows</note>
</notes>
Thanks,
ryan
From:
xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Dr. David Filip
Thanks for outlining the options, Fredrik. I would be personally OK with note being just extensible. The ITS categories would allow specifying pretty much everything that you would need. First as an extension, which should later turn into a module using the same mechanism.
On the other hand having a minimum set for interoperability for ITS unaware tools sounds good. And as Fredrik pointed out, an ITS note can be easily mapped onto these, so not an issue from here.
Even with the minimum set of core attributes, I still think it should be extensible, to allow for unforeseen types of annotations.
The only danger is of creating unnecessary clutter if the adoption is minimal; hard to say what the adoption will be.
Cheers
dF
On Thu, Nov 29, 2012 at 10:39 AM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:
Hi Ryan, David,
How it would look depends on whether we add one or more standard attributes to the <note> element or rely solely on third party extensions. First, an example of one of the notes
in your original sample and one showing a potential use of David’s ITS mapping case.
<notes>
<note id="n1" ms:noteOrigin="developer" ms:notePriority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
<note id="n2" its:locNoteType="alert">Make sure to adapt date format when localizing</note>
</notes>
It could be argued that there is a set of very common metadata associated with notes and that we should provide standard attributes in these cases. I’m not sure exactly which,
if any, we should have but the ones I can immediately think of are the kind of information in the above sample plus a date:
* origin / author – Indicate source of the note
* priority – indicate relative importance of a note. Must have a strict, simple definition, for example an integer where lower is more important than higher.
* type / category – indicate what type / aspect of the data or process the note applies to or annotates.
* date – creation or modification date. Which of these it is should be specified.
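As a rough illustration of the "lower integer is more important" definition suggested for priority, here is a minimal Python sketch that orders notes for display. The attribute names (priority, origin, category), the third sample note, and the choice to sort notes without a priority last are assumptions for illustration only, not anything agreed on this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical <notes> block; the attribute names are assumptions, not agreed syntax.
SAMPLE = """
<notes>
  <note id="n1" origin="developer" category="comment" priority="2">This string cannot be longer than 100 characters</note>
  <note id="n2" priority="1">Make sure to adapt date format when localizing</note>
  <note id="n3">Note without an explicit priority</note>
</notes>
"""


def notes_by_priority(xml_text):
    """Return <note> elements ordered so that lower priority values come first."""
    root = ET.fromstring(xml_text)

    def sort_key(note):
        value = note.get("priority")
        # Assumption: a note with no priority is treated as least important.
        return int(value) if value is not None else float("inf")

    return sorted(root.findall("note"), key=sort_key)


if __name__ == "__main__":
    for note in notes_by_priority(SAMPLE):
        print(note.get("id"), note.get("priority"), note.text)
```

A tool that does not understand the attribute could simply keep document order, which is one argument for a strict and simple definition.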
The good thing about using standard attributes instead of extensions for common properties is of course better interoperability for the data contained. The negative side is that
it adds complexity to the standard which is against one of the goals of the 2.0 work. One part of that is the attempt to reduce the number of seldom or never used constructs to get a leaner core model. A solution that has been discussed before is to have a
more complex comment / annotation module in addition to or extending the core feature. This way we get the same complexity in the core as we would with just third party extensions but with the added value of a fully interoperable path for those that want that
in this area.
If we hypothetically assume we add origin and priority to the core, the above example could look like the one below, assuming the same mapping for ITS is used as the one proposed
for mapping to XLIFF 1.2 (‘alert’ => 1, ‘description’ => 2+) and stored in “priority”.
<notes>
<note id="n1" author="developer" priority="1" ms:noteType="comment">This string cannot be longer than 100 characters</note>
<note id="n2" priority="1">Make sure to adapt date format when localizing</note>
</notes>
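The ITS mapping mentioned above (‘alert’ => 1, ‘description’ => 2+) could be sketched in Python roughly as follows. The function names are hypothetical, and storing ‘description’ as exactly 2 is an assumption, since the thread only says "2+".

```python
# Mapping taken from the thread: 'alert' => 1, 'description' => 2+.
# Assumption: 'description' is written out as exactly 2, the lowest allowed value.
ITS_TO_PRIORITY = {"alert": 1, "description": 2}


def priority_from_its(loc_note_type):
    """Map an ITS locNoteType value to a hypothetical core priority integer."""
    try:
        return ITS_TO_PRIORITY[loc_note_type]
    except KeyError:
        raise ValueError(f"unknown locNoteType: {loc_note_type!r}")


def its_from_priority(priority):
    """Reverse mapping: priority 1 is an alert, anything from 2 up is a description."""
    if priority < 1:
        raise ValueError("priority must be 1 or greater")
    return "alert" if priority == 1 else "description"
```

Note that the round trip loses information: any priority of 2 or higher comes back from ITS as a plain description.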
Regarding the naming of potential core / module attributes I would prefer to use “category” instead of “type” as the former does not convey the level of functional meaning that
the latter does for me. It is more ‘just metadata’.
Regards,
Fredrik Estreen
From: Ryan King [mailto:ryanki@microsoft.com]
David or Fredrik, can you give us an XLIFF example of how that would look?
From:
xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Dr. David Filip
Fredrik, all, same as Fredrik, I think that extensibility makes sense here. I agree that the grouping mechanism in the style of mda is not appropriate here and would change the semantics in an undesired way.
Annotations are perfect extension points in general, and besides we need the extensibility here for the ITS mapping.
Cheers
dF
On Wed, Nov 28, 2012 at 10:10 AM, Estreen, Fredrik <Fredrik.Estreen@lionbridge.com> wrote:
Hi Rodolfo, Ryan,
I think the intent of the <notes> is lost with the current proposal. The feature is designed so that <notes> is a container for a group of <note>s at a specific level in the
document. Where each <note> is one annotation / comment in itself. The suggested change transforms that so that the <notes> element becomes the entity describing one note, with <note> describing specific pieces of metadata related to that note. The ID is intended
to be used to refer to the note from other places such as from <mrk> elements in the inline content, so overloading it to be the type of data would cause additional problems.
I think the initial model is much easier to work with and cleaner, as it contains all note-related information in one subtree per document level where notes are allowed. Adding
attributes to the <note> element is in my opinion the best way to go. If we should have more standard attributes or if a processor is free to use the third party namespace extension mechanism to add them is another question. Depending on how simple we want
to keep the basic notes feature it could be either or a mix of the two methods.
Although I’m not a fan of the third party extensions I think this is a case where they could make sense. And if used for process specific metadata only I don’t see an issue.
Of course there will be no standard way to display them in a UI or report if they are not specified in the standard.
Regards,
Fredrik Estreen
From:
xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Rodolfo M. Raya
Still a bad use case that doesn’t justify ruining a good design.
Regards,
Rodolfo
--
From: Ryan King [mailto:ryanki@microsoft.com]
So that our original reason for proposing having more than one <notes> at the extension point does not get obfuscated in all of the replies and “see inlines”, here once again,
is the use case for adding more than one <notes> per extension:
Proposal 4: Add an optional name attribute on <notes> in core and <mds:metadata> module.
We believe it will be typical for content providers to want to group their notes or metadata in meaningful ways. This might be done so that a certain number of notes or bits of metadata can be processed in the same way, or simply grouped and displayed together,
such as in an editor UI. Here are some examples:
<notes name="comments">
<note id="priority">1</note>
</notes>
<notes name="instructions">
<note id="priority">2</note>
</notes>
As opposed to something less structured and more difficult to process:
<notes>
<note id="comment">This string cannot be longer than 100 characters</note>
Thanks,
Ryan
From: Rodolfo M. Raya [mailto:rmraya@maxprograms.com]
Please don't ruin the design for <notes>. Only one should be allowed per insertion point.
Regards, Rodolfo
· Proposal 2: Be able to specify optional custom values for match type in <mtc:matches>
· Proposal 4: Add an optional name attribute on <notes> in core (which also means that we need to allow zero, one or more <notes> in each position in the tree structure)
Additionally, it was deemed that we should add Reference Language to the <mtc:matches> module. How do you want to move forward with that? Since the module is already defined in the 2.0 spec,
can I just suggest the method and if you agree, you can fold it into the current module definition? I would propose:
1. That we allow zero, one or more <mtc:matches> at each extension point, because you might have both recycling and reference language data.
2. Add an optional attribute reference="yes|no" with no as default. Additionally, PR for a “reference match” would be to allow an xml:lang on the target different from the document and allow the <source> not to be present as it would be redundant information with the core <source>, e.g. Spanish reference for Quechua might look like this:
<mtc:matches>
<mtc:match reference="yes">
<segment>
<target xml:lang="es-es">hola mundo</target>
</segment>
</mtc:match>
</mtc:matches>
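To make the proposed reference attribute more concrete, here is a minimal Python sketch of how a tool might separate reference-language matches from ordinary recycling matches. The namespace URI is a placeholder so that the sample parses, the second match and the function names are invented for illustration, and none of this is defined module behavior.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, used only so the sample parses; not the real module URI.
MTC_NS = "urn:example:mtc"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# First match follows the example in this message; the second is a made-up recycling match.
SAMPLE = f"""
<mtc:matches xmlns:mtc="{MTC_NS}">
  <mtc:match reference="yes">
    <segment>
      <target xml:lang="es-es">hola mundo</target>
    </segment>
  </mtc:match>
  <mtc:match>
    <segment>
      <source>hello world</source>
      <target>recycled translation</target>
    </segment>
  </mtc:match>
</mtc:matches>
"""


def split_matches(xml_text):
    """Separate reference-language matches from ordinary recycling matches."""
    root = ET.fromstring(xml_text)
    reference, recycling = [], []
    for match in root.findall(f"{{{MTC_NS}}}match"):
        # Proposed default: reference="no" when the attribute is absent.
        if match.get("reference", "no") == "yes":
            reference.append(match)
        else:
            recycling.append(match)
    return reference, recycling


def reference_targets(xml_text):
    """Return (language, text) pairs for the targets of reference matches."""
    reference, _ = split_matches(xml_text)
    return [(t.get(XML_LANG), t.text) for m in reference for t in m.iter("target")]
```

With the default of "no", existing matches without the attribute would keep their current meaning.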
I’m not sure if any of these require an electronic ballot. I got the impression from the call that they don’t, but hopefully Bryan or David or someone else from the call will correct that
if false. Please let me know how I can work with you on these.
Ryan
From:
xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Ryan King
Thanks Yves and David for the valuable feedback. See our comments inline below prefixed with [Microsoft]. As David suggested on another thread, we will add these soon to the
wiki.
From:
xliff@lists.oasis-open.org [mailto:xliff@lists.oasis-open.org]
On Behalf Of Dr. David Filip
Commenting inline..
Cheers
dF
On Thu, Nov 15, 2012 at 8:23 PM, Yves Savourel <ysavourel@enlaso.com> wrote:
Hi Ryan, all,
> .. I don't see anything wrong with this.
I understand the need for the information, but to me, it seems the similarity gives you whether a match is exact or not.
I believe there is value in decoupling the "percentage" from the "business" type of the match. The number means nothing unless we opt to prescribe a specific variety of (modified) Levenshtein, and I guess we should not open this particular can of worms.
So I wouldn't see a problem with a sub-type there.
[Microsoft] we definitely advocate decoupling the “percentage” from the “business” type of match as David puts it. And we should not prescribe meaning to the percentage, either.
Costing models built on top of these values will necessarily change from one provider/supplier to the next and as Yves states, possibly from one project to the next. We could very easily have the following (and we do in much of our recycled content):
<match id="1" similarity="100.0" type="tm/xlf:exact">
In the first case, we’ve recycled a candidate which is 100% match, but came from a segment whose state isn’t signed off or final yet, whereas the ice match, in our case, has
the requirement of being 100% and signed off or final.
I see the use case and I've seen other cases like this, with Chinese (Simplified/Traditional). I agree with Yves's reasons to have this within the match module, which is anyway the alt-trans successor. I guess it does not fulfill the core criteria.
[Microsoft] Adding this to the match module would be fine as long as the proper explanatory text and processing instructions make it clear what this data should be used for as
opposed to recycling.
> ...
> <notes name="comments">
Sounds reasonable. We'll have to allow several <notes> and <m:metadata> (I think (but I may be wrong) only one is allowed) on the extension point.
I agree with Yves that a couple of standard attributes should be added to increase interoperability, still I believe that note should be fully extendable, as it is part of the general annotation mechanism and should be able to carry attributes from other
namespaces.
[Microsoft] Capturing an author and timestamp on a comment is specific to our needs and thus that example. However, we do see value in being able to apply an author and timestamp
on potentially any piece of data. So a module (as Yves suggests below) that can exist at the same extension points as metadata (and including metadata) might lend itself better to that.
> ...
> <segment id="1" modifiedBy="translator@loc.com"
Here again I'm wondering if a "change track" module may be better?
Optional attributes in core are tricky, IMHO. It means you do not need to introduce it yourself, if you do not feel so.. But if present it would need to be processed by agents who modify the segment. If it is thinkable that change agents do not update it,
it feels more like a module...
[Microsoft] Since we are heading down the same path to MUST preserve modules as well, if we introduce a “change track” module, then user agents would need to preserve it if present, but as for any other processing requirements,
such as updating it, that could be specified as part of the module’s processing requirements. For example: The module MUST be preserved and SHOULD be updated by user agents.