Subject: RE: [xliff] IDs - uniqueness for inline only within segments (A)
Further to my previous message, I’ve answered one of my own questions: the ID values in <source> and <target> (to which the first of David’s business reasons applies) should of course be taken to include the inline elements within them. That doesn’t really change my question, though: why not globally unique IDs?
· All IDs for <group>, <unit>, <file>, <segment>, <ignorable>, and inline elements within <source> can be of type xs:ID and globally unique
· IDs for inline elements within <target> can be of type xs:IDREF corresponding to an ID within <source> (and I would waive my philosophical objection to naming the element ID rather than IDREF)
These constraints are easily expressed, and enforceable through the schema. How to manage the uniqueness would be up to the writer agent(s).
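To make the two rules above concrete, here is a minimal Python sketch of the check a validator would perform. It is only an illustration under stated assumptions: the sample markup is simplified and un-namespaced, the function name check_ids is hypothetical, and, like xs:IDREF itself, it only verifies that an id used inside <target> is declared somewhere in the document, not specifically in the sibling <source>.

```python
import xml.etree.ElementTree as ET

def check_ids(xml_text):
    """Return a list of problems under the two proposed rules (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    problems = []
    seen = set()        # ids declared anywhere outside <target> (the xs:ID role)
    target_refs = []    # ids found on inline elements inside <target> (the xs:IDREF role)

    def walk(elem, in_target):
        # Once we enter <target>, every descendant id is treated as a reference.
        in_target = in_target or elem.tag == "target"
        value = elem.get("id")
        if value is not None:
            if in_target:
                target_refs.append(value)
            elif value in seen:
                problems.append("duplicate id: " + value)
            else:
                seen.add(value)
        for child in elem:
            walk(child, in_target)

    walk(root, False)
    # Every reference must resolve to a declared id.
    for ref in target_refs:
        if ref not in seen:
            problems.append("dangling target id: " + ref)
    return problems

# Simplified sample, assumed for illustration only (no namespace, no <xliff> wrapper).
SAMPLE = """<file id="f1">
  <unit id="u1">
    <segment id="s1">
      <source>Click <pc id="c1">here</pc></source>
      <target>Cliquez <pc id="c1">ici</pc></target>
    </segment>
  </unit>
</file>"""

print(check_ids(SAMPLE))
```

Note that the sketch has to read the whole document before it can report dangling references, which is the same streaming concern David raises in his second point.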
David, Yves, all,
I’ve finally caught up on this interesting discussion. I’ve been thinking about this statement:
> There is a long standing consensus in this TC that XLIFF ids cannot
> be of the xsd type XML id for several good business reasons
> 1) duplicity of id between source and target is being used as
> indicating the sameness of the element in source and target
> 2) XLIFF files can be huge and strict enforcement of uniqueness would
> prevent streamed handling of XLIFFs
Perhaps you’ll excuse my naïveté, as I did not participate in the discussions that led to this consensus. While working with the schemas I had a tacit understanding of the issues, without having considered all of the implications. I also had a bit of philosophical unease with defining attributes named ID that aren’t of type xs:ID. With David’s parsing of the issues surrounding IDs, I see a different approach that may be viable.
I’ll start out by saying that I don’t understand the first point, given that the ID attribute is not present for either <source> or <target>. These elements can occur only within <segment> and <ignorable> (each of which allows but does not require an ID). The correlation between <source> and <target> is enforced structurally. But that actually isn’t relevant to my remaining points.
Let’s consider the perspective of the extracting agent. The extractor is responsible for compliance with five different uniqueness constraints (188.8.131.52). The values for the ID attributes will be drawn from the source content, if possible, or generated by the agent as necessary. Five different sets of values might need to be generated, and the extracting agent might need to use a naming convention, and/or append a sequential number, to prevent any one set of ID values from conflicting with the other four.
Note that all of the constraints are satisfied if the ID attribute values are unique for all uses of the ID attribute. The extracting agent could be designed to generate and apply a single set of values, avoiding the possibility of conflict, regardless of the scopes defined in our constraints. Are globally unique ID values a greater demand on the designers of an extracting agent than the five separate uniqueness constraints?
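As a rough illustration of the single-set idea, an extracting agent could draw every ID from one shared counter. This is a hypothetical sketch, not a description of any existing tool: the class name IdGenerator and the "id" prefix are my own assumptions.

```python
import itertools

class IdGenerator:
    """Hands out IDs from one shared counter (hypothetical helper).

    Because every ID is the same prefix plus a distinct number, no two
    IDs can collide, whatever scope they are later used in.
    """

    def __init__(self, prefix="id"):
        self._prefix = prefix
        self._counter = itertools.count(1)

    def next_id(self):
        return self._prefix + str(next(self._counter))

gen = IdGenerator()
# One generator serves every kind of element that carries an ID.
ids = {
    "file": gen.next_id(),
    "unit": gen.next_id(),
    "segment": gen.next_id(),
    "inline": gen.next_id(),
}
print(ids)
```

Since the values are distinct across the whole document, they are also distinct within any narrower scope, so each scoped constraint is met without separate bookkeeping per scope.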
A second consideration is how an XLIFF element is referenced. Currently we would use an ID attribute, of type xs:NMTOKEN. If we have ID attributes of type xs:ID, it would make sense to reference elements by IDs using an attribute of type xs:IDREF (and named IDREF, I would suggest). Global uniqueness is not required for IDREF attributes. We can add the constraint that the IDREF must point to an element with the corresponding ID value (a condition that is not part of the XML schema language).
Could this approach address some of the issues that David has outlined? I’m interested in any and all feedback, fully aware I may be missing something here.
Thanks for this, Yves
I think this one is clear, so I am hereby making a call for dissent and will assume that making inline ids unique within unit consistently throughout Constraints and PRs is approved unless I hear otherwise by the end of this week, Friday COB, PDT.
Dr. David Filip
LRC | CNGL | LT-Web | CSIS
University of Limerick, Ireland
On Tue, Oct 22, 2013 at 6:40 PM, Yves Savourel <email@example.com> wrote:
Hi David, all,