Subject: Re: subFs value and spaces (item 142) - not an official ballot, but please vote (straw poll?)
Regarding: item 142: https://lists.oasis-open.org/archives/xliff-comment/201310/msg00031.html - I guess it comes down to (1) keeping the subFs and adding a delimiter between attribute name/value pairs, or (2) eliminating @subFs and adding escaped XML to @fs.
The definition of subFs says:
The subFs MUST only be used to carry attribute name/value comma-delimited pairs for attributes that are valid for the HTML element
identified by the accompanied fs attribute.
Example: fs:fs="img" fs:subFs="src,smileface.png"
It is unclear to me if you can have more than one pair of name/value per subFs. I assume you can because a) the definition uses
plural here with "the subFs" (so: one subFs with many pairs); and b) it wouldn't make sense to restrict attributes to a single one.
But it should be a lot clearer.
Also the example shows that the delimiter comma is used to separate the two parts of a pair, but what is the delimiter between pairs?
If I assume it is space, then there is no way to define a value containing a space since only \ and , are escaped.
Overall I think it would be a lot simpler to have only one fs attribute that holds the full element to use. Is there a reason why
You are asking us to eliminate @subFs and just put the whole element, including the attribute name/value pair(s), in the @fs.
I think that when we debated this, back when I wrote the module, that same idea was proposed and ultimately voted down in favor of the @subFs method.
I do not have all the details of that debate fresh in my mind, nor have I researched the prior debate much. But no doubt my broken-record-objections to escaping XML were part of it.
So in crafting my counter-proposal to dropping the @subFs, I recalled the idea of delimiting each name/value pair with a backslash (\). This is because the spec already says to escape “,” and “\” with a backslash, and we say to use a comma to separate attribute name from value.
Let’s call this proposal (1).
<ph id="p1" fs="img" subFs="src,smile.png\alt,My Happy Smile\title,Smiling faces are nice" />
Resolves to this:
<img src="" alt="My Happy Smile" title="Smiling faces are nice" />
Pros: as long as there are no escaped commas or backslashes in the value, this is quite easy to parse (I tried with XSLT, Perl, and Java).
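To illustrate how little is needed in the simple case, here is a minimal Python sketch (Python is my choice for illustration; it is not one of the languages tried above). The function name parse_subfs_naive is hypothetical, and the sketch assumes that no name or value contains an escaped comma or backslash.

```python
def parse_subfs_naive(sub_fs):
    """Split a proposal-(1) subFs string into attribute name/value pairs.

    Assumption: no escaped commas or backslashes occur, so every backslash
    is a pair delimiter and the first comma in a pair separates name from value.
    """
    attrs = {}
    for pair in sub_fs.split("\\"):
        name, value = pair.split(",", 1)
        attrs[name] = value
    return attrs


sub_fs = r"src,smile.png\alt,My Happy Smile\title,Smiling faces are nice"
print(parse_subfs_naive(sub_fs))
# {'src': 'smile.png', 'alt': 'My Happy Smile', 'title': 'Smiling faces are nice'}
```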
But if you have something like this:
<ph id="p1" fs="img" subFs="src,c:\\docs\\images\\smile.png\alt,My Happy Smile\title,Smiling faces\, are nice" />
And want this:
<img src="" alt="My Happy Smile" title="Smiling faces, are nice" />
You add complexity.
Cons: when you encounter an escaped comma or backslash it gets very complex (but doable) to parse the string. I was eventually able to do it with XSLT, but it took two call-templates, and a lot of string parsing with XPath expressions.
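For comparison, the escape-aware case can be sketched in Python as a single left-to-right scan. This is only an illustration under my reading of proposal (1), not the XSLT mentioned above: a backslash followed by a backslash or comma is a literal character, and any other backslash ends the current pair. The function name parse_subfs is hypothetical, and treating a stray unescaped comma inside a value as literal text is my own lenient assumption.

```python
def parse_subfs(sub_fs):
    """Parse a proposal-(1) subFs string, honoring the \\\\ and \\, escapes."""
    pairs = [[""]]  # each entry is [name] or [name, value]
    i = 0
    while i < len(sub_fs):
        ch = sub_fs[i]
        nxt = sub_fs[i + 1] if i + 1 < len(sub_fs) else ""
        current = pairs[-1]
        if ch == "\\" and nxt in ("\\", ","):
            current[-1] += nxt          # escaped backslash or comma: keep literally
            i += 2
        elif ch == "\\":
            pairs.append([""])          # unescaped backslash: start the next pair
            i += 1
        elif ch == "," and len(current) == 1:
            current.append("")          # first unescaped comma: name ends, value starts
            i += 1
        else:
            current[-1] += ch           # ordinary character (or lenient stray comma)
            i += 1
    attrs = {}
    for fields in pairs:
        if len(fields) != 2:
            raise ValueError(f"pair without a name/value comma: {fields[0]!r}")
        attrs[fields[0]] = fields[1]
    return attrs


sub_fs = r"src,c:\\docs\\images\\smile.png\alt,My Happy Smile\title,Smiling faces\, are nice"
attrs = parse_subfs(sub_fs)
print(attrs["src"])    # c:\docs\images\smile.png
print(attrs["title"])  # Smiling faces, are nice
```

A character scan like this is straightforward in a general-purpose language; the difficulty described above comes from doing the same thing with XPath string functions.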
So let’s revisit your idea to eliminate @subFs and (sorry if this is an unfriendly term) overload @fs, call it proposal (2).
We could do this (putting aside my disdain for escaped XML for now):
<ph id="p1" fs="&lt;img src='' alt='My Happy Smile' title='Smiling faces, are nice' />" />
Pros: one less attribute to parse.
Cons: I think all the complexity is still there. It is just not as easy to spot. Plus we are escaping XML. This is very unfriendly for XML processing (XSLT).
There are ambiguities. In the example above I substituted single quotes for the double quotes. But what if the attribute had an apostrophe, like title="Smiling faces, can't be denied"? I suppose we could propose escaping single quotes with &apos; or U+0027, or something, but can we anticipate every escape that needs to be specified? I could probably come up with other tricky use cases that are every bit as stumping.
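As one data point for the discussion, here is a small Python sketch of what proposal (2) looks like if the escaping is left entirely to ordinary XML attribute escaping, using the standard library's xml.sax.saxutils.quoteattr and xml.etree.ElementTree. The src value smile.png and the variable names are assumptions for illustration; nothing here is taken from the spec.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# The element we want a consumer to end up with, apostrophe included.
title = "Smiling faces, can't be denied"
inner = f'<img src="smile.png" alt="My Happy Smile" title="{title}" />'

# quoteattr() escapes &, <, > and chooses/escapes the quote characters,
# returning the value already wrapped in quotes.
ph_xml = f'<ph id="p1" fs={quoteattr(inner)} />'
print(ph_xml)

# A consumer parses <ph>, reads @fs as a string, then parses that string again.
ph = ET.fromstring(ph_xml)
recovered = ph.get("fs")
img = ET.fromstring(recovered)
print(img.get("title"))
```

The round trip works without inventing new escapes, but the consumer has to parse twice, and to an XSLT processor the @fs value is still just an opaque string.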
This is not an easy one to solve. But I vote for (1).
So to Yves, and please, to others in the TC (Yves and I usually find ourselves in a binary debate on the topic of escaping XML; I think we would both welcome fresh points of view), do you vote for (1) or (2), or some even better alternative not yet considered?
I will support the winner of the straw poll, and not gripe if my preference does not prevail.