Subject: Re: [xliff-comment] HTML extraction examples
Thank you very much for the pointers.

I had already found and used some of them, but since it takes a lot of hunting around, my suggestion was to add them more prominently somewhere in the specs.

It’s true that the general principles stay the same in v1.2 and v2.0, but v2 adds a lot more possibilities, like originalData, references, and the FS module. Adding more possibilities is one thing; explaining how best to use them is another :)

Simone
--
Simone Chiaretta
Microsoft MVP ASP.NET - ASPInsider
twitter: @simonech
On 9 May 2017 at 13:39:08, Yves (firstname.lastname@example.org) wrote:
+1 on that. It’s true that there are probably not enough examples in the specification.
Some of them however are using HTML, especially in the section regarding inline codes.
For instance the examples for the sub-flows: http://docs.oasis-open.org/
If it can help, a few other examples can be found in the Okapi Framework implementation.
There are two samples in HTML, with the originals and the XLIFF2 outputs:
The few pointers I can think of, from experience:
- Do not just extract the HTML content into a CDATA section.
- Only the inline codes should go into XLIFF units (as XLIFF inline codes); structural markup stays outside. That is: <b>, not <p>.
- If possible, use sub-flows for text embedded in HTML tags (e.g. alt or title text).
- If possible, don’t use <ph/> for paired codes; use <pc>…</pc>.
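
A minimal sketch of the first, second and fourth pointers, in Python with only the standard library. The class name HtmlToXliff, the function extract, the two tag lists and the id scheme are all assumptions made for illustration; none of this is taken from Okapi or from the specification. Each block element becomes one unit, each paired inline tag becomes a <pc> whose native markup is stored in <originalData>, and sub-flows, void tags such as <br> and text outside block elements are not handled.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumed tag classification for this sketch; a real filter needs a fuller list.
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "td"}
INLINE_TAGS = {"b", "i", "u", "em", "strong", "span", "a"}


class HtmlToXliff(HTMLParser):
    """Hypothetical extractor: one <unit> per block, one <pc> per inline pair."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.units = []    # finished <unit> elements
        self.unit = None   # unit currently being filled, if any
        self.data = None   # <originalData> of the current unit
        self.stack = []    # open elements: <source> first, then nested <pc>

    def _open_unit(self):
        self.unit = ET.Element("unit", id=f"u{len(self.units) + 1}")
        self.data = ET.SubElement(self.unit, "originalData")
        segment = ET.SubElement(self.unit, "segment")
        self.stack = [ET.SubElement(segment, "source")]

    def _close_unit(self):
        if self.unit is None:
            return
        if len(self.data) == 0:
            # <originalData> must not be empty, so drop it when unused.
            self.unit.remove(self.data)
        self.units.append(self.unit)
        self.unit, self.data, self.stack = None, None, []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._close_unit()  # simplification: a nested block starts a new unit
            self._open_unit()
        elif tag in INLINE_TAGS and self.unit is not None:
            pair = len(self.data) // 2 + 1
            start_id, end_id = f"d{2 * pair - 1}", f"d{2 * pair}"
            # Keep the native markup in <originalData>, out of the text.
            ET.SubElement(self.data, "data", id=start_id).text = self.get_starttag_text()
            ET.SubElement(self.data, "data", id=end_id).text = f"</{tag}>"
            pc = ET.SubElement(self.stack[-1], "pc", id=str(pair),
                               dataRefStart=start_id, dataRefEnd=end_id)
            self.stack.append(pc)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._close_unit()
        elif tag in INLINE_TAGS and len(self.stack) > 1:
            self.stack.pop()  # simplification: assumes well-nested inline tags

    def handle_data(self, text):
        if self.unit is None:
            return  # text outside block elements is ignored in this sketch
        parent = self.stack[-1]
        if len(parent):
            # Text after a child element belongs to that child's tail.
            parent[-1].tail = (parent[-1].tail or "") + text
        else:
            parent.text = (parent.text or "") + text


def extract(html, src_lang="en"):
    """Return an <xliff> element tree for an HTML fragment (illustrative only)."""
    parser = HtmlToXliff()
    parser.feed(html)
    parser.close()
    parser._close_unit()  # flush a block that was never closed
    root = ET.Element("xliff", {"xmlns": "urn:oasis:names:tc:xliff:document:2.0",
                                "version": "2.0", "srcLang": src_lang})
    ET.SubElement(root, "file", id="f1").extend(parser.units)
    return root


if __name__ == "__main__":
    sample = "<p>Click <b>here</b> to continue.</p><p>Second paragraph.</p>"
    print(ET.tostring(extract(sample), encoding="unicode"))
```

The <p> tags never reach the units: they only mark unit boundaries and would normally live in a skeleton, which this sketch does not produce. Text in attributes such as alt or title would need a further unit referenced through subFlows, which is left out here.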
Also, there is the draft version of the old “XLIFF 1.2 Representation Guide for HTML” that is available. It was done for XLIFF 1.2, but most principles are the same for 2.0. You can find it here: http://docs.oasis-open.org/xliff/v1.2/xliff-profile-html/xliff-profile-html-1.2-cd02.html
I hope that helps.
I’m implementing an extractor for a CMS, and from reading the specifications it’s not super-clear what the right way is to extract a piece of HTML to XLIFF.
I understand that extraction is a very personal and application-specific matter, so probably not something to be standardised in the specs, but it would be helpful to add somewhere, either as notes or even in the test suite, examples of how HTML fragments are to be converted into XLIFF.