Hi all,
I have started an ITS module implementation that relies on my generic ITS processor. See the processed files here.
external-rules.xml contains the rules, currently only for Text Analytics. inputfile.xml is an XLIFF 2.1 input file, currently with ITS Text Analytics information. The output is provided as a list of XPath expressions in nodelist-with-its-information.xml and as inline annotations in output-inline-annotation.xml.
The output shows one issue that we had discussed before. See the excerpt below, taken from output-inline-annotation.xml:
<source>
<itsAnn xmlns=""/>
<sm id="sm1"
type="itsm:generic"
itsm:taClassRef="http://nerd.eurecom.fr/ontology#Place"
itsm:taIdentRef="http://dbpedia.org/resource/Arizona">
<itsAnn xmlns="">
<elem>
<taClassRefPointer xmlns:xlf2="urn:oasis:names:tc:xliff:document:2.0"
xmlns:its="http://www.w3.org/2005/11/its"
xmlns:datc="http://example.com/datacats"
itsm:taClassRef="http://nerd.eurecom.fr/ontology#Place"/>
<taIdentRefPointer xmlns:xlf2="urn:oasis:names:tc:xliff:document:2.0"
xmlns:its="http://www.w3.org/2005/11/its"
xmlns:datc="http://example.com/datacats"
itsm:taIdentRef="http://dbpedia.org/resource/Arizona"/>
</elem>
</itsAnn>
</sm>Arizona<em startRef="sm1">
<itsAnn xmlns=""/>
</em>
</source>
With the ITS rules file, "sm" is annotated as having the Text Analytics information. But it is actually the content between sm and em that should be annotated. I don't know how to resolve this. Maybe we should add to the ITS module a constraint that extends general ITS processors: if the selected element is an XLIFF sm, apply the ITS information to the content up to the next em that corresponds to that sm via its startRef attribute. This would be a small burden on ITS processors, but it would greatly simplify the creation of the ITS/XLIFF rules file.
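To make the proposed constraint concrete, here is a rough Python sketch of what the extra step in a processor might look like. It is only an illustration, not how my processor works today: the function name annotated_spans, the sample input, and the namespace URI bound to the itsm prefix are assumptions, and the sketch only handles the simple case in which sm and its em are siblings inside the same source element.

```python
import xml.etree.ElementTree as ET

XLF = "urn:oasis:names:tc:xliff:document:2.0"
ITSM = "urn:oasis:names:tc:xliff:itsm:2.1"  # assumed namespace URI for the itsm prefix

# Hypothetical input, modelled on the excerpt above but without the itsAnn output.
SAMPLE = f"""<source xmlns="{XLF}" xmlns:itsm="{ITSM}">Visit <sm id="sm1" type="itsm:generic"
  itsm:taClassRef="http://nerd.eurecom.fr/ontology#Place"
  itsm:taIdentRef="http://dbpedia.org/resource/Arizona"/>Arizona<em startRef="sm1"/> today.</source>"""


def annotated_spans(source):
    """Map each sm id to (ITS attributes found on the sm, text between sm and its em)."""
    spans = {}
    children = list(source)
    for i, node in enumerate(children):
        if node.tag != f"{{{XLF}}}sm":
            continue
        # ITS information that a generic processor would attach to the selected sm.
        its_info = {
            name.split("}")[1]: value
            for name, value in node.attrib.items()
            if name.startswith(f"{{{ITSM}}}")
        }
        # Text directly after sm is stored as the tail of the sm element.
        parts = [node.tail or ""]
        for sibling in children[i + 1:]:
            is_em = sibling.tag == f"{{{XLF}}}em"
            if is_em and sibling.get("startRef") == node.get("id"):
                # Matching em found: the ITS information applies to the collected text.
                spans[node.get("id")] = (its_info, "".join(parts))
                break
            # Any other inline element in between contributes its content and tail.
            parts.append("".join(sibling.itertext()))
            parts.append(sibling.tail or "")
    return spans


if __name__ == "__main__":
    for sm_id, (info, text) in annotated_spans(ET.fromstring(SAMPLE)).items():
        print(sm_id, repr(text), info)
```

With this step in place, the rules file could keep selecting sm directly, and the processor would be responsible for moving the information onto the marked span. An sm whose em sits at a different nesting level would need a document-order traversal instead of the sibling loop used here.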
Thoughts?
Best,
Felix