Hi there,
Since the discussion of intra-source/intra-target
extension points revolves around terminology- and linguistics-related examples,
and since we are touching on this area in other contexts as well, here are a
couple of thoughts about the representation of terminology- and
linguistics-related information ...
I would favor an approach which some might term
'normalized' (in the sense of database theory), others 'single
source-oriented/non-redundant'. It boils down to the
following:
If you have a chunk of information (e.g. all
information related to a terminology concept), then don't repeat it; rather,
link to it. Accordingly, Yves' example would become:
<source>Our guests can appease their spirit of
adventure and itchy
feet by exploring the various islands of our
small <xyz:term
cID="1234ABE34FE">archipelago</xyz:term>.</source>
'cID' abbreviates 'concept
identifier'.
The identifier refers to an entry in a terminological
database, which in turn is identified by the URN to which the namespace
prefix is bound.
An XML serialization of the concept identified by
'cID' would look like the following:
<xyz:termEntry
cID="1234ABE34FE">
<xyz:term>archipelago</xyz:term>
<xyz:pos>noun</xyz:pos>
<xyz:info>
<xyz:def>Group of
islands</xyz:def>
<xyz:pronunciation>"är-k&amp;-'pe-l&amp;-"gO,
"är-ch&amp;-</xyz:pronunciation>
</xyz:info>
</xyz:termEntry>
Using the Open Lexicon Interchange Format (OLIF; see
http://www.olif.net) as the representation for the XML serialization, you
could end up with XLIFF which looks like the following:
<xliff
version="1.1"
xmlns="urn:oasis:names:tc:xliff:document:1.1"
xmlns:xyz="urn:appInfo:Items"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:olf="http://www.olif.net/base/sampleDatabase"
>
<file
original="island.txt"
source-language="en"
target-language="fr"
datatype="plaintext">
<header>
<glossary>
<internal-file>
<olif
OlifVersion="2.0,
February 2002">
<header
CreaTool="CoolTerm"
CreaToolVersion="1.4.3"
OrigFormat="internal"
AdminLang="EN"
CreaDate="20031119091301Z"
CreaId="X">
<dataCatReg>
<subjFieldDCS
DCSType="replacement">CompLingCompany</subjFieldDCS>
</dataCatReg>
<contentInfo>
<quotMarkInfo
QuotMarkRet="some"/>
<langIdUse>region_exception</langIdUse>
</contentInfo>
</header>
<body>
<entry
ConceptUserId="2312">
<mono
MonoUserId="1232">
<keyDC>
<canForm>archipelago</canForm>
<language>en</language>
<ptOfSpeech>noun</ptOfSpeech>
<subjField>general-naturalScience-geography</subjField>
</keyDC>
<monoDC>
<monoSem>
<definition>Group
of islands</definition>
</monoSem>
</monoDC>
<generalDC>
<note>pron:"är-k&amp;-'pe-l&amp;-"gO,
"är-ch&amp;-</note>
</generalDC>
</mono>
</entry>
</body>
</olif>
</internal-file>
</glossary>
</header>
<body>
<trans-unit
id="x">
<source>Our
guests can appease their spirit of adventure and itchy
feet
by exploring the various islands of our small <olf:term
cID="2312">archipelago</olf:term>.
</source>
<target>Nos
visiteurs ...</target>
</trans-unit>
</body>
</file>
</xliff>
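To illustrate how a consuming tool might follow such a link, here is a minimal Python sketch. It assumes a cut-down version of the sample above, that the 'cID' attribute on olf:term matches the ConceptUserId attribute of an OLIF entry, and that the OLIF elements inherit the default XLIFF namespace (as they do in the sample, which declares no separate namespace for them). The function name resolve_terms is hypothetical and not part of XLIFF or OLIF.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"
OLF_NS = "http://www.olif.net/base/sampleDatabase"

# Cut-down version of the sample document (assumption: only the parts
# needed to follow the link are kept).
DOC = f"""<xliff version="1.1" xmlns="{XLIFF_NS}" xmlns:olf="{OLF_NS}">
 <file original="island.txt" source-language="en" target-language="fr" datatype="plaintext">
  <header><glossary><internal-file><olif><body>
   <entry ConceptUserId="2312"><mono MonoUserId="1232">
    <keyDC><canForm>archipelago</canForm><ptOfSpeech>noun</ptOfSpeech></keyDC>
    <monoDC><monoSem><definition>Group of islands</definition></monoSem></monoDC>
   </mono></entry>
  </body></olif></internal-file></glossary></header>
  <body><trans-unit id="x">
   <source>... of our small <olf:term cID="2312">archipelago</olf:term>.</source>
  </trans-unit></body>
 </file>
</xliff>"""


def resolve_terms(xml_text):
    """Return (term text, cID, definition) for every olf:term link found."""
    root = ET.fromstring(xml_text)
    x = f"{{{XLIFF_NS}}}"  # Clark-notation prefix for the default namespace
    # Index the glossary entries by concept identifier, once.
    entries = {e.get("ConceptUserId"): e for e in root.iter(f"{x}entry")}
    resolved = []
    for term in root.iter(f"{{{OLF_NS}}}term"):
        cid = term.get("cID")
        entry = entries.get(cid)
        # An unresolved link yields None instead of raising an error.
        definition = (
            entry.findtext(f".//{x}definition") if entry is not None else None
        )
        resolved.append((term.text, cid, definition))
    return resolved


if __name__ == "__main__":
    for text, cid, definition in resolve_terms(DOC):
        print(text, cid, definition)
```

The glossary is indexed once, so each additional occurrence of a term costs only a dictionary lookup.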
The benefits of this ‘link-only’ approach would
clearly start to show if, for example, the term ‘archipelago’ occurred
1000 times in your document: the concept information would still be stored
only once.
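A rough way to see the effect is to compare the markup size of both approaches for n occurrences of the same term. The Python sketch below is only a back-of-the-envelope illustration: the entry content is a shortened version of the 'xyz' example, and the function names inline_size and linked_size are made up for this purpose.

```python
# Shortened entry content from the 'xyz' example (assumption: the
# pronunciation is left out to keep the sketch short).
ENTRY_BODY = (
    "<xyz:term>archipelago</xyz:term><xyz:pos>noun</xyz:pos>"
    "<xyz:info><xyz:def>Group of islands</xyz:def></xyz:info>"
)
CID = "1234ABE34FE"


def inline_size(n):
    """Characters of markup if the full entry is repeated at each occurrence."""
    one = f'<xyz:termEntry link="{CID}">{ENTRY_BODY}</xyz:termEntry>'
    return n * len(one)


def linked_size(n):
    """Characters of markup for one shared entry plus n short links to it."""
    entry = f'<xyz:termEntry cID="{CID}">{ENTRY_BODY}</xyz:termEntry>'
    link = f'<xyz:term cID="{CID}">archipelago</xyz:term>'
    return len(entry) + n * len(link)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, inline_size(n), linked_size(n))
```

For a single occurrence the link-only form is slightly larger, because the entry and the link are both present; the saving grows with every further occurrence.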
Best regards,
Christian
-----Original Message-----
From: Yves Savourel [mailto:ysavourel@translate.com]
Sent: Wednesday, July 28, 2004 12:22 AM
To: xliff@lists.oasis-open.org
Subject: RE: [xliff][follow-up] FW: Degrees of constraint
> My purpose for using extension points would be to sprinkle
> terminology data inside the source and inside the target
> elements, and to interject localization directives,
> potentially in both as well. So, to address your question,
> yes, these may introduce some structure, although I assume
> a lean structure.
So for example (using just a random terminology
namespace, I realize it
would be leaner and different) we could possibly see
something like the
following?
<source>Our guests can appease their spirit of
adventure and itchy
feet by exploring the various islands of our small
<xyz:termEntry
link="1234ABE34FE">
<xyz:term>archipelago</xyz:term>
<xyz:pos>noun</xyz:pos>
<xyz:info>
<xyz:def>Group of
islands</xyz:def>
<xyz:pronunciation>"är-k&amp;-'pe-l&amp;-"gO,
"är-ch&amp;-</xyz:pronunciation>
</xyz:info>
</xyz:termEntry>
.</source>
Just to get an idea of what you mean by
"structure".
-ys