OASIS Mailing List Archives

topicmaps-comment message



Subject: AW: [topicmaps-comment] multilingual thesaurus - language, scope, and topic naming constraint



I never understood why XTM uses scope for languages.
ISO/IEC FCD 13250 does not recommend that.
XML itself defines the xml:lang attribute, and XML Schema defines a matching language datatype, i.e. see:
<<
[Definition:]   language represents natural language identifiers as defined by [RFC 1766]. The ·value space· of language is the set of all strings that are valid language identifiers as defined in the language identification section of [XML 1.0 (Second Edition)]. The ·lexical space· of language is the set of all strings that are valid language identifiers as defined in the language identification section of [XML 1.0 (Second Edition)]. The ·base type· of language is token.

http://www.w3.org/TR/xmlschema-2/#language
>>
The scope of GEMET is Environmental Protection, and not a bunch of languages.

So you might use:

<basename xml:lang="en">economics</basename>
etc.

I know this does not conform to XTM, but I am sorry to say that using scope for language does not conform to XML :-(.
Bernard's sample shows the problems very well. They would disappear if we used xml:lang instead of scope.
This would also modify the merging rule: ... having the same basename in the same scope ... and the same language!
(although even this does not really work, because of homonyms)
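
To make the modified merging rule concrete, here is a minimal sketch in Python. It is only an illustration under assumptions, not an XTM implementation: the names `NAMES` and `merge_groups`, the scope label "environmental-protection", and the two-letter language codes are hypothetical, and the "chat" pair is an invented homograph that is not taken from GEMET. The first four rows use the English and Spanish names from Bernard's message quoted below.

```python
from collections import defaultdict

# Hypothetical flat model: (topic id, base name, scope, language).
NAMES = [
    ("topic1", "economics", "environmental-protection", "en"),
    ("topic2", "economy", "environmental-protection", "en"),
    ("topic1", "economía", "environmental-protection", "es"),
    ("topic2", "economía", "environmental-protection", "es"),
    # Invented cross-language homograph, not from GEMET.
    ("topic3", "chat", "environmental-protection", "en"),
    ("topic4", "chat", "environmental-protection", "fr"),
]

def merge_groups(names, use_language=True):
    """Group topic ids by the key that the merging rule compares."""
    groups = defaultdict(set)
    for topic_id, name, scope, lang in names:
        key = (name, scope, lang) if use_language else (name, scope)
        groups[key].add(topic_id)
    # Only keys shared by more than one topic would trigger a merge.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

with_language = merge_groups(NAMES)
without_language = merge_groups(NAMES, use_language=False)
```

With the language in the key, the invented "chat" pair no longer collides, but topic1 and topic2 still share "economía" in the same scope and the same language. That is the homonym problem: comparing the language as well narrows the rule without removing the false merge.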


Thomas Bandholtz
XML Competence Center
SchlumbergerSema
Sema GmbH
Kaltenbornweg 3
D50679 Köln/Cologne
++49 (0)221 8299 264
 

-----Original Message-----
From: Bernard Vatant [mailto:bernard.vatant@mondeca.com]
Sent: Thursday, 31 January 2002 19:07
To: topicmaps-comments
Subject: [topicmaps-comment] multilingual thesaurus - language, scope,
and topic naming constraint


Folks

I need a little help from my friends here ...

I'm currently working on GEMET, a multilingual thesaurus published by the European
Environment Agency (over 9000 terms in 18 European languages, with references to hundreds
of sources ...) and trying to provide an XTM version ... a challenge ...

I stumbled on a case where different descriptors have the same name in some languages, and
different ones in some others.
For example, compare the two following descriptors, and their names in six languages given
by the thesaurus.

topic 1: "The social study of the production, distribution, and consumption of wealth."

DAN : økonomi
DUT : economie
ENG : economics
FRE : science économique
GER : Ökonomie
SPA : economía

topic 2: "The system of activities and administration through which a society uses its
resources to produce wealth."

DAN : økonomi
DUT : economie
ENG : economy
FRE : économie
GER : Wirtschaft
SPA : economía

It looks like English, German and French make the distinction, whereas Dutch, Danish and
Spanish clearly don't ... although I'm pretty sure they do distinguish the concepts
(social science vs economic system), but it does not show in the names the thesaurus
provides.

If I use languages as scopes - which is usual - how will the topic naming constraint
apply? Should my TM engine merge topic1 and topic2, because they have the same name in the
scope "SPA"? Does not make sense ...
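
The question can be checked mechanically with a small Python sketch. It assumes a simplified reading of the topic naming constraint (two topics that share a base name in the same scope are to be merged) and treats each language code as a single scope. The names `TOPIC1`, `TOPIC2` and `shared_scoped_names` are hypothetical; the data is the six-language table above.

```python
# Base names per language code, with the language code used as the scope.
TOPIC1 = {
    "DAN": "økonomi",
    "DUT": "economie",
    "ENG": "economics",
    "FRE": "science économique",
    "GER": "Ökonomie",
    "SPA": "economía",
}
TOPIC2 = {
    "DAN": "økonomi",
    "DUT": "economie",
    "ENG": "economy",
    "FRE": "économie",
    "GER": "Wirtschaft",
    "SPA": "economía",
}

def shared_scoped_names(a, b):
    """Return the (scope, name) pairs that both topics have in common."""
    return sorted(set(a.items()) & set(b.items()))

clashes = shared_scoped_names(TOPIC1, TOPIC2)
# Under the simplified constraint, any shared pair forces a merge.
must_merge = bool(clashes)
```

Under these assumptions `clashes` holds the DAN, DUT and SPA pairs, so a strict engine would merge the two descriptors even though the English, French and German names keep them apart.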

Suggestions?

Bernard




