topicmaps-comment message


Subject: Re: [topicmaps-comment] multilingual thesaurus - language, scope, and topic naming constraint


Thanks to all who tried to answer, both on this list and through private communications.

Now let me explain what I worked out last night - just after switching off the
computer - with that delicious feeling you have when a long-sought solution suddenly
appears obvious and crystal clear, just because you have, at last, looked at it the right
and simple way, and all the previous attempts look awkward and far-fetched.

But, be patient. A bit of history. Last year, I was investigating that question with
the Seruba research team, unfortunately swept from the scene by economic constraints. The
solution I had suggested at the time was to consider terms in different languages as n
distinct topics, independent from the abstract descriptor, itself considered topic n+1,
and then to link them all together through associations, asserting something like:
"This topic is an abstract descriptor, representing an abstract concept, independent of
any language. Those topics represent the terms used in those languages to represent this
descriptor concept".
By putting the concept and the terms on different levels of topics, we had a technical way
to manage synonymy and polysemy. But, like the solutions proposed by Kal or Tom, that was only
a trick, and I remember one of Seruba's linguists, very skeptical about it, who kept
saying to me "It works, but it does not make sense!"

And he was right! The only sustainable viewpoint is that there is no such thing as a
*concept independent of its representation by a term in a certain language*. Every
attachment of a term to a concept is always asserted in the scope of a certain language,
and every other language conveys a slightly or radically different view of the world and
organisation of concepts, and that's why linguistic diversity is so precious, and translation
so difficult ...

So we have to go back to basics: one subject = one topic.
(DAN : økonomi), (DUT : economie), (ENG : economy), (FRE : économie), (GER : Wirtschaft),
(SPA : economía) convey a priori six different concepts and views of the world, which
someone familiar with all those languages could certainly feel, even if the differences
are subtle. Hence they are six different subjects, and therefore have to be represented by
six different topics. They are not six names of the same topic in different scopes, and
definitely not variants.
And they are not even representations of the same descriptor in different languages. The 7th
topic, standing in the middle of nowhere outside of any language scope, does not make
sense, because it has no meaningful subject. Note that if you give a definition of the
descriptor, you always give it in some default language ...

So what is a descriptor, putting together those six concepts for the purpose of
cross-language communication and translation?
What do you do when you gather topics? Obvious - you build an association. And what is the
scope of that association? The scope of the language viewpoint from which you assert this
association, that is, the default language of the thesaurus ...
This association asserts that those topics can be considered as "equivalent", allowing a
translation which makes sense, maybe in a certain scope. Note that the scope is not on the
names, but on the association. And the associations are not necessarily the same if I
take the viewpoint of another language. So if I edit the thesaurus with a different default
language, I will certainly have to change the set of associations.
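
The model described above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not XTM and not a definitive implementation: the
class names Topic and Association, the helper functions, and the association type
"equivalent-for-translation" are all hypothetical, and the German-viewpoint grouping is
an arbitrary example, not a linguistic claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """One subject = one topic: a term as understood within one language."""
    language: str  # language code as written in the message, e.g. "ENG"
    term: str


@dataclass(frozen=True)
class Association:
    """Groups topics judged equivalent, asserted from one language viewpoint."""
    kind: str           # hypothetical association type name
    scope: str          # default language of the thesaurus making the assertion
    members: frozenset  # the Topic objects being grouped


# Six terms, six subjects, six topics. There is no seventh "abstract descriptor" topic.
TERMS = {
    "DAN": "økonomi", "DUT": "economie", "ENG": "economy",
    "FRE": "économie", "GER": "Wirtschaft", "SPA": "economía",
}
topics = {lang: Topic(lang, term) for lang, term in TERMS.items()}


def equivalence(scope, members):
    """Build an equivalence association; the scope sits on the association, not on names."""
    return Association("equivalent-for-translation", scope, frozenset(members))


# Thesaurus edited with English as default language: all six are grouped.
from_english = equivalence("ENG", topics.values())

# ASSUMPTION, for illustration only: edited with German as default language,
# the editor groups a different (here arbitrary) set of topics.
from_german = equivalence("GER", [topics["GER"], topics["DUT"], topics["DAN"]])


def translations(topic, target_language, associations, viewpoint):
    """Terms in target_language associated with topic under the given viewpoint."""
    return sorted(
        member.term
        for assoc in associations
        if assoc.scope == viewpoint and topic in assoc.members
        for member in assoc.members
        if member.language == target_language
    )
```

Calling translations(topics["GER"], "FRE", [from_english, from_german], "ENG") looks up
the French term through the English-viewpoint association; the same call with viewpoint
"GER" consults only the German-viewpoint association, so the answer can differ.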

That approach deeply respects the diversity of *concepts* conveyed by the different
languages. All previous approaches in fact kill linguistic diversity, if you
look at them closely, because the default language of the descriptor imposes the set of
concepts, and the other languages have to find, willy-nilly, a name for each of them.

And this is really enabled by the topic map representation.

Think about it. I've got to put all that in XTM now.

Regards

Bernard


