
topicmaps-comment message


Subject: Re: [topicmaps-comment] Challenge Part 1 - Variable Values


At 19:30 19/11/2001 +0100, you wrote:
>Hello folks
>
>A challenge in two parts, to try and prevent the community to fall
>into some long deep winter sleep.

I was looking forward to post-XML2001 hibernation! No time to sleep right
now... ;-)

>Part 1. What shall we do with variable values, and singularly
>numeric values?
>
>Let's take the following assertion, and try to turn the information it
>contains into a topic map representation.
>
>"By the end of 2001, France Telecom will control more than 30% of
>the Swedish phone market"
>
>This is no academic exercise. Mondeca works with a partner on
>data mining tools, and that is the kind of things we have to transfer
>on the fly in the data base, without losing any critical information.
>We think this kind of situations should be addressed in some
>standard or at least consensual way, that's why I put this on the
>table.
>
>We can create some topics and their classification:
>
>"France Telecom" (instanceOf "company")
>"Sweden" (instanceOf "country")
>"phone market" (instanceOf "market sector")
>
>And create a simple association linking those three guys,
>asserting that FT is present in the phone market in Sweden.
>(So far, so good)
>
>-- What shall we do with "by the end of 2001"?
>Create an ad hoc temporal topic,
>and scope the association with it?
>Doable but not very clean. And how will I merge that with other
>infos using "2001Q4" or any other time appellation?
>
>-- And what about "more than 30%"? It sounds silly to create a
>topic "more than 30%".
>Assuming we do that, how do we use it? As another member in the
>association? But what does that mean? When processing this
>topic, we'll catch together all associations where "something is
>more than 30% of some other thing". Speak about semantics ...
>
>The only proper way to do it seems to reify the association, and
>attach to it this "more than 30%" as an occurrence of a definite
>type, through a <resourceData> element.
>
>Bottom line(s).
>
>1. Is treating values through <occurrence> and <resourceData> to
>be systematic for any kind of numeric assertion?
>2. Should not there be some annex to the spec recommending
>standard practises for handling numeric values?

I think that the reification approach you suggest is a valid one, but I
can also think of an alternative. The association that you are expressing
here does not fully capture all elements of the statement (as you have
pointed out). I think that in this case it is probably better to make a
richer association that captures the whole statement. This is the kind of
statement that analysts make all the time:

Some Event X will occur in Timeframe Y.

Let's call this association type "AnalystPrediction".

Examining further, the player X must somehow express : "France Telecom will 
capture > 30% of the telecoms market in Sweden". So here is another 
association of type "CaptureMarketShare" with three players
Company X will capture Share Y of Market Z
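
To make the nesting concrete, here is a rough sketch in Python (not XTM). The Topic and Association classes and the role names "Event", "Timeframe", "Company", "Share" and "Market" are purely illustrative assumptions on my part, not anything defined by the spec. The point is only that a role player can itself be an association:

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass(frozen=True)
class Topic:
    """A bare topic, identified here only by a name (illustrative)."""
    name: str


@dataclass
class Association:
    """An association whose role players may be topics or other associations."""
    type: str
    roles: Dict[str, Union[Topic, "Association"]] = field(default_factory=dict)


# Inner association: Company X will capture Share Y of Market Z
capture = Association(
    type="CaptureMarketShare",
    roles={
        "Company": Topic("France Telecom"),
        "Share": Topic("> 30%"),
        "Market": Topic("telecoms market in Sweden"),
    },
)

# Outer association: Some Event X will occur in Timeframe Y
prediction = Association(
    type="AnalystPrediction",
    roles={
        "Event": capture,
        "Timeframe": Topic("end of 2001"),
    },
)

print(prediction.roles["Event"].roles["Company"].name)
```

Whether a real topic map engine lets an association play a role directly, or needs it reified as a topic first, is a separate question that this sketch glosses over.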

Now, I would argue that "the telecoms market in Sweden" is a single topic 
in its own right - "Swedish Telecoms Market" (rather than an association 
between a vertical market segment and a geographical area). If you choose 
to see it differently, then Z must be an association, but I'm happy with it 
as a topic for my purposes ;-)

The topic Y is a quantitative topic which leads us into some hot water. 
However, I would think that one possible way would be to simply create a 
topic for "30%", and make it play a role of "MinimumShare" in a 
"CaptureMarketShare" (we might then add a role of "MaximumShare" to our 
"schema" for the topic map, to allow the expression of ranges). Finally we 
are left with the question of representing the datum '30' (or '30 
percent'). As this is a point on a range, perhaps you could consider HyTime 
range addressing suitably munged into URN as a subject ? (Actually, that 
thought leaves me somewhat cold...;-). However, I do think that it might be 
useful to express a numeric or range value as the subject of a topic. 
Thinking further, if there was a standard way of doing that, it is 
something which processing software might be able to take advantage of in 
order to better index such topics (and so answer questions such as "What 
predictions of market share capture for France Telecom lie in the range 
10-50% ?").
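
As a rough illustration of the kind of indexing I mean, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the SharePrediction record, the convention that a missing bound means 0% or 100%, and the reading of "lie in the range" as "overlaps the range" are mine, not anything a topic map processor is known to do.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SharePrediction:
    """A flattened view of one CaptureMarketShare prediction (illustrative)."""
    company: str
    market: str
    min_share: Optional[float] = None  # percent; None means no stated minimum
    max_share: Optional[float] = None  # percent; None means no stated maximum


def overlaps(p: SharePrediction, low: float, high: float) -> bool:
    """True if the predicted share interval overlaps [low, high]."""
    p_low = p.min_share if p.min_share is not None else 0.0
    p_high = p.max_share if p.max_share is not None else 100.0
    return p_low <= high and p_high >= low


def predictions_in_range(
    preds: List[SharePrediction], company: str, low: float, high: float
) -> List[SharePrediction]:
    """Select predictions for one company whose share range overlaps [low, high]."""
    return [p for p in preds if p.company == company and overlaps(p, low, high)]


# Sample data: only the first entry comes from the statement under discussion;
# the other two are invented to exercise the filter.
predictions = [
    SharePrediction("France Telecom", "Swedish Telecoms Market", min_share=30.0),
    SharePrediction("France Telecom", "Hypothetical Market A", min_share=60.0),
    SharePrediction("Hypothetical Co", "Swedish Telecoms Market", min_share=20.0),
]

hits = predictions_in_range(predictions, "France Telecom", 10.0, 50.0)
for p in hits:
    print(p.market, p.min_share, p.max_share)
```

Note that this only works because the minimum and maximum are held as numbers rather than as opaque topic names, which is really the argument for a standard way of expressing a numeric value as a subject.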

Next we have the issue of attribution of the statement...I think it is 
probably clear to you where my mind is leading me on that...but I'll save 
it for a separate reply...

Cheers,

Kal

---------------------------------------------------
Kal Ahmed
XML and Topic Map Consultancy
e: kal@techquila.com
w: www.techquila.com
p: +44 7968 529531
---------------------------------------------------

1) Reify the association:
By reifying the three-way association, we can, as you suggest, attach an 
occurrence of type "market share" (for example).

>Cheers
>
>Bernard
>
>
>Bernard Vatant - Consultant
>bernard.vatant@mondeca.com
>Mondeca - "Making Sense of Content"
>www.mondeca.com
>
>
>----------------------------------------------------------------
>To subscribe or unsubscribe from this elist use the subscription
>manager: <http://lists.oasis-open.org/ob/adm.pl>


