entity-resolution message

Subject: Re: Catalog Requirements

Hi Paul:

I basically support having within our scope any mapping that brings new 
information to document parsing, assembly, or rules.

I have mixed Public and System entity sets in a catalogue without problems when 
writing proprietary systems, and reckon that it could be a great benefit if 
these were understood in a public catalogue context.

That has looked something like this:

<!DOCTYPE type [

<!ENTITY % set1 PUBLIC "name of set1">

<!ENTITY % set2 PUBLIC "name of set2">

<!ENTITY % set3 SYSTEM "name of set3">

%set1; %set2; %set3;

]>

I don't know if the complexity of execution is a consideration of our scope or 
not.  Complexity is a relative issue, because what may be complex in terms of 
system setup may facilitate the ability to process meaningful information-rich 
data communications.  I do think that we should look for the best methodology 
for defining the entities that will pass the best - richest - information.

That indicates that I support the possibility that we may end up with URI>URI 
mapping, because that may end up being the best way to make sure that an entity 
list always stays up to date with industry innovation.  If the right hand side 
of that equation were to be a database of entities - like xml/EDI - 
that were always going to be "valid", and were part of an expanding set, then I 
think it useful for us to map to them.  I think that it would be a useful 
addition to the lexicon of XML if these entities were keyworded to indicate 

At the same time, I reckon that there will be a great need to have a mix of 
Public and System entities, especially where some entity sets are proprietary 
(e.g. auto makers have their own entity sets) and others are public (e.g. 
xml/EDI, even ISOnum, etc.), and both need to be available to adequately 
parse the data/doc instance.  I feel that it is still appropriate to consider 
that PUBLIC and SYSTEM are not obsolete designations in this context.

So, I have a very inclusive view of what the scope of our task is to be.  Even 
if we do extend TR9401 by a bit, ontologically speaking.

David Leland

Paul Grosso <pgrosso@arbortext.com> on 11/20/2000 10:35:29 AM
To: entity-resolution@lists.oasis-open.org
cc:  (bcc: David Leland/LONDON/FINANCIAL TIMES)
Subject: Re: Catalog Requirements

At 18:00 2000 11 19 -0800, Lauren Wood wrote:
>On 16 Nov 2000, Terry Allen wrote:
>> I don't believe I am mixing layers; I believe that the contextual
>> semantics (e.g., SYSTEM vs PUBLIC vs ENTITY) of the socat are obsolete 
>> and that the 2 mapping problems are at exactly the same layer.  But I
>> accept your argument that URI>URI mapping is out of scope for
>> this TC, so I won't press the point.  
>I don't accept this. I think URI>URI mapping is in scope; one 
>example is the href URI on the stylesheet PI for which we have no 
>solution currently (it's not a SYSTEM ID). This is an example of the 
>new XML features which are part of our statement of purpose.

I think we are closer to agreeing on what we want but disagreeing
on vocabulary.

If you want to map the thing in a stylesheet PI, then we should
add a STYLESHEET entry type.  That is functional/contextual.  We
are not mapping arbitrary URIs (which I still think is out of scope).

Maybe an example would help explain what I'm saying.

document instance

<?xml version="1.0"?>
<!DOCTYPE foo PUBLIC "xxx" "yyy" [
<!ENTITY bar SYSTEM "xxx">
]>
<?xml-stylesheet href="xxx" type="text/css"?>
<foo xmlns="xxx"/>

possible catalog
PUBLIC          "xxx"   "foo.dtd"
SYSTEM          "xxx"   "bar.xml"
STYLESHEET      "xxx"   "stylesheet.css"
NAMESPACE       "xxx"   "foo namespace name"
URI             "xxx"   "huh?"

Note that there are four occurrences of the URI "xxx" in
the document instance and four corresponding entries in
the catalog without there being any ambiguity as to which
entry is applicable in each case.  This is what I'm calling
functional/contextual mapping.  Sure the left hand side
might be a URI (or might not), but this isn't pure, context
free URI>URI mapping which I claim is out of scope in an
entity management catalog (and, in fact, belongs in a lower
layer, because one may well want to remap the URI that comes
*out* of a catalog mapping).

I hope my example makes clear why the last entry in the catalog 
above is ambiguous and problematic, and it isn't needed to address 
anything we really have to do in xmlcat version 1.0.
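
To make the functional/contextual idea concrete, here is a minimal sketch of a lookup keyed on both the entry type and the identifier. The dictionary layout and the name `resolve` are assumptions made for illustration only; they are not part of TR9401 or of any proposed xmlcat syntax. The bare URI entry is left out on purpose, since no context in the document instance selects it.

```python
from typing import Optional

# Hypothetical in-memory catalog keyed by (entry type, identifier).
# The entry types and values mirror the example catalog above; the data
# structure itself is an illustrative assumption, not a specified format.
CATALOG = {
    ("PUBLIC", "xxx"): "foo.dtd",
    ("SYSTEM", "xxx"): "bar.xml",
    ("STYLESHEET", "xxx"): "stylesheet.css",
    ("NAMESPACE", "xxx"): "foo namespace name",
}


def resolve(context: str, identifier: str) -> Optional[str]:
    """Return the mapped value for an identifier seen in a given context.

    The context (where the identifier occurred in the document) picks the
    entry, so the same string "xxx" can map to four different targets.
    Returns None when no entry exists for that context.
    """
    return CATALOG.get((context, identifier))


if __name__ == "__main__":
    for context in ("PUBLIC", "SYSTEM", "STYLESHEET", "NAMESPACE", "URI"):
        print(context, "->", resolve(context, "xxx"))
```

Because the key includes the context, the four uses of "xxx" never collide. A context-free URI entry has no such key to hang on, which is the ambiguity described above.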


