entity-resolution message


Subject: Re: [entity-resolution] SYSTEM identifier use case


Paul wrote: [apologies for the formatting; I had been unsubscribed 
again and copied this from the archives]

    * From: Paul Grosso <pgrosso@arbortext.com>
    * To: entity-resolution@lists.oasis-open.org
    * Date: Fri, 09 Jan 2004 12:13:18 -0600


At 10:00 2004 01 09 -0800, Lauren Wood wrote:
>This use case came up while I was processing the proceedings files 
>for the XML conference. In this case, I have several files, each in 
>their own directory, and the DTD and associated other files in 
>another directory, so we have 
>01-01-01/01-01-01.xml
>01-02-01/01-02-01.xml
>dtd/gcapaper.dtd
>dtd/isoamsa.ent
>etc
>
>Because different people author the XML files, and the system ID in 
>the file always ends up different, with only the "gcapaper.dtd" at 
>the end being the same, I want some way to say "only look at the 
>last bit of the system URI, the gcapaper.dtd, and match that to 
>../dtd/gcapaper.dtd". Unfortunately, SAX always sends the absolutized
>URI, and this causes problems (I get too much of the directory path).
>Norm implemented an extension to his resolver which does what I want:
>    <ext:systemSuffix suffix="gcapaper.dtd" uri="../dtd/gcapaper.dtd"/>
>but of course it isn't widely implemented.
>
>Is this a use case that other people also think is worth covering in
>the spec?

Perhaps.

But I note that in your use case, you don't really care about the
system id at all.  You really want to say "ignore the system id,
and as long as the DOCTYPE name is gcapaper, use dtd/gcapaper.dtd"
and this is precisely what the DOCTYPE entry in TR9401 does.

So perhaps we should add the DOCTYPE entry from TR9401 to XML 
Catalogs.

paul
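
For reference, a TR9401 catalog entry of the kind Paul describes might
look roughly like the line below. This is only a sketch: the path is
taken from the directory layout in Lauren's message, and how a relative
path is resolved depends on where the catalog file lives.

```
DOCTYPE gcapaper "dtd/gcapaper.dtd"
```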

That would work if I only cared about the DTD, but I also care about 
the entity files (I just left those out for clarity), and I can 
imagine other types of processing caring about other types of files. 
So although adding the DOCTYPE entry would solve part of the problem, 
it's only part of it, and a more general solution would be useful, for 
this use case at least.

Lauren
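
The suffix matching discussed above can be sketched in a few lines of
Python. This is an illustration of the idea only, not Norm's
implementation: the rule table, the function name resolve_by_suffix,
the longest-suffix-wins tie-break, and the choice to resolve the mapped
URI against a caller-supplied base are all assumptions made for the
example.

```python
from typing import Optional
from urllib.parse import urljoin

# Hypothetical rule table of (suffix, replacement URI) pairs, modelled on
# the ext:systemSuffix entry quoted in the thread. The second rule is an
# assumed entry for one of the entity files.
SUFFIX_RULES = [
    ("gcapaper.dtd", "../dtd/gcapaper.dtd"),
    ("isoamsa.ent", "../dtd/isoamsa.ent"),
]


def resolve_by_suffix(system_id: str, base_uri: str,
                      rules=SUFFIX_RULES) -> Optional[str]:
    """Map an (already absolutized) system identifier by its suffix.

    Returns the replacement URI resolved against base_uri, or None when
    no rule matches. If several suffixes match, the longest one wins
    (an assumption of this sketch).
    """
    best = None
    for suffix, uri in rules:
        if system_id.endswith(suffix):
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, uri)
    if best is None:
        return None
    return urljoin(base_uri, best[1])
```

With a rule like this, whatever directory path an author's system
identifier happens to carry is ignored; only the trailing file name
decides which local file is used, and the same mechanism covers the
entity files as well as the DTD.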


