tm-pubsubj-comment message


Subject: Fwd: [tm-pubsubj-comment] More grumbling

I think Bernard meant to send this to the list, so here it is.

--- Begin Message ---
> * Bernard Vatant
> |
> | I suppose that means capacity for search engines to distinguish the
> | "signal" of PSIs among "noise" of billions of ordinary URIs, which
> | is something like SETI search. What should we look for? I agree it
> | has to be more explicit.

* Lars Marius Garshol
> In that case, probably what we should do is to establish some way to
> assert that a certain resource contains a PSI set using XTM, RDF, and

Good idea. Are you thinking of having some kind of "format slot" inside the URI?

> For HTML I think we should take a close and serious look at HTML
> metadata profiles. I think they would be simple enough for people to
> use, yet formal and powerful enough for what we need. See
>   <URL: http://www.w3.org/TR/html401/struct/global.html#h- >

We should indeed explore that, but it is a separate matter from the structure of the URIs
themselves.

The semantics of PSIs may be carried at three levels:
1. In the structure and syntax of the URI string itself
2. In the Subject Indicator metadata
3. In the Subject Indicator "body" (full text description or other kind of
human-interpretable stuff)

OTOH semantics for computers and semantics for humans are different, and we need both, so
what we have to work out clearly is which one we want to put where. If I understand you
correctly, you would be happy with:

No semantics at all in 1.
Semantics for both humans and computers in 2, in the form of metadata.
More semantics for humans in 3.

Steve's proposal is to also put some sort of semantics for humans and search engines in 1.

> | Indeed. We should be less fuzzy on that to reduce the noise. If we
> | recommend the use of the token "psi", we maybe should recommend a
> | more precise use of it, like its position in the URI string, and
> | maybe recommend a whole standard structure like:
> | http://psi.myorg.foo/scope/subject.html
> That means you have to own a domain in order to be able to create a
> PSI, and I don't think we really want that. It should be enough to own
> webspace in order to be able to create a URI. So I would very much
> prefer some form of structured metadata, as described above.

I'm not sure what we want in that respect. Do we want *anybody* to be able to publish PSIs?
Isn't that somewhat in contradiction with the need for stability and trust? This is
something we have to think about more seriously.
My view is that if publishers are really serious about PSIs, they should own dedicated
domains. But we have to discuss that.

You would then prefer a recommended structure with a psi folder, like


> | IMO what we want to achieve is to allow search engines - and also
> | humans - to distinguish, with as little noise as possible, URIs that
> | are declared PSIs from those that are not. Of course we will not get
> | rid of all the noise, but we can bring it down to a reasonable level
> | by recommending a given URI structure.
> Aha, so this is about humans, too, is it? I guess that means we should
> come up with some text that can be put in a PSD to identify the
> intention that it serve as a PSD.

That is yet another story: a declaration of intention belongs in 2 or 3 above. No, what I
wish is that the very structure and syntax of the URI itself may identify it as a declared
PSI with good reliability, though it can't be 100% of course. Whether that is technically
sustainable, and how, is the issue.
So search engines (and humans as well) could easily retrieve candidate PSIs simply by
looking at their URIs, and confirm them by looking at the required content and metadata in the
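
To make the idea of spotting candidate PSIs from the URI alone more concrete, here is a
minimal sketch in Python. It is only an illustration under stated assumptions: the function
name psi_candidate, the rule that "psi" must be the leftmost host label or a non-final path
segment, and the example.org URL are all hypothetical and not part of any recommendation.
A match would only mark a URI as a candidate, still to be confirmed from the content and
metadata of the resource.

```python
from typing import Optional
from urllib.parse import urlsplit


def psi_candidate(uri: str) -> Optional[str]:
    """Heuristic check of a URI against two hypothetical PSI conventions.

    Returns "host" if "psi" is the leftmost host label and the path has at
    least a scope and a subject segment (http://psi.myorg.foo/scope/subject.html),
    "path" if "psi" appears as a folder in the path, and None otherwise.
    A non-None result is only a hint, never a proof.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https"):
        return None

    host_labels = (parts.hostname or "").lower().split(".")
    # Non-empty path segments, e.g. "/scope/subject.html" -> ["scope", "subject.html"]
    segments = [s.lower() for s in parts.path.split("/") if s]

    # Convention A (assumed): dedicated "psi" host, then scope/subject.
    if host_labels[0] == "psi" and len(segments) >= 2:
        return "host"

    # Convention B (assumed): a "psi" folder somewhere before the last segment,
    # which works for publishers who own webspace but not a domain.
    if "psi" in segments[:-1]:
        return "path"

    return None


if __name__ == "__main__":
    for uri in (
        "http://psi.myorg.foo/scope/subject.html",
        "http://www.example.org/psi/fruit/apple.html",  # invented example
        "http://www.example.org/about.html",
    ):
        print(uri, "->", psi_candidate(uri))
```

The sketch accepts both the dedicated-domain form and a folder form so that it does not
take sides on whether owning a domain should be required.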

> | [human-interpretable, yet stable]
> |
> | Could you explain exactly where you think the contradiction lies?
> I explained it below that statement: something that is meaningful to
> humans is something humans are likely to want to change at some point.

Is it really an issue? Remember "Big Oak Cross" and "Good PSIs never die" discussions ...

> * Lars Marius Garshol
> |
> | Steve's and Bernard's response to that was that in some cases you
> | will know up front that this is not going to be a problem. There are
> | two responses to that.
> |
> | The first is that if that were true it is still a problem that this
> | paragraph provides too little guidance on how to tell those cases
> | apart. To be really effective we should provide more guidance.
> * Bernard Vatant
> |
> | I think the example should show that better than any abstract prose
> | here.
> Well, I don't think the "apples and oranges" example is actually going
> to help anyone figure out when to use human-interpretable identifiers
> and when not to.


> And I agree that it certainly is difficult to come up with some
> abstract prose that provides useful guidance on when to use meaningful
> identifiers and when not to, but I am not at all convinced that doing
> so is within the scope of this TC.

Why not? Who's gonna do that reflection if we don't? ;-)

> I think the best we can do is to either leave this issue alone
> completely, or else to write a general "best practices" document and
> cover this as one of the issues there.

You mean just setting out the issue in the stable requirements and recommendations document,
and having, alongside it, a sort of living document where those kinds of tricky and
undecidable questions, and pragmatic best practices about them, would be worked out from
real use cases? I like this idea.


--- End Message ---

Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
ISO SC34/WG3, OASIS GeoLang TC        <URL: http://www.garshol.priv.no >
