Subject: Re: [oic] Interpretation of terminal "/" in URIs and Zip


Hi Dennis,

At first I really did not understand the problem, but I have now
consulted my colleague Michael Stahl, who pointed it out to me, and now
I do see a problem.

He pointed me to the second bullet of:
http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.2.3

I will explain in detail below.


On 03/01/10 21:31, Dennis E. Hamilton wrote:
> 
> The last time I checked there is no provision in the ODF specification for
> access to components of a package from outside of the package.  Nothing we
> are doing here has to do with such references.  Furthermore, it is not up to
> us to know how references from outside are to be handled.  It has nothing to
> do with the ODF Format itself or the compliance of consumers and producers.
> 

It is certainly up to us (one of the TCs) to solve the problems users
have when they are working with ODF.
One problem, not necessarily a specification problem, is that users are
currently not able to access containers within the ODF package format,
which makes ODF quite useless as a default information package.

Let's see what ODF 1.2 part 3 specifies:
http://docs.oasis-open.org/office/v1.2/part3/cd01/OpenDocument-v1.2-part3-cd01.html#a_2_7_Usage_of_IRIs_Within_Packages
refers to
http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1
As the base URI is usually not written explicitly in the package, the
base URI of the package is the Package Retrieval URI:
http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1.2

The retrieval URI for a test document on your site might be:
http://orcmid.com/blog/blog.odt

Valid URLs referring to the package and into it are:
http://orcmid.com/blog/blog.odt/Pictures/HappyODF.png
http://orcmid.com/blog/blog.odt/Pictures/
http://orcmid.com/blog/blog.odt/Pictures
http://orcmid.com/blog/blog.odt

(Of course they won't work in a browser as long as there is no web server
extension that handles an ODF package like a directory.)

The base URIs for the above cases differ:
  http://orcmid.com/blog/blog.odt/Pictures/
  http://orcmid.com/blog/blog.odt/Pictures/
  http://orcmid.com/blog/blog.odt/
  http://orcmid.com/blog/

Due to http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.2.3 the base URIs
of the following two URIs differ:

URI:      http://orcmid.com/blog/blog.odt/Pictures/
BASEURI:  http://orcmid.com/blog/blog.odt/Pictures/

URI:      http://orcmid.com/blog/blog.odt/Pictures
BASEURI:  http://orcmid.com/blog/blog.odt/

This is because 5.2.3 excludes any characters after the right-most "/" in
the base URI path.
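
The effect of the terminal "/" can be checked with any RFC 3986 resolver.
Below is a minimal sketch using Python's urllib.parse.urljoin. The helper
name base_directory is only illustrative, and resolving "." is used here as
a shortcut for "drop everything after the right-most '/' in the path":

```python
from urllib.parse import urljoin


def base_directory(uri: str) -> str:
    """Return the URI with everything after the right-most "/" of its path removed."""
    # Resolving the relative reference "." performs the merge step of
    # RFC 3986 section 5.2.3 and leaves only the "directory" part.
    return urljoin(uri, ".")


with_slash = "http://orcmid.com/blog/blog.odt/Pictures/"
without_slash = "http://orcmid.com/blog/blog.odt/Pictures"

# The two spellings yield different bases ...
print(base_directory(with_slash))
print(base_directory(without_slash))

# ... so the same relative reference lands in different places.
print(urljoin(with_slash, "HappyODF.png"))
print(urljoin(without_slash, "HappyODF.png"))
```

With the terminal "/" the relative reference stays below Pictures/; without
it, "Pictures" is treated as the last segment and is replaced.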

Therefore I think it is now worth filing an issue.

Thanks for your patience,
Svante

> The rules in the ODF specification are entirely about relative references
> from Package files to other Package Files and subdocuments in the same
> package.  Furthermore, there are no entries that correspond to directories
> in the Zip model.  We only have them in the ODF Package manifest and they
> are an artificial construct introduced solely to distinguish subdocuments
> and provide their MIME types.  (When Zip is used to transport hierarchical
> folders of files between users, it is the Zip producer and the Zip consumer
> software that provides a mapping to file-system structures of the Zip
> content.  This is a popular out-of-band convention that is not part of the
> Zip specification at all.)
> 
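
The point that a "folder" exists only as part of a file name can be seen with
a small experiment. This is a sketch using Python's standard zipfile module
on an in-memory archive; the entry names and contents are made up for
illustration and do not form a real ODF package:

```python
import io
import zipfile

buffer = io.BytesIO()

# Store two files; one has a "/" in its name. No directory is created first.
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("content.xml", b"<placeholder/>")
    package.writestr("Pictures/HappyODF.png", b"not a real image")

# Read the archive back and list the stored names.
with zipfile.ZipFile(buffer) as package:
    names = package.namelist()

print(names)

# No stored entry stands for the "Pictures/" folder itself.
has_directory_entry = any(name.endswith("/") for name in names)
print(has_directory_entry)
```

The archive holds exactly the two names that were written. Whether a consumer
later presents "Pictures/" as a folder is up to that consumer.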
> In the context of ODF Packages, what there is to worry about is the actual
> form of the Zip file name on a Package file, how that relates to the
> manifest:full-path attribute value for Package files, how subdocuments and
> the overall document itself are reflected in the manifest by a separate
> non-colliding method, and then how *relative* URIs are to be resolved
> against the manifest to Package sub-documents and, separately, Package
> Files.   
> 
> I *will* file JIRA issues against ODF 1.2 Part 3 CD01 on this topic.
> 
>  - Dennis
> 
> PS: The RFCs for URIs and IRIs make it clear that having path segments pass
> through different technologies requires profiling for those specific
> technologies, as when going into a Package from outside, going outside of a
> Package from inside, and certainly going out across and into another
> package, if the second were ever provided for in the ODF specification. (It
> is forbidden in ODF 1.1, I will have to check what it says now for ODF 1.2.)
> We would do well to make sure we are following those RFCs in this matter.
> 
> PPS: We do have an interesting problem concerning the understood base that
> relative URIs are expected to be resolved against, since from inside the
> resource (the ODF Package) we have no clue what its URI is, if any, as a
> resource strictly for resolving an internal relative reference.  Since the
> document is mobile and may even be processed from a stream, there is no
> fixed base that makes sense for resolution unless there is a manufactured
> immutable one that travels with the document and is known internal to the
> package.  Since this matters a great deal for internal cross-references with
> RDF XML, we might need to think about this a lot harder.  At the moment,
> anything that extracts embedded RDF that uses relative resource references
> into Package Files will be hard-pressed to maintain those references to
> anything at all.  I suspect you have run into that.
> 
> -----Original Message-----
> From: Svante.Schubert@Sun.COM [mailto:Svante.Schubert@Sun.COM] 
> Sent: Monday, March 01, 2010 09:42
> To: dennis.hamilton@acm.org
> Cc: OIC TC List
> Subject: Re: [oic] Interpretation of terminal "/" in URIs and Zip
> 
> Dennis,
> 
> Is this an issue to be discussed or is it settled?
> Frankly speaking, I cannot see a problem. The common server behavior is 
> not standardized, nor do I see that it influences us to limit the 
> definition of URLs within the ODF package.
> On the other hand, I do see a problem when narrowing existing standards 
> such as URL/URI for the package. By defining extra constraints, the transition 
> of a URL/URI from outside the package into it becomes more complicated for 
> the ODF user.
> 
> Thanks,
> Svante
> 
> 
> On 02/11/10 20:44, Dennis E. Hamilton wrote:
>> Although this is an important discussion for ODF 1.2 Part 3 Package
> cleanup, it came up on our OIC call Wednesday and I wanted to follow up here
> first.  I will repost it to the ODF TC list and any discussion should be
> over there.
>> Let's see if this helps make sense of things.
>>
>> Short answer (11, below): 
>>
>> For me, it follows naturally that one would refer to a subdocument in a
> URI using a form that resolves to a full-path that matches "prefix/", and
> for which there is a manifest:full-path="prefix/" file entry.  This is then
> completely distinct from resolving to a package file that might happen to
> have the full-path "prefix" (no ending "/"), something which has not, so
> far, been prohibited in the ODF 1.2 Part 3 specification and which is
> definitely not prohibited in the URI RFC specifications.
>>  - Dennis
>>
>> ANALYSIS
>>
>>  1. First, it is important to understand that there are three features in
> the resolution of URIs to resources:
>>   a. The client or user agent that is processing a supplied URI
>>
>>   b. The URI a service receives from a user agent or client (not
> necessarily identical to the one in (a))
>>   c. The specific response that the URI of (b) elicits from the service,
> including any indication of how the URI was understood 
>>  2. Here's a simple exercise:
>>
>>  a. Open a browser
>>
>>  b. Give it this *exact* URI as the address to visit:
> "http://orcmid.com/blog", making sure there is no "/" on the end.
>>  c. Go to that location.
>>
>>  d. Notice the URI that is now shown in the address bar of the browser.
> For me, it is "http://orcmid.com/blog/".  
>>  3. What happened here?
>>
>>  4. The web server, when given "http://orcmid.com/blog", detects that it
> has no resource at its domain-resolved location, "blog", but it has a
> directory at that location.  So the server performs its default behavior for
> directory "blog/" *and* it effectively returns the corrected URL
> "http://orcmid.com/blog/" to indicate how it resolved the request.  As you
> know, some servers return indexes of their directories when they have
> requests like that, some refuse to resolve the request, and others return a
> default page.  In my case, you are seeing the default page
> "http://orcmid.com/blog/default.asp".
>>  5. It is also the case that there might be a resource for
> "http://orcmid.com/blog" as well as a directory.  This is entirely
> permissible although most web servers don't have such functionality.  The
> same goes for file systems, where there might be a resource or data stream
> having the same name as a directory, and there might actually be content at
> the directory level separate from the content *within* the directory.  The
> rules for URLs tolerate these variations.  The WebDAV folks had to deal with
> this because in WebDAV you can create and set web resources and also create
> collections (usually implemented in directories).  They needed ways for
> users of WebDAV to have the various cases work.
>>  6. Now we get to the problem with resources (package files) and
> subdocuments (sets of one or more package files) and the mappings between
> URI, manifest:full-path, and the actual content names in Zip packages.
>>  7. First, for URIs (not file-system addresses), the appearance of ".",
> "..", "./", "../", etc. are generally client-side or user-agent material (as
> are #fragment IDs, usually).  These are typically not sent to services in
> http requests for example.  Instead, the relative URI is turned into an
> absolute one in some way, and the absolute URI is communicated to the
> service.  The URI RFCs describe the convention for this.
>>  8. Now look at the full-path rules, such as they are.  It is simple to
> infer that a subdocument is a set of package files where the
> manifest:full-path entries for those package files have a common prefix,
> "prefix/" which is given as the manifest:full-path of the entry that
> provides the MIME type for the subdocument.  This file entry has no package
> file (there is nothing in the Zip with that name) and there is no
> encryption, of course.  
>>     It is not prohibited by the rules for URI resolution, and it certainly
> doesn't impact the use of Zip (which is indifferent to "prefix/" as a
> prefix), for there to also be a package file with
> manifest:full-path="prefix" (without the terminal "/").  This is not a
> subdocument and it is a perfectly good name for a package file.
>>  9. Basically, we could say that contents of a package are all of the
> package files that have names of the form "prefix/more" and for which there
> is a manifest:full-path="prefix/" file entry.
>>  10. There is one exception to (9).  For the document itself, reflected
> in manifest:full-path="/" (prefix is empty), the document parts are not
> identified with manifest:full-path="/more" because package files are not
> permitted to have names that begin with "/".  (That is a requirement of the
> Zip specification.)
>>  11. For me, it follows naturally that one would refer to a subdocument in
> a URI using a form that resolves to a full-path that matches "prefix/", and
> for which there is a manifest:full-path="prefix/" file entry.  This is then
> completely distinct from resolving to a package file that might happen to
> have the full-path "prefix", something which has not, so far, been
> prohibited in the ODF 1.2 Part 3 specification.
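
Points 8 to 11 can be sketched as a small classification over
manifest:full-path values. This is only an illustration of the reading given
above, not text from the specification. The function names and the sample
manifest are made up, and nested subdocuments are not handled:

```python
def split_manifest(full_paths):
    """Group package files under the subdocument prefix ("prefix/") they share."""
    prefixes = sorted(p for p in full_paths if p.endswith("/") and p != "/")
    files = [p for p in full_paths if not p.endswith("/")]
    subdocuments = {
        prefix: [f for f in files if f.startswith(prefix)] for prefix in prefixes
    }
    # Files under no "prefix/" entry belong to the document itself ("/").
    root_files = [f for f in files if not any(f.startswith(p) for p in prefixes)]
    return subdocuments, root_files


def kind_of(full_paths, target):
    """Say what a resolved full-path refers to: the terminal "/" decides."""
    if target not in full_paths:
        return "no entry"
    if target == "/":
        return "document"
    return "subdocument" if target.endswith("/") else "package file"


# Hypothetical manifest: "Object1/" is a subdocument, "Object1" a plain file.
manifest = ["/", "content.xml", "Object1/", "Object1/content.xml", "Object1"]

subdocuments, root_files = split_manifest(manifest)
print(subdocuments)
print(root_files)
print(kind_of(manifest, "Object1/"))
print(kind_of(manifest, "Object1"))
```

In this sketch "Object1/" and "Object1" never collide, because only the entry
with the terminal "/" is treated as a subdocument.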
>>
>>  - Dennis
>>
>> Dennis E. Hamilton
>> ------------------
>> NuovoDoc: Design for Document System Interoperability 
>> mailto:Dennis.Hamilton@acm.org | gsm:+1-206.779.9430 
>> http://NuovoDoc.com http://ODMA.info/dev/ http://nfoWorks.org 
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe from this mail list, you must leave the OASIS TC that
>> generates this mail.  Follow this link to all your TCs in OASIS at:
>> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
>>