RE: [dita] mime type for DITA?

My main question is, what is the use case that is pushing us in the direction of creating a mime type? Are we considering this because of some specific need or are we considering this because it seems like it might be a good thing to do?

-Jeff

From: Erik Hennum [mailto:ehennum@us.ibm.com]
Sent: Tuesday, April 08, 2008 10:34 AM
To: Ogden, Jeff
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] mime type for DITA?

Hi, Jeff and Paul:

Unfortunately, I won't be able to attend the TC this morning (at the CMS conference). I wanted to keep this thread alive.

Thanks for the sound (as usual) analysis. I can see the point of:

* A single application/dita+xml mime type for all documents that use DITA specialization.
* A format of topic or map to distinguish the two distinct documents.

I don't see the rationale for breaking out ditabase as a separate format. It still has topics -- just a list instead of a single root topic.
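
As a rough illustration of the single-type-plus-format idea, here is a minimal Python sketch of how a receiver might read such a header. The format and type parameter names follow the proposals in this thread; none of this is registered anywhere, and the function name describe_dita is made up for the example.

```python
from email.message import Message


def describe_dita(content_type_header):
    """Return the proposed DITA parameters from a Content-Type value.

    Returns None when the media type is not application/dita+xml.
    Parameters that are absent from the header are reported as None
    rather than defaulted, since the defaults are still under discussion.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    if msg.get_content_type() != "application/dita+xml":
        return None
    return {
        "format": msg.get_param("format"),  # e.g. topic or map (proposed)
        "type": msg.get_param("type"),      # e.g. concept, bookmap (proposed)
    }


if __name__ == "__main__":
    header = "application/dita+xml; charset=utf-8; format=map; type=bookmap"
    print(describe_dita(header))
```

The point of the sketch is only that one registered type plus optional parameters lets a receiver decide how much detail it cares about without opening the document.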

The type property could reflect either:

* The type of the root topic, first topic (for ditabase), or map by using the class attribute (replacing spaces with a separator)
* The shell

The advantage of declaring the shell is that it reflects not only the root type but also nested topics or topics after the first for ditabase as well as domains in either topic or map. However, the shell identifier tells us nothing about the type modules that went into the shell, so there's no basis for fallback unless the receiver understands the type hierarchy of the document.

The advantage of declaring the type is that fallback becomes possible for the root type. (If the document is declared as javaClass and the receiver understands apiClassifier, the receiver can still process the topic.) However, there's no way to declare domains or subtopics.
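
To make the fallback idea concrete, here is a small Python sketch that assumes the type is carried as the root element's class attribute value. The sample class value and the helper names type_chain and pick_fallback are illustrative assumptions, not anything defined by the specification.

```python
def type_chain(class_attr):
    """List element types from most specific to most general.

    A DITA class attribute looks like
    "- topic/topic reference/reference ... javaClass/javaClass":
    a leading marker followed by module/element pairs, ancestors first.
    """
    pairs = [token for token in class_attr.split() if "/" in token]
    return [pair.split("/", 1)[1] for pair in reversed(pairs)]


def pick_fallback(class_attr, understood):
    """Return the most specific type in the chain that the receiver knows."""
    for name in type_chain(class_attr):
        if name in understood:
            return name
    return None


if __name__ == "__main__":
    # Assumed class value for a javaClass topic, for illustration only.
    declared = ("- topic/topic reference/reference apiRef/apiRef "
                "apiClassifier/apiClassifier javaClass/javaClass ")
    print(pick_fallback(declared, {"topic", "apiClassifier"}))
```

A receiver that knows only apiClassifier and topic would settle on apiClassifier here, which is the fallback described above. Domains and nested topics are still invisible to it.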

Enabling fallback processing by the receiver could be important for avoiding content negotiation.

Finally, because ditaval doesn't have specialization, I would think we need a separate mime type for it: application/ditaval+xml

What do you think?

Erik Hennum
ehennum@us.ibm.com

"Ogden, Jeff" <jogden@ptc.com> wrote on 03/28/2008 08:14:24 AM:

> Included below is an e-mail message from July 2006 (I guess the
> discussion I mentioned was a good bit more than the year ago that I
> remembered) that summarizes my thinking about DITA mime types at
> that time. This e-mail was internal to PTC/Arbortext. And as I
> mentioned before we never acted on this.
>
> -Jeff
>
> From: Ogden, Jeff
> Sent: Tuesday, July 18, 2006 11:47 AM
> Subject: RE: Browse item MIME type meeting minutes
>
> I spent some time reading RFC 2046, Multipurpose Internet Mail
> Extensions (MIME) – Part 2, and RFC 3023, XML Media Types, last
> night. I also looked at the MIME content types that are registered
> with IANA and didn’t see anything related to DITA.
>
> I’ve pretty much convinced myself that we don’t want to use the MIME
> Content types that are defined in RFC 3023 (text/xml or
> application/xml). RFC 3023 almost says as much and makes a
> suggestion for registering new xml related content types using the
> suffix “+xml”. And that seems like it might be the thing to do for
> DITA documents. It might look something like this:
>
> Content-type: application/dita+xml; charset={charset-value};
> format={dita | ditamap | ditabase}; type={topic-type-value};
> navtitle="navtitle text"
>
> All of the keyword parameters are optional.
>
> charset would be the same as defined in RFC 3023 and basically
> either matches or overrides the charset information in the XML header.
>
> format, type, and navtitle are similar to the DITA attributes of the
> same name. In general the values allowed for use with the MIME
> keywords are more restrictive than the DITA attributes and only the
> values shown are allowed.
>
> format=dita is assumed if format isn’t specified. ditabase isn’t one
> of the standard items for the DITA format attribute, but the DITA
> format attribute does allow other unspecified values.
>
> type only has meaning when format=dita. type values are the usual
> topic types (topic, concept, reference, task, glossentry or a
> specialization) accepted by the DITA type attribute. type=topic is
> assumed if type isn’t specified.
>
> navtitle is the navtitle or other title or other text from the DITA
> document that could serve as identifying text as determined by the
> application that saved or stored the DITA content.
>
> How does this look? Does it meet our needs for a content type for
> use with DITA documents stored in a CMS?
>
> Is topic-type-value good enough or does this need to somehow provide
> the Public ID value? For DITA purposes, I think the topic-type
> based on the root element name is fine, but I’m interested to know
> what Paul thinks.
>
> Is a format of “dita-fragment” and/or “ditamap-fragment” needed?
>
> -Jeff
>
>
>
> From: Ogden, Jeff
> Sent: Friday, March 28, 2008 10:27 AM
> To: Grosso, Paul; 'dita@lists.oasis-open.org'
> Subject: RE: [dita] mime type for DITA?
>
> Is there a specific use case that caused the issue of DITA mime
> type(s) to be raised again now?
>
> At PTC/Arbortext we talked about having an official mime type for
> DITA objects about a year ago, but we’ve been able to get by without
> it quite nicely so far.
>
> If we do try to get some sort of official mime type, I’d like to
> find a way that we could tell more than just that we have a DITA
> document. Ideally I’d like to know in order:
> 1. If we have a DITA document
> 2. If we have a DITA map, DITA topic, DITA ditabase, or ditaval document
> 3. What type of DITA document we have (topic, concept,
> reference, task, glossentry, map, bookmap, learningMap, …)
>
> I’d need to go back and refresh myself on the details of mime type
> syntax, but I vaguely remember that there were ways to provide more
> detailed information without creating completely new mime types.
>
> I agree with Paul that having an official mime type may not provide
> much additional benefit and so I’m not pushing for this myself, but
> if we do go forward I’d like to be able to get additional
> information beyond just DITA or not DITA.
>
> And even if we don’t go forward with an official mime type, I’d like
> to somehow encourage CMS implementers to include this level of
> detail in the CMS metadata associated with a DITA object so that
> someone can use or get this information as they search or browse
> without having to open each DITA object each time.
>
> -Jeff
>
>
> From: Grosso, Paul [mailto:pgrosso@ptc.com]
> Sent: Friday, March 28, 2008 9:59 AM
> To: dita@lists.oasis-open.org
> Subject: RE: [dita] mime type for DITA?
>
> While a mime type can define a fragment identifier syntax, there is
> always the question of what tools will recognize and implement that
> fragment identifier syntax. Presumably, in the DITA case, it will
> just be DITA tools which already recognize the syntax. So defining
> a mime type specific fragment identifier syntax does allow us to say
> our href values are true and official URIs, but it doesn't change
> too much in practice. (I don't see the argument as either strongly
> for or against whether we should define a dita mime type.)
>
> Using application/dita+xml to allow tools to recognize dita content
> sounds like a benefit. Again, though, you have to ask what tools
> will actually access the mime type and recognize--and do something
> special--with the dita mime type. The answer again is just dita
> tools which already recognize dita content. So the only benefit
> might be making it a bit easier for such tools to know they have
> dita without looking inside the content.
>
> But note that one can get a mime type only from mime headers, and
> one has mime headers for a file in only rare cases in practice (and
> half the time when you do have them, they are wrong or incomplete).
> The rest of the time, tools guess mime type by looking at the file
> extension or inspecting the content, and this can be, and already is, done.
>
> So defining a mime type probably has only a minor benefit in practice.
>
> I would counsel against trying to define multiple mime types. Given
> the small benefit of mime types in general, if we try to get too
> complicated here, we'll pretty much guarantee that there will never
> be two fully interoperable implementations, and I don't see the
> benefit of multiple mime types.
>
> paul
>
>
> From: Erik Hennum [mailto:ehennum@us.ibm.com]
> Sent: Thursday, 2008 March 27 18:47
> To: dita@lists.oasis-open.org
> Subject: [dita] mime type for DITA?
>
> Hi, Technical Committee:
>
> Returning to an old question, should the committee take a position
> with respect to a mime type for DITA?
> http://lists.oasis-open.org/archives/dita/200408/msg00055.html
>
> A DITA mime type would let tools declare and recognize DITA content
> in HTTP headers, email, and so on without actually inspecting the
> content. As I understand Paul's note, a mime type would also provide
> a basis for defining the DITA reference syntax within URI standards:
> http://lists.oasis-open.org/archives/dita/200705/msg00040.html
>
> One might expect DITA to have a mime type similar to
> application/dita+xml following ordinary practice for XML vocabularies:
> http://en.wikipedia.org/wiki/XML_and_MIME
>
> DITA is an architecture, however, not a vocabulary. Section A.14 in
> the relevant RFC suggests that an extensible architecture should
> prepend qualification levels:
> ftp://ftp.isi.edu/in-notes/rfc3023.txt
>
> Applied to DITA, that would seem to call for a mime type that
> separates the declaration of the vocabulary (as defined by the shell
> for the document type) from the DITA architecture and from the XML architecture.
>
> An application that recognizes a document type can process any
> document that generalizes to a valid instance of the recognized
> document type. Because of shell pluggability, however, the mime type
> alone can't reasonably provide a basis for determining the
> compatibility of a document type accepted by an application with the
> document type of the supplied document. (The mime type for the
> document type would have to encode the modules and ancestor modules
> included by the shell, effectively cramming the value of the domains
> attribute into an identifier.)
>
> A reasonable compromise might be for the mime type to identify only
> the base vocabulary. (That compromise also acknowledges the
> impracticality of registering all DITA shells as mime types.)
> Applications would have to inspect the domains attribute in the
> content for more specific evaluation of acceptability. This approach
> avoids creating a legacy that would have to be accommodated if future
> work solves the document type compatibility problem some other way.
>
> In summary, this approach would introduce two fundamental mime types
> for topics and maps:
> application/topic+dita+xml
> application/map+dita+xml
>
> Because the DITA values file isn't specializable (doesn't provide
> the architecture attributes), its mime type should identify the
> XML vocabulary but not the DITA architecture:
> application/ditaval+xml
>
>
> Hoping that's useful,
>
>
> Erik Hennum
> ehennum@us.ibm.com
