RE: [dita] mime type for DITA

Erik asked:

> Jeff, do you have an RFC reference for the format and type mimetype keywords?

> I could swear I've seen them but can't find it for the life of me.

I don’t think there was anything as specific as a definition for format and type. There are these general statements on Content-Type parameters from RFC 2046:

After the type and subtype names, the remainder of the header field is simply a set of parameters, specified in an attribute/value notation.  The ordering of parameters is not significant.

Parameters are modifiers of the media subtype, and as such do not fundamentally affect the nature of the content.  The set of meaningful parameters depends on the media type and subtype.  Most parameters are associated with a single specific subtype.  However, a given top-level media type may define parameters which are applicable to any subtype of that type.  Parameters may be required by their defining media type or subtype or they may be optional.  MIME implementations must also ignore any parameters whose names they do not recognize.

I think this leaves the definitions for format and type up to us (the DITA TC). I suggest that they be defined to be the same as, or a subset of, the values that are allowed for the attributes of the same name on the topicref element.

Back in 2006 I said that all of the parameters should be optional, but I am now wondering if format and charset should be required and type should be optional.

Do we want an optional navtitle parameter?

Does application/dita+xml have to mean documents that use DITA specialization or could it mean documents that conform to the DITA specification? The latter would allow us to use application/dita+xml with format=ditaval and eliminate the need for a separate MIME type registration for ditaval.

-Jeff

From: Erik Hennum [mailto:ehennum@us.ibm.com]
Sent: Friday, April 18, 2008 8:57 PM
To: dita@lists.oasis-open.org
Subject: [dita] mime type for DITA

Hi, DITA-Minded Technical Committee:

I thought it might be useful to refresh the thread about mime types with a summary of the proposal prior to the meeting.

This proposal is really a light modification on Jeff's analysis. (Jeff, do you have an RFC reference for the format and type mimetype keywords? I could swear I've seen them but can't find it for the life of me.)

Please note the DISCUSSION item below about the type.

Thanks,

Erik Hennum
ehennum@us.ibm.com

======================
= A mime type for DITA
======================

MOTIVATION:

Client tools need to download documents from a CMS (potentially via WebDAV) and route requests to open those documents to a desktop editor that understands DITA specialization.

Such client tools need to distinguish DITA topics and maps (with base or specialized vocabularies) from other XML documents (XMI, SVG, what have you) provided by the CMS. The CMS typically stores the document without an extension, so the tool can't rely on file-system mechanisms for recognizing DITA vocabularies. If the tool simply passes the document through from the CMS to the client operating system, the web browser might open the XML topic instead of the DITA-aware editor. Without a MIME type, the tool has to open and parse the initial part of the file to recognize DITA topics and maps, which is inefficient.

PROPOSAL:

The mime type record (such as an HTTP header) has the following fields:

* The application/dita+xml mime type identifies any document that uses DITA specialization.
* The format property identifies the base vocabulary for the specialized document (in particular, topic or map).
* The type property identifies the specialized vocabulary.

The character set and encoding fields have no DITA-specific considerations.
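
To make the proposal concrete, here is a minimal sketch (in Python, using only the standard library) of how a client tool might read the proposed fields from a Content-Type value before deciding where to route a document. The function names and the sample parameter values are illustrative assumptions, not part of the proposal.

```python
from email.message import Message

DITA_MEDIA_TYPE = "application/dita+xml"


def parse_dita_content_type(header_value):
    """Split a Content-Type value into the media type and the proposed parameters."""
    msg = Message()
    msg["Content-Type"] = header_value
    return {
        # get_content_type() returns the type/subtype in lower case.
        "media_type": msg.get_content_type(),
        # get_param() returns None when a parameter is absent; parameters
        # the tool does not ask for are simply never looked at.
        "format": msg.get_param("format"),
        "type": msg.get_param("type"),
        "charset": msg.get_param("charset"),
    }


def is_dita(header_value):
    """True if the header declares the proposed DITA media type."""
    return parse_dita_content_type(header_value)["media_type"] == DITA_MEDIA_TYPE


# Hypothetical header value as a CMS might send it.
info = parse_dita_content_type(
    "application/dita+xml; format=map; type=bookmap; charset=UTF-8"
)
print(info["format"], info["type"])
```

With something like this, the tool can hand DITA documents to the DITA-aware editor and leave other XML (SVG, XMI) to the default handler without opening the file.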

NOTE: Topics of ditabase don't have a separate format because they are just topic documents with a list of topics rather than a single topic.

NOTE: Because it doesn't have specialization, the DITA values file requires a separate mime type of application/ditaval+xml. If it ever becomes specialized, the mime type of application/dita+xml and format of ditaval would be applicable.

DISCUSSION: The type property could reflect either:

* The type of the root topic, first topic (for ditabase), or map by using the class attribute (replacing spaces with a legal separator character)
* The name of the shell

The case for using the topic type: Fallback processing becomes possible for the root type prior to opening the document. If the document is declared as javaClass and the receiver understands apiClassifier, the receiver knows that it can process the vocabulary before opening the document. If the receiver cares about domains or subtopics, however, the receiver must open the topic to detect them.
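
As a rough sketch of that fallback check (Python, standard library only): the separator character, the function names, and the sample class value below are all assumptions made for illustration. In particular, "+" is just one candidate separator, and the class value shown is not copied from any real shell.

```python
def class_to_type_param(class_value, separator="+"):
    """Join the tokens of a DITA class attribute with a separator instead of spaces."""
    tokens = class_value.split()
    # Drop the leading "-" or "+" marker that starts a DITA class value.
    if tokens and tokens[0] in ("-", "+"):
        tokens = tokens[1:]
    # Note: the result still contains "/", so in a real header it would
    # have to be sent as a quoted string.
    return separator.join(tokens)


def most_specific_known(type_param, known_types, separator="+"):
    """Return the most specialized element name the receiver understands, or None."""
    # Tokens run from most general to most specialized, so scan from the end.
    for token in reversed(type_param.split(separator)):
        element = token.split("/")[-1]
        if element in known_types:
            return element
    return None


# Hypothetical class value for a javaClass topic.
value = class_to_type_param(
    "- topic/topic reference/reference apiClassifier/apiClassifier javaClass/javaClass "
)
# A receiver that knows apiClassifier but not javaClass can still fall back.
print(most_specific_known(value, {"topic", "apiClassifier"}))
```

The receiver never opens the document here; it only needs the type value from the header and its own list of supported topic types.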

The case for declaring the shell: If the receiver knows the shell, the receiver understands everything about the document without opening it, including the root type, nested topics (or, for ditabase, topics after the first), and domains in either topic or map. If the receiver doesn't know the shell identifier, however, the receiver would have to open the document to determine fallback processing.

REFERENCE:

http://en.wikipedia.org/wiki/MIME
http://tools.ietf.org/html/rfc2046
http://tools.ietf.org/html/rfc3023
