Re: [cmis] Text encoding in AtomPub

On 24 Jul 2009, at 18:48, Al Brown wrote:

Please see http://tools.ietf.org/html/rfc4287#page-14.

I read type="text" to map to text/plain. Other text/* media types can be specified. See clause #5.

5. If the value of "type" begins with "text/" (case insensitive), the content of atom:content MUST NOT contain child elements.

-Al
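
As a rough illustration of the clause quoted above, here is a minimal Python sketch that puts text inline in atom:content with a text/* type and no child elements. The function name build_text_entry is hypothetical, the entry is deliberately incomplete (no id, updated, author, or CMIS-specific elements), and it assumes the text has already been decoded to a Unicode string.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_text_entry(title: str, text: str, media_type: str = "text/plain") -> bytes:
    """Build a bare atom:entry whose atom:content carries inline text.

    Hypothetical helper for illustration only; not a complete Atom or CMIS entry.
    """
    # Clause 5 applies to types that begin with "text/" (case insensitive).
    if not media_type.lower().startswith("text/"):
        raise ValueError("this sketch only handles text/* media types")

    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title

    # The content is plain character data: no child elements, no base64.
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", {"type": media_type})
    content.text = text

    # The text is re-encoded together with the surrounding XML (UTF-8 here),
    # and markup characters such as < and & are escaped by the serializer.
    return ET.tostring(entry, encoding="utf-8")
```

Because the serializer writes the whole entry in one encoding, the original file's bytes are only preserved if the client decoded them correctly in the first place, which is the problem raised in the thread.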

Florent wrote:

Hm you're right, AtomPub doesn't allow sending text/* in base64. I hadn't realized its model was so restrictive.

But anyway if you pick a file from the filesystem, how do you know that it's text, and that its media type is text/something? If you have out-of-band information that gives a media type of text/something, but without an encoding, then something is missing for it to be interpreted correctly anyway by a processor that cares about media type.

But I agree that legacy systems may have such partial information. I'd say in that case just guess the encoding if you can, or otherwise pretend it's ISO-8859-1 which will allow all bytes to be transmitted and stored back in the same way.

Florent

On Jul 24, 2009, at 1:01 PM, Florian Müller wrote:

> If I treat it as a binary then I have to set a content type
> different from text/* (see AtomPub spec 4.1.3.3). The only possible
> content type would be application/octet-stream. That would be a loss
> of information.
> Through the SOAP interface I can upload the content AND set a text/*
> content type.
>
> Florian
>
>
> -----Original Message-----
> From: Florent Guillaume [mailto:fg@nuxeo.com]
> Sent: Friday, July 24, 2009 12:18 PM
> To: Florian Müller
> Cc: cmis@lists.oasis-open.org
> Subject: Re: [cmis] Text encoding in AtomPub
>
> Basically if you don't know the encoding, then it's not text, it's a
> binary.
>
> Florent
>
> On Jul 24, 2009, at 10:52 AM, fmueller@opentext.com wrote:
>
>> Hi,
>>
>> Here is a question to all the CMIS AtomPub client developers out
>> there:
>>
>> I got stuck with document creation. In order to comply with AtomPub,
>> text files must not be base64 encoded but enclosed in the <content>
>> tag as plain text. This text must use the same encoding as the Atom
>> XML around it.
>> If I pick up a text file from the file system I don't know the
>> encoding of this file and therefore I can't re-encode the text
>> correctly. If the original encoding of the text file and the
>> encoding of the Atom XML don't match by coincidence the document
>> content in the repository will be different from the original
>> content. That is, I can't rely on this method. Since creating
>> documents in two steps (entry first, then adding content) is not
>> supported by all repositories this is not viable option either.
>>
>> Do I make an error in reasoning or do we have a real issue here?
>>
>>
>> Thanks,
>>
>> Florian
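
Florent's fallback (guess the encoding if you can, otherwise pretend it's ISO-8859-1) could be sketched in Python roughly as follows. The function name decode_best_effort and the default guess list are assumptions made for illustration; nothing here comes from the CMIS or AtomPub specifications.

```python
from typing import Sequence, Tuple


def decode_best_effort(raw: bytes, guesses: Sequence[str] = ("utf-8",)) -> Tuple[str, str]:
    """Decode file bytes to text and report which encoding was used.

    Hypothetical helper: tries each guessed encoding in order, then falls
    back to ISO-8859-1, which maps every byte value to exactly one character.
    """
    for encoding in guesses:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this guess does not fit the bytes, try the next one
    # ISO-8859-1 never fails to decode, so this is always reachable.
    return raw.decode("iso-8859-1"), "iso-8859-1"


def restore_bytes(text: str, encoding: str) -> bytes:
    """Re-encode the text with the encoding reported by decode_best_effort."""
    return text.encode(encoding)
```

A client that remembers the reported encoding can get the original bytes back with restore_bytes. This only covers the mapping between bytes and characters: XML 1.0 forbids most C0 control characters and normalizes line endings, so arbitrary binary data would still not survive being placed inline in the entry.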
