OASIS Mailing List Archives

legaldocml-comment message



Subject: Re: [legaldocml-comment] [COMMENT] Discrepancy in akn-media-v1.0-csprd01.pdf Document


Dear Lewis, 


On 22 Jun 2015, at 05:17, Lewis John Mcgibbney <lewis.mcgibbney@gmail.com> wrote:

> Hi Folks,
> Based on the recent announcement of the 30-day Public Review for LegalDocML's Akoma Ntoso V1.0 (ends June 5th) [0] (sorry for missing the official review date; I hope the comment is still valid), I *think* I've discovered a discrepancy in subsection '2.9 Additional information' of the Akoma Ntoso Media Type Version 1.0, Committee Specification Draft 01 / Public Review Draft 01, 25 March 2015.
> Specifically, the content includes the following:
> 
> 2.9
> Additional information
> 1. Deprecated alias names for this type : None
> 2. Magic number(s) : There is no single initial octet sequence
> that is always
> 3. File extension(s) : Akoma Ntoso documents are most often identified with the extension .akn or .xml.
> 4. Macintosh file type code : TEXT
> 5. Object Identifiers: None
> 
> Please note bullet 2 above "Magic number(s) : There is no single initial octet sequence that is always"

Thank you for pointing this out to us. This is clearly a copy/paste gone wrong. The sentence should be: 

" There is no single initial octet sequence that is always present in Akoma Ntoso documents. "

> This seems incomplete.
> I am implementing an AkomaNtosoParser for the Apache Tika project and have therefore been implementing the IANA MimeType and attempting to recognize magic bytes which may be indicative of AKN documents for MimeType detection.

I would think that trying to deduce an XML media type from magic bytes is, in general, unreliable and overly complex. RFC 2376 (XML Media Types) [1] has this to say about magic numbers: 

> Magic number(s): none
> 
>       Although no byte sequences can be counted on to always be present,
>       XML entities in ASCII-compatible charsets (including UTF-8) often
>       begin with hexadecimal 3C 3F 78 6D 6C ("<?xml"), and those in
>       UTF-16 often begin with hexadecimal FE FF 00 3C 00 3F 00 78 00 6D
>       or FF FE 3C 00 3F 00 78 00 6D 00 (the Byte Order Mark (BOM)
>       followed by "<?xml").  For more information, see Annex F of [REC-
>       XML].
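
To make the limits of this approach concrete, here is a minimal sketch in Python of a check based only on the byte prefixes quoted above. The names XML_PREFIXES and looks_like_xml are hypothetical, chosen for illustration; this is a heuristic under the stated assumptions, not a recommended detector.

```python
# Byte prefixes that RFC 2376 lists as common (but NOT guaranteed) starts
# of XML entities. The names below are illustrative, not from any library.
XML_PREFIXES = (
    bytes.fromhex("3C3F786D6C"),            # "<?xml" in ASCII-compatible charsets
    bytes.fromhex("FEFF003C003F0078006D"),  # UTF-16 big-endian BOM + "<?xm"
    bytes.fromhex("FFFE3C003F0078006D00"),  # UTF-16 little-endian BOM + "<?xm"
)


def looks_like_xml(head: bytes) -> bool:
    """Heuristic only: True if `head` starts with one of the prefixes above."""
    return head.startswith(XML_PREFIXES)
```

Note that a well-formed document with no XML declaration, for example one that begins directly with its root element, is not matched by any of these prefixes, and neither is one that starts with a UTF-8 byte order mark. This is one reason the check cannot be relied upon.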


There are simply too many different ways for an XML document to exist, let alone an XML document that is an Akoma Ntoso document. An Akoma Ntoso document is, by definition, an XML document (recognized as such according to the specification of RFC 2376) which defines and correctly uses the Akoma Ntoso namespace. This cannot be established at the byte level, but only by an XML parser. 
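
As a rough illustration of what a parser-based check could look like, here is a minimal sketch in Python that reads only as far as the root element and inspects its namespace. The function names and the AKN_NS_PREFIX constant are hypothetical; in particular, the namespace prefix is an assumption and should be verified against the namespace URI given in the Akoma Ntoso specification you are targeting.

```python
import io
import xml.etree.ElementTree as ET

# ASSUMPTION: Akoma Ntoso namespace URIs start with this prefix.
# Verify against the specification version you are targeting.
AKN_NS_PREFIX = "http://docs.oasis-open.org/legaldocml/ns/akn/"


def is_akoma_ntoso(stream) -> bool:
    """True if the root element of the XML in `stream` is in the assumed namespace."""
    try:
        # The first "start" event is always the root element, so the parser
        # reads only as much of the stream as needed to reach it.
        for _event, elem in ET.iterparse(stream, events=("start",)):
            # ElementTree expands qualified names to "{namespace-uri}localname".
            return elem.tag.startswith("{" + AKN_NS_PREFIX)
    except ET.ParseError:
        # Not well-formed XML, so not an Akoma Ntoso document either.
        return False
    return False


def is_akoma_ntoso_bytes(data: bytes) -> bool:
    """Convenience wrapper for in-memory content."""
    return is_akoma_ntoso(io.BytesIO(data))
```

Because the decision is made on the expanded name of the root element, it does not matter whether the document has an XML declaration, which encoding it uses, or which prefix (if any) is bound to the namespace.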

Thank you and ciao

Fabio Vitali


> 
> Thanks
> Lewis
> 
> [0] https://groups.google.com/forum/#!topic/akomantoso-xml/57W6daZNt_4
> [1] http://docs.oasis-open.org/legaldocml/akn-media/v1.0/csprd01/akn-media-v1.0-csprd01.pdf#page=7&zoom=auto,69,533
> 
> 
> -- 
> Lewis 


--

Fabio Vitali                                          The sage and the fool
Dept. of Informatics                                     go to their graves
Univ. of Bologna  ITALY                               alike in this respect:
phone:  +39 051 2094872                  both believe the sage to be a fool.
e-mail: fabio@cs.unibo.it                  Where, then, may wisdom be found?
http://vitali.web.cs.unibo.it/   Qi, "Neither Yes nor No", The codeless code


