Subject: Comments on ODF 1.2 CD Part 3: Packages
Dear ODF TC,

please find below my comments on the public draft of Open Document Format for Office Applications (OpenDocument) Version 1.2, Committee Draft 01. Apologies for being rather late with them.

Thank you for your hard work on this important specification; I hope you will find my comments of use. If you require any additional information, I will be glad to provide it.

Kind regards,
Michiel Leenaars
NLnet Foundation
http://nlnet.nl

------------
Section 1.5:

The note contains a superfluous additional "the" in the first sentence. Suggested: removal.

------------
Section 2.1: General

The first paragraph is unclear. Some information should be added, while other information - like the first part of the first sentence - is overly explanatory, not entirely correct and makes the spec less future proof. It also points to the wrong appendix. Suggested alternative:

    OpenDocument uses a package file to store the XML content of a document as separate parts together with associated data as file entries in a nested directory structure in a single package file. Some of the file entries should be compressed to further reduce the storage and bandwidth footprint of the package. The package is a Zip file [ZIP], whose structure is described in Appendix C. ODF Packages impose additional structure on the Zip file to accomplish the representation of OpenDocument Format documents.

The first line of the second paragraph: switch the word 'creating' for 'comprising'. The same line has a nondescript link to 'OpenDocument part 1', rather than a specific bibliographic reference.

Fourth line of the second paragraph: should be "but only a single document SHALL be contained".

Question: Which character sets are supported in file names?
Question: Are path names and file names case-sensitive?
Question: What is the maximum allowed depth of the nested directory structure?
Question: Is there any advice on files that are themselves Zip files?
Question: Are multipart Zip files allowed?
Question: There seems to be no clear reference to the serialized version of an ODF file, e.g. the mapping in a single file.

------------
It would be more logical to switch 2.3 and 2.2, given the order in which the information needs to be implemented. Currently it is only implicitly clear that the file mentioned in 2.3 needs to be the first file in the Zip container.

------------
Section 2.3: MIME Media Type

The first sentence of the first paragraph does not specify exactly HOW the content of the file 'is' the MIME media type. Presumably this is a text file in a specific encoding, but this is not clear.

Question: What character encodings are allowed within the 'mimetype' file itself?
Question: Why is there no normative reference to the appropriate RFC or IANA registration concerning the MIME type?
Question: Is it allowed or forbidden to add additional information in this file, including leading and trailing spaces or other characters?
Question: Why is it not explicitly stated that applications SHALL not compress the beginning of the file, and SHALL have no extra data in the header?
Question: Why is it not explicitly stated that applications SHALL not encrypt the mimetype file?

------------
Section 2.4: Encryption

The first sentence is incomplete: OpenDocument packages may be encrypted by encrypting [insert] some or all of [/insert] the files within the package [insert], with the noted exception of the mimetype file mentioned in section 2.3 [/insert].

The summary of the encryption process is very specific to a small set of situations, and does not seem to cover all possible scenarios, such as use with hardware crypto.

The recommendation that "Each file entry that is encrypted shall be compressed" does not make sense in all cases (like a spread of small RDF snippets), may have security implications and can hardly be justified from a technical point of view. This is a "may" at best.

Question: Can/should a file containing digital signatures (2.5) be encrypted?
Is there an order to signing and encrypting? There are scenarios where you would want to sign before you encrypt, so as not to make the exact signature visible, and there are scenarios where you want to sign after you encrypt, to state the authenticity. Probably there are onion scenarios that require both.

------------
Section 2.4.2: Default Encryption Algorithm

Under 1. replace "password" with "secret" to include hardware crypto solutions.

------------
Section 2.5: Digital signatures

In some cases (governments facing an open information act or similar) some information in documents needs to be made illegible (censored), yet it should be verifiable that the documents are the original ones and have not been tampered with. An independent authority with adequate rights should be able to decrypt the hidden parts and verify that the original document was used and that all the information that was hidden was legitimately hidden. This 'black marker scenario' is not unlike the reverse of track changes, in that it needs an audit trail. It may require that parts of a document be encrypted and signed, rather than just files. Can this be included?

It is unclear to me why digital signatures are not using the metadata framework proposed in 2.6, rather than some file-based and semantically empty solution.

------------
Section 2.7: Usage of IRIs Within Packages

The first item in the list says: "The file entry path is the file name of the file within the Zip file which contains the relative IRI, including its relative path." Later in the section it is stated: "The relative path of this file within the ZIP file is determined by the following procedure:"

Remark: Use of Zip and ZIP is not consistent.

Question: What about the serialized version of ODF (mapping in a single file)? If there still is one, there are probably relative IRIs in it.

The sentence "A fragment identifier, if it does exist, is removed." should be more specific.

The last sentence of the note ('In particular the base [..]') is very unclear.
Question: There is no statement or normative reference about any constraints on the use of character sets.

Question: The upcoming practice of hybrid PDF/ODF files (PDF with the original ODF as an attachment) currently doubles the file weight of most files, because all images are included twice: in the PDF container and in the ODF file. Similar future scenarios for, say, web editors and ODF are not unlikely. Since the application that handles them could very well be the same at both ends, a package could profit from having an explicit option to reference objects above itself.

Question: Perhaps a statement could be made about the use of absolute IRIs?

------------
Section 2.8: Preview Image

The first sentence states: "Package producers should generate a preview image of the document that is contained in the package. It should be a representation of the first page, first sheet, etc. of the document."

This would be more generic like this: "Package producers should generate a preview image for the document that is contained in the package. This will typically be a representation of the first page, first sheet, etc. of the document."

Remark: There should be a normative reference for the PNG format, probably to ISO/IEC 15948:2004. Or is any other specific version required?

Question: Why is there a folder for thumbnails if there can be only one thumbnail? Would it make sense to allow multiple sizes, so that the whole spectrum of devices (from small-screen devices to high-end systems) gets a pleasing rendering?

Question: Are there no constraints or recommendations with regard to the PNG itself, e.g. allowing/disallowing alpha layering or particular color spaces?

Question: What about accessibility? Is there information we can add for blind people who use a file browser to navigate through a large number of ODF files? Where can we store this information, and are there recommendations to make to those building file browsers (including office applications, which may include one)?
Question: Should we consider allowing an SVG preview image, if necessary with a PNG embedded?

Question: Is there information we should store in the metadata of the PNG?

Question: Some platforms may require other preview formats. E.g. NeoOffice currently seems to add a small PDF of the first page for the preview on Mac OS X. In fact, one could say that a low-resolution PDF rendering of an ODF file would be an interesting addition to the preview concept in ODF in some cases: it is a frozen, final-format version. It might be interesting to recommend that applications be allowed to include full PDFs in ODF as a preview. That way hybrid PDFs are available both ways: PDF contained within ODF ('as last seen by the author') and ODF contained within PDF ('meant as a final publication, but you can edit it if you want'). Of course these two mechanisms should not loop.

------------
Section 3: Manifest file

Comment: It is not clear whether multiple checksums are allowed. Seen from the outside, this might be a candidate to be labelled deprecated in favour of a fully RDF-based metadata structure?

------------
Section 3.3: <manifest:file-entry>

It is unclear what a document and a sub document are. The chosen solution is weak, because it only defines a number of cases.

Question: If meaningful data from a specific application is stored within a package yet is not a (sub) document, why can it not be added to the manifest:file-entry by the application, with the requirement that it SHALL be kept intact by other applications - unless the user explicitly asks for removal, of course?

Question: Why are 'junk files' not added to the metadata and tagged as such, so that applications know the file contains something they may throw away?

------------
Section 3.4: <manifest:encryption-data>

Question: Are multiple checksums allowed and even encouraged? Or can you sign the checksum?
------------
Section 3.5: <manifest:start-key-generation>

Question: Why not deprecate the text label in favor of the IRI?

Question: What should an application do if it does not recognise the algorithm used? Should it show this to the user, and render the parts of the document it does recognise?

------------
Section 3.6: <manifest:start-key-generation>

Remark: replace 'password' with 'secret'.

------------
Section 3.8.1: manifest:algorithm-name

Suggestion: Split this up into an IRI and a 'friendly name' for feedback to the end user.

------------
Section 3.8.3: manifest:checksum-type

Question: Why are the size of the sample (1k), the sample position (beginning) and the algorithm put into one attribute? This unnecessarily limits the possibilities to work with very reliable checksums.

There is a forward reference to "extended conforming documents" that needs to be made explicit.

------------
Section 3.8.4: manifest:full-path

The text says: The notation is the same as for the “filename” fields of the Zip file's central directory.

Question: Why not a proper, absolute reference?

Question: There are likely quite a few scenarios (from dynamically constructed documents in in-house apps to documents that 'live' in the internet cloud) where you would want an application to reference an online object as if it were part of a package. Is there a good reason not to allow manifest:full-path to contain URIs?

------------
Section 3.8.6: manifest:start-key-generation-name

Remark: replace 'password' with 'secret'.

Remark: All applications should support encryption with SHA1, so remove both occurrences of the words "that support encryption" from the last regular paragraph.

Remark: What does it mean to 'support a value'? Should applications be able to correctly perform the calculations?

Remark: There is a forward reference to "extended conforming documents" that needs to be made explicit.

Remark: Why is the type of crypto linked to conformance?
SHA1 support is mandatory, and it may not be portable. But it has little to do with conformance, more with SHA1 being a default.

------------
Section 3.8.9: manifest:key-derivation-name

Remark: The text is not strong in its conformance requirement. Why not "shall contain" instead of "should contain"?

------------
Section 3.8.9: manifest:media-type

Remark: The sentence "All files that have XML content should have the media type “text/xml”." seems overly enthusiastic and would dumb down the application's understanding of the content. What about XML types that have their own MIME type, like SVG or RDF?

Remark: It is unclear what the MIME type of a directory is. If a directory has multiple files in it, with multiple MIME types, how should this be handled?

------------
Section 3.8.11: manifest:preferred-view-mode

The text is tailored more towards non-multimedia presentations than to modern multimedia use.

Question: What does it mean under "presentation-slide-show": "The author's preference is to open the document as presentation slide show."? Should this not say "The author's preference is to open the document in full screen mode, without the editing interface"?

Remark: Under "read-only": this is a preview; if there were a full-blown preview PDF (as I suggested under Section 2.8), might this well suit the needs of the average user who sets this attribute?

Remark: Add 'sound level', with a percentage of audio level, as an option for situations where the included multimedia has audio that should not be played.

Remark: It should be possible to define whether the application may ask to update dynamic objects (like information from a database). If these objects are in a presentation and the default application behaviour is to ask whether they need to be updated, that might be annoying.

Remark: The sentence "The behavior for cases where the manifest:preferred-view-mode attribute is absent is implementation defined." allows for some unpredictable behaviour.
Why not require 'edit' to be the default, and ask applications not to set this value unless the user explicitly demands it?

------------
Section 3.8.12: manifest:salt

Remark: Are there length constraints on the salt sequence, in order to avoid stack overflow attacks?

------------
Section 3.8.14: manifest:version

Remark: "The specified version refers to the format specified in the media-type attribute of the manifest entry at which it occurs." means little to me.

------------
Section 4.3: <xmldsig:Signature>

Remark: It is unclear why the exception in "except that the base URI for resolving relative IRIs shall be the package base IRI." is made.

Remark: It is unspecified how an application indicates that it used extensions to the [xmldsig-core] specification.

------------
Section 5: Metadata Manifest Files

Remark: In the sentence "Metadata manifest files for sub documents shall be stored in the sub document's directories." there is room for misinterpretation, as the directories can themselves contain directories. This should be: "Metadata manifest files for sub documents shall be stored at the top level of the sub document's directories."

------------
Section 6: Datatypes

Question: Why is this chapter not at the start of the spec?

Remark: There are some textual errors to be fixed, like "have have additional constrains".

------------
Section 7.2.1: Conforming OpenDocument Packages

Remark: Insert the word "following" before "requirements:" in the several places where it occurs.

Remark PD 1.2.4: The constraint that mimetype and meta-inf files cannot be in manifest:file-entry is not worthy of a conformance requirement - an app can just throw it away.

Remark PD 1.3: Why is this not before PD 1.2, given that it is about the first file of the Zip file?

Remark PD 1.3.2: Why not say something about the content/character encoding of the mimetype file?
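
To make the mimetype points from Section 2.3 and PD 1.3 concrete, here is a minimal Python sketch of the checks I would expect the spec to make possible. It is an illustration only, not text proposed for the spec. The function names check_mimetype_entry and build_demo_package are hypothetical, the media type for text documents is used as an assumed expected value, and the byte offsets are those of the Zip local file header.

```python
import io
import struct
import zipfile

# Assumed expected value: the media type registered for ODF text documents.
MEDIA_TYPE = b"application/vnd.oasis.opendocument.text"


def check_mimetype_entry(package, expected=MEDIA_TYPE):
    """Return a list of problems with the first Zip entry (empty list if none)."""
    if len(package) < 30 or package[:4] != b"PK\x03\x04":
        return ["package does not start with a Zip local file header"]
    # Local file header: general purpose flags at offset 6, method at offset 8,
    # file name length at offset 26, extra field length at offset 28.
    flags, method = struct.unpack_from("<HH", package, 6)
    name_len, extra_len = struct.unpack_from("<HH", package, 26)
    if package[30:30 + name_len] != b"mimetype":
        return ["first entry is not 'mimetype'"]
    problems = []
    if method != 0:
        problems.append("'mimetype' is compressed")
    if flags & 0x1:
        problems.append("'mimetype' is encrypted")
    if extra_len:
        problems.append("'mimetype' has extra data in its local header")
    if method == 0 and not flags & 0x1:
        # Compare the whole content, so leading or trailing characters count.
        with zipfile.ZipFile(io.BytesIO(package)) as zf:
            if zf.read("mimetype") != expected:
                problems.append("'mimetype' content differs from the expected media type")
    return problems


def build_demo_package(compress_mimetype=False, mimetype_first=True,
                       media_type=MEDIA_TYPE):
    """Build a tiny in-memory package for trying out the checks above."""
    method = zipfile.ZIP_DEFLATED if compress_mimetype else zipfile.ZIP_STORED
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        if not mimetype_first:
            zf.writestr("content.xml", b"<doc/>")
        zf.writestr("mimetype", media_type, compress_type=method)
        if mimetype_first:
            zf.writestr("content.xml", b"<doc/>")
    return buf.getvalue()


print(check_mimetype_entry(build_demo_package()))
print(check_mimetype_entry(build_demo_package(compress_mimetype=True)))
```

Whether a trailing newline or a different encoding in the file should count as a problem is exactly what the draft leaves open; the sketch simply treats any deviation from the bare media type string as one.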
------------
Section 7.4: Consumer conformance

Some conformance dreams: From a conformance point of view, and to make ODF future proof for new features, a conforming application shall assume that all content in the package is meaningful. A conforming application shall thus not remove any files from within a package that it doesn't understand or know how to handle - unless this is explicitly requested by the user or the user has manually set a policy to that effect. The user shall be able to instruct the application to keep all content within the package for non-destructive viewing and editing, and a conforming application shall be able to honor that request.

------------
Appendix C: Zip File Structure

The normative reference at the bottom should go on top.
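
P.S. on the Section 7.4 wish that applications keep content they do not understand: a minimal Python sketch of what non-destructive editing could look like at the package level. The function name rewrite_package and the entry names in the usage note are hypothetical, and the sketch deliberately ignores the manifest, encryption and signatures, which a real application would also have to keep consistent.

```python
import io
import zipfile


def rewrite_package(package, replacements):
    """Copy every entry of a package, replacing only those named in `replacements`.

    `package` is the package as bytes; `replacements` maps entry names to new
    content (bytes). Entries the caller does not mention are copied unchanged,
    in their original order and with their original compression method.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            if info.filename in replacements:
                data = replacements[info.filename]
            else:
                data = src.read(info)  # unknown or untouched content is kept
            # Reusing the source ZipInfo keeps name, timestamp and method.
            dst.writestr(info, data)
    return out.getvalue()
```

Calling rewrite_package(package_bytes, {"content.xml": new_content}) would then replace one part while leaving, for example, an application-specific file the editor knows nothing about in place.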