
office-comment message


Subject: Comments on ODF 1.2 CD Part 3: Packages

Dear ODF TC,

please find my comments on the public draft of Open Document Format for
Office Applications (OpenDocument) Version 1.2 Comments on Committee
Draft 01. Apologies for being rather late with them.

Thank you for your hard work on this important specification. I hope you
will find my comments of use. If you require any additional
information, I will be glad to provide it.

Kind regards, 
Michiel Leenaars
NLnet Foundation


Section 1.5: The note contains an unrequired additional "the" in the
first sentence.

Suggested: removal.


Section 2.1: General.

The first paragraph is unclear, some information should be added while
other information - like the first part of the first sentence - is
overly explanatory, not entirely correct and makes the spec less future
proof. It points to the wrong appendix. Suggested alternative:

OpenDocument uses a package file to store the XML content of a document
as separate parts together with associated data as file entries in a
nested directory structure in a single package file. Some of the file
entries should be compressed to further reduce the storage and
bandwidth footprint of the package. The package is a Zip file [ZIP],
whose structure is described in Appendix C. ODF Packages impose
additional structure on the Zip file to accomplish the representation
of OpenDocument Format documents.

The first line of the second paragraph: switch word 'creating' for
'comprising'. The same line has a non-descript link to 'OpenDocument
part 1', rather than a specific bibliographic reference.

Fourth line of the second paragraph: should be "but only a single
document SHALL be contained".

Question: Which character sets are supported in file names?
Question: Are path names and file names case-sensitive?
Question: What is the maximum allowed depth of the nested directory
structure?
Question: Is there any advice on files that are themselves ZIP files?
Question: Are multipart ZIP files allowed?
Question: There seems to be no clear reference to the serialized
version of an ODF file, e.g. the mapping in a single file.


It would be more logical to switch 2.3 and 2.2, given the order in
which the information needs to be implemented. Currently it is only
implicitly clear that the file mentioned in 2.3 needs to be the first
file in the Zip container.


Section 2.3: MIME Media Type

The first sentence of the first paragraph does not specify exactly HOW
the content of the file 'is' the MIME mediatype. Presumably this is a
text file in a specific encoding, but this is not clear.

Question: What character encodings are allowed within the 'mediatype'
file itself?
Question: Why is there no normative reference to the appropriate RFC or
IANA concerning the MIME type?
Question: Is it allowed or forbidden to add additional information in
this file, including leading and trailing spaces or other characters?
Question: Why is it not explicitly stated that applications SHALL not
compress the beginning of the file, and SHALL have no extra data in the
header?
Question: Why is it not explicitly stated that applications SHALL not
encrypt the mimetype file?
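
The checks these questions ask for can be sketched in a few lines. This
is a minimal illustration only: the function name
check_mimetype_entry is hypothetical, the ASCII and no-whitespace rules
are assumptions (they are exactly what the draft leaves open), and the
"extra data" check looks at the central directory record as exposed by
Python's zipfile module.

```python
import io
import zipfile


def check_mimetype_entry(package_bytes: bytes) -> list[str]:
    """Return a list of problems found with the 'mimetype' entry (empty if none)."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        entries = zf.infolist()
        if not entries or entries[0].filename != "mimetype":
            return ["'mimetype' is not the first entry in the package"]
        info = entries[0]
        if info.header_offset != 0:
            problems.append("'mimetype' does not start at offset 0 of the file")
        if info.compress_type != zipfile.ZIP_STORED:
            problems.append("'mimetype' is compressed")
        if info.extra:
            problems.append("'mimetype' has extra data in its header")
        raw = zf.read(info)
        try:
            text = raw.decode("ascii")  # assumption: content is plain ASCII
        except UnicodeDecodeError:
            problems.append("'mimetype' content is not ASCII")
        else:
            # assumption: no leading or trailing whitespace is allowed
            if text != text.strip():
                problems.append("'mimetype' has leading or trailing whitespace")
    return problems
```

An empty list means none of the assumed rules was violated; it says
nothing about conformance to the draft itself.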


Section 2.4: Encryption

The first sentence is incomplete:

OpenDocument packages may be encrypted by encrypting [insert] some or
all of [/insert] the files within the package [insert], with the noted
exception of the mediatype file mentioned in section 2.3 [/insert].

The summary of the encryption process is very specific to a small set
of situations, and does not seem to cover all possible scenarios, like
use with hardware crypto.

The recommendation that "Each file entry that is encrypted shall be
compressed" does not make sense in all cases (like a spread of small
RDF snippets), may have possible security implications and can hardly
be justified from a technical point of view. This is a "may" at best.

Question: can/should a file containing digital signatures (2.5) be
encrypted? Is there an order to signing and encrypting? There are
scenarios where you would want to sign before you encrypt, so as not to
make the exact signature visible, and there are scenarios where you want
to sign after you encrypt, to state the authenticity. Probably there are
onion scenarios that require both.


Section 2.4.2: Default Encryption Algorithm

Under 1. replace "password" with "secret" to include hardware crypto.
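
A small sketch may show why "secret" is the more general term: in a
derivation of this shape the first step only hashes a byte string, so
nothing depends on that string having been typed by a person. The
sketch assumes a SHA-1 start key followed by PBKDF2 with HMAC-SHA-1; the
function name derive_key and the default iteration count and key length
are illustrative assumptions, not values quoted from the draft.

```python
import hashlib


def derive_key(secret: bytes, salt: bytes, iterations: int = 1024,
               key_len: int = 16) -> bytes:
    """Derive a key from an arbitrary secret (password, token output, ...)."""
    # Step 1: the start key is a digest of the secret bytes.
    start_key = hashlib.sha1(secret).digest()
    # Step 2: stretch the start key with the salt and iteration count.
    return hashlib.pbkdf2_hmac("sha1", start_key, salt, iterations, dklen=key_len)
```

The same call works whether the secret bytes come from a typed password
or from a hardware device that releases them.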


Section 2.5: Digital signatures

In some cases (governments facing an open information act or similar)
some information in documents needs to be made illegible (censored),
yet it should be verifiable that the documents are the original ones
and have not been tampered with - and an independent authority with
adequate rights should be able to decrypt this and verify that the
original document was used and that all the information that was hidden
was legitimate. This 'black marker scenario' is not unlike the reverse
of track changes, in that it needs an audit trail. This may require
parts of a document be encrypted and signed, rather than just files.
Can this be included?

It is unclear to me why digital signatures are not using the metadata
framework proposed in 2.6, rather than some file based and semantically
empty solution. 

Section 2.7: Usage of IRIs Within Packages

The first item in the list says: "The file entry path is the file name
of the file within the Zip file which contains the relative IRI,
including its relative path." Later in the section it is stated: "The
relative path of this file within the ZIP file is determined by the
following procedure:"

Remark: Use of Zip and ZIP is not consistent.
Question: What about the serialized version of ODF (mapping in a single
file)? If it still exists, there are probably relative IRIs there.

The sentence "A fragment identifier, if it does exist, is removed."
should be more specific. The last sentence of the note ('In particular
the base [..]') is very unclear.
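
To make the discussion concrete, here is a minimal sketch of how a
relative IRI found inside a file entry might be mapped to another file
entry path. It is one possible reading, not the draft's procedure: the
function name resolve_in_package is hypothetical, the fragment is simply
cut off, percent-escapes are decoded, and a reference that climbs above
the package root is reported as an error (a choice of this sketch).

```python
import posixpath
from urllib.parse import unquote, urldefrag, urlsplit


def resolve_in_package(entry_path: str, relative_iri: str) -> str:
    """Map a relative IRI found in `entry_path` to a package file entry path."""
    # Remove the fragment identifier, if there is one.
    without_fragment, _fragment = urldefrag(relative_iri)
    parts = urlsplit(without_fragment)
    if parts.scheme or parts.netloc or parts.path.startswith("/"):
        raise ValueError("not a relative-path reference")
    # Resolve against the directory of the file that contains the IRI.
    base_dir = posixpath.dirname(entry_path)
    resolved = posixpath.normpath(posixpath.join(base_dir, unquote(parts.path)))
    if resolved == ".." or resolved.startswith("../"):
        raise ValueError("reference points above the package root")
    return resolved
```

For example, "../Pictures/a.png#frag" inside "Object 1/content.xml"
resolves to "Pictures/a.png" under these assumptions.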

Question: There is no statement or normative reference about any
constraints on the use of character sets.
Question: The upcoming practice of hybrid PDF/ODF files (PDF with the
original ODF as an attachment) currently doubles the file weight of most
files, because all images are included twice: in the PDF container and
in the ODF file. Similar future scenarios for, say, web editors and ODF
are not unlikely. Since the application that may handle them could very
well be the same at both ends, a package could profit from having an
explicit option to reference objects above itself.
Question: Perhaps a statement could be made about the use of absolute
IRIs?


Section 2.8: Preview Image

The first sentence states:
"Package producers should generate a preview image of the document that
is contained in the package. It should be a representation of the first
page, first sheet, etc. of the document."

This would be more generic like this:
"Package producers should generate a preview image for the document
that is contained in the package. This will typically be a
representation of the first page, first sheet, etc. of the document."

Remark: There should be a normative reference for the PNG format,
probably to ISO/IEC 15948:2004. Or is there any other specific version
required?
Question: Why is there a folder thumbnails if there can be only one
thumbnail? Would it make sense to allow multiple sizes, in order for the
whole spectrum of devices (from small screen devices to high end
systems) to have a pleasing rendering?
Question: Are there no constraints or recommendations with regard to the
PNG itself, e.g. allow/disallow alpha layering/color spaces?
Question: What about accessibility - is there information we can add for
blind people that use a file browser to navigate through a large amount
of ODF files? Where can we store this information, and are there
recommendations to make to those making file browsers (including office
applications, who may include this)?
Question: Should we consider allowing an SVG preview image, if necessary
with a PNG embedded?
Question: Is there information we should store in the metadata of the
PNG?
Question: Some platforms may require other preview formats. E.g.
NeoOffice currently seems to add a small PDF of the first page for the
preview on Mac OS X. In fact, one could say that a low resolution
version of a PDF rendering of an ODF file would be an interesting
addition to the preview concept in ODF in some cases - it is a frozen,
final format version. It might be interesting to recommend that
applications be allowed to include full PDFs in ODF as preview. That way
hybrid PDFs are available both ways: PDF contained within ODF ('as last
seen by the author') and ODF contained within PDF ('meant as a final
publication, but you can edit it if you want'). Of course these two
mechanisms should not loop.


Section 3: Manifest file

Comment: It is not clear if multiple checksums are allowed. It seems
from the outside that this might be a candidate to be labelled
deprecated in favour of a fully RDF based metadata structure?


Section 3.3: <manifest:file-entry>

It is unclear what a document and a sub document are. The chosen
solution is weak, because it only defines a number of cases.

Question: If meaningful data from a specific application is stored
within a package yet it is not a (sub) document, why can it not be
added to the manifest:file-entry by the application, with the
requirement that it SHALL be kept intact by other applications - unless
the user explicitly asks for removal, of course?
Question: Why are 'junk files' not added to the metadata, and tagged as
such - so that applications know the file contains something they may
throw away?


Section 3.4: <manifest:encryption-data>

Question: Are multiple checksums allowed and even encouraged? Or can
you sign the checksum?


Section 3.5: <manifest:algorithm>

Question: Why not deprecate the text label in favor of the IRI?
Question: What should an application do if it does not recognise the
algorithm used? Should it show this to the user, and render the parts
of the document it does recognise?


Section 3.6: <manifest:start-key-generation> 

Remark: replace 'password' with 'secret'.


Section 3.8.1: manifest:algorithm-name

Suggestions: Split up between IRI and 'friendly name' for feedback to
the end user.


Section 3.8.3: manifest:checksum-type

Question: Why are the size of the sample (1k), the sample position
(beginning) and the algorithm put into one attribute? This unnecessarily
limits the possibilities to work with very reliable checksums.
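
A short sketch of what separating these three choices could look like.
The function name sample_checksum and its parameters are hypothetical;
the defaults mimic a SHA-1 digest over the first 1K of the data, and the
base64 output encoding is an assumption of the sketch.

```python
import base64
import hashlib
from typing import Optional


def sample_checksum(data: bytes, algorithm: str = "sha1", offset: int = 0,
                    size: Optional[int] = 1024) -> str:
    """Digest a sample of `data`; size=None means 'from offset to the end'."""
    end = None if size is None else offset + size
    digest = hashlib.new(algorithm, data[offset:end]).digest()
    return base64.b64encode(digest).decode("ascii")
```

With the defaults, two files that differ only after the first 1K yield
the same checksum; sample_checksum(data, "sha256", size=None) covers the
whole file instead.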

There is a forward reference to "extended conforming documents" that
needs to be made explicit.


Section 3.8.4: manifest:full-path

The text says: The notation is the same as for the “filename” fields of
the Zip file's central directory.

Question: Why not a proper, absolute reference?

Question: There are likely quite a few scenarios (from dynamically
constructed documents in in-house apps to documents that 'live' in the
internet cloud) where you would want an application to reference an
online object as if it were part of a package. Is there a good reason
not to allow manifest:full-path to include URIs?


Section 3.8.6: manifest:start-key-generation-name

Remark: replace 'password' with 'secret'.
Remark: All applications should support encryption with SHA1, so remove
the words "that support encryption" twice from the last regular
paragraph.
Remark: What does it mean to 'support a value'? Should they be able to
correctly perform the calculations?
Remark: There is a forward reference to "extended conforming documents"
that needs to be made explicit.
Remark: Why is the type of crypto linked to conformance? SHA1 support is
mandatory, and it may not be portable. But it has little to do with
conformance, more with SHA1 being a default.


Section 3.8.9: manifest:key-derivation-name

Remark: The text is not strong in its conformance requirement. Why not
"shall contain" instead of "should contain"?


Section 3.8.9: manifest:media-type

Remark: The sentence "All files that have XML content should have the
media type “text/xml”." seems overly enthusiastic and would dumb down
the understanding of the application. What about XML types that have
their own MIME type, like SVG or RDF?

Remark: It is unclear what the mime type of a directory is. If a
directory has multiple files in it, with multiple mime types, how
should this be handled?


Section 3.8.11: manifest:preferred-view-mode

The text is tailored more towards non-multimedia presentations, than to
modern multimedia use.

Question: What does it mean under "presentation-slide-show": "The
author's preference is to open the document as presentation slide
show."? Should this not say "The author's preference is to open the
document in full screen mode, without the editing interface"?

Remark: Under "read-only": this is a preview; if there were a
full-blown preview PDF (as I suggested in Section 2.8) this might well
suit the needs of the average user that sets this attribute?

Remark: Add 'sound level' with a percentage of audio level as an option
for situations where there is multimedia included with audio that
should not be used.

Remark: It should be possible to define whether the application may ask
to update dynamic objects (like information from a database). If these
objects are in a presentation, and the default application behaviour is
to ask if they need to be updated that might be annoying.

Remark: The sentence "The behavior for cases where the
manifest:preferred-view-mode attribute is absent is implementation
defined." allows for some unpredictable behaviour. Why not require
'edit' to be the default, and ask applications not to set this value
unless the user explicitly demands it? 


Section 3.8.12: manifest:salt

Remark: Are there length constraints on the salt sequence, in order to
avoid stack overflow attacks?
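
As an illustration of what such a constraint could look like on the
consumer side, the sketch below bounds the attribute value before and
after decoding. It assumes the salt is carried as base64 text; the
function name decode_salt and the 64 byte limit are hypothetical and not
taken from the draft.

```python
import base64
import binascii

MAX_SALT_BYTES = 64  # hypothetical upper bound, not from the draft


def decode_salt(attr_value: str, max_bytes: int = MAX_SALT_BYTES) -> bytes:
    """Decode a base64 salt, rejecting empty or oversized values."""
    # Base64 turns every 3 bytes into 4 characters, so bound the text first.
    max_chars = 4 * ((max_bytes + 2) // 3)
    if len(attr_value) > max_chars:
        raise ValueError("salt attribute too long")
    try:
        salt = base64.b64decode(attr_value, validate=True)
    except binascii.Error as exc:
        raise ValueError("salt is not valid base64") from exc
    if not salt or len(salt) > max_bytes:
        raise ValueError("salt length out of range")
    return salt
```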


Section 3.8.14: manifest:version

Remark: "The specified version refers to the format specified in the
media-type attribute of the manifest entry at which it occurs." means
little to me.


Section 4.3: <xmldsig:Signature>

Remark: It is unclear why the exception in "except that the base URI
for resolving relative IRIs shall be the package base IRI." is made.
Remark: It is unspecified how an application indicates that it used
extensions to the [xmldsig-core] specification.


Section 5: Metadata Manifest Files

Remark: In the sentence "Metadata manifest files for sub documents
shall be stored in the sub document's directories." there is room for
misinterpretation as the directories can themselves contain
directories. This should be: "Metadata manifest files for sub documents
shall be stored at the top level of the sub document's directories."


Section 6: Datatypes

Question: Why is this chapter not at the start of the spec?
Remark: there are some textual errors to be fixed, like "have have
additional constrains".


Section 7.2.1: Conforming OpenDocument Packages

Remark: insert the word "following" before "requirements:" several
times.
Remark PD 1.2.4: The constraint that mimetype and meta-inf files cannot
be in manifest:file-entry is not worthy of a conformance requirement -
an app can just throw it away.
Remark PD 1.3: why is this not before PD 1.2, given that it is about the
first file of the zip file?
Remark PD 1.3.2: Why not say something about the content/character
encoding of the mimetype file?


Section 7.4: Consumer conformance

Some conformance dreams:

From a conformance point of view, and to make ODF future proof for new
features, a conforming application shall assume that all content in
the package is meaningful. A conforming application shall thus not
remove any files from within a package that it doesn't understand or
know how to handle - unless there is an explicit request by the user
or the user has manually set a policy. The user shall be able to
instruct the application to keep all content within the package for
non-destructive viewing and editing, and a conforming application
shall be able to honor that request.
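
The behaviour described above can be sketched as a rewrite step that
copies every entry it was not asked to change. This is an illustration
only: the function name rewrite_package is hypothetical, and the sketch
leaves aside updating the manifest and any signatures, which a real
application would also have to handle.

```python
import io
import zipfile


def rewrite_package(src_bytes: bytes, updated: dict[str, bytes]) -> bytes:
    """Rewrite a package, replacing only the entries named in `updated`."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            data = updated.get(info.filename)
            if data is None:
                # Not understood or not touched: copy the content unchanged.
                data = src.read(info)
            entry = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            entry.compress_type = info.compress_type  # keep stored/deflated
            entry.external_attr = info.external_attr
            dst.writestr(entry, data)
    return out.getvalue()
```

Entry order is kept as well, so a leading uncompressed mimetype file
stays where it was.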


Appendix C: Zip File Structure

The normative reference at the bottom should go on top.
