Subject: Re: [office] Encryption and data leakage

I agree (except for file names, at least OOo doesn't keep file names,
but in the end the spec doesn't guide what to do anyway).

I would like to avoid ways 1 and 2 and go directly with way 3.

a) It's IMHO the better approach
b) It avoids interim changes. Those would break compatibility twice, and
I am not even sure anybody would implement them at all.

Another not yet mentioned approach would be to use/allow standard zip
encryption including directory encryption (instead of way 1+2, but not
as a replacement for 3).
But I don't know if this would allow for different algorithms, nor do I
know if the standard zip encryption is considered to be strong.
I guess there are reasons that it hasn't been considered for ODF
encryption from the beginning...


robert_weir@us.ibm.com wrote, On 05/11/10 19:42:
> The approach we inherited from ODF 1.1 encrypts each file in the ZIP 
> independently.  Although the contents of the files are not viewable due to 
> the encryption, there are bits of information that potentially "leak", such 
> as:
> 1) The file size
> 2) The file date
> 3) The file name
> 4) The file mime type
> 5) The hash of the first 1024 bytes of the file
> For example, even in an encrypted document I could see a file name called 
> "big-secret-takeover-june-3.jpg" and know some information that the person 
> who wrote the encrypted document might be rather surprised to see in the 
> open.
> Although not required by ODF, an implementation, if it is clever, can 
> avoid some of these leakages.  For example, the timestamp of the file can 
> be turned into the time of encryption rather than the original time stamp. 
> And the file name can be randomized rather than indicate the original 
> file name.  This might be fine for ODF, since these time stamps and file 
> names do not need to be preserved.  So long as we preserve 
> referential integrity of the package, the names of images are not 
> significant.
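
The "clever implementation" described above can be sketched roughly as follows. This is only an illustration, not anything the ODF spec prescribes: the function name `write_anonymised_package` is hypothetical, the payloads are assumed to be already encrypted, and the caller is assumed to rewrite references inside the package using the returned mapping.

```python
import io
import secrets
import time
import zipfile


def write_anonymised_package(files: dict) -> tuple:
    """Write already-encrypted payloads under random names and a fresh timestamp.

    `files` maps original entry names to bytes. Returns (zip_bytes, mapping),
    where mapping is original name -> random name; the caller needs it to keep
    the package's internal references consistent.
    """
    now = time.localtime()[:6]  # time of writing, not the original file time
    mapping = {}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for original, data in files.items():
            random_name = secrets.token_hex(16)  # reveals nothing about `original`
            mapping[original] = random_name
            zf.writestr(zipfile.ZipInfo(random_name, date_time=now), data)
    return buf.getvalue(), mapping


zip_bytes, mapping = write_anonymised_package(
    {"Pictures/big-secret-takeover-june-3.jpg": b"(encrypted bytes)"}
)
print(zipfile.ZipFile(io.BytesIO(zip_bytes)).namelist())
```

Note that the entry sizes are still visible in the ZIP directory, so this only addresses the name and timestamp leaks.
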
> However we still should be concerned here.  First, the reason we split 
> Part 3 into its own part was the belief that it could be useful for 
> purposes other than just ODF 1.2.  Many of us hoped that it would have other 
> uses.  But I don't think we can assume that all uses can ignore the 
> original file names and time stamps.  These might be significant for some 
> uses. 
> Second, even within ODF, especially if we allow package extensions,  we 
> might see items added to packages where the names of files (which may 
> ultimately be end user-defined) cannot safely be renamed to random names. For 
> example, there may be referential integrity constraints that a generic ODF 
> processor is not aware of.  Maybe there is RDF that points to a contained 
> image or other package resource.  In any case, the approach is very 
> fragile.
> Finally, even without extensions, and with the use of randomized names, we 
> still leak information, based on knowing the size and hash of the first 
> 1024 bytes of the file.  For example, if I have a copy of 
> "big-secret-takeover-june-3.jpg" I can easily check to see what encrypted 
> documents also contain that same image.  I can similarly probe for any 
> other resource where I know in advance its size and/or contents. 
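
To make the probing attack above concrete, here is a minimal sketch. It assumes the leaked hash is a SHA-1 digest and that it is computed over the same byte stream the attacker holds (for instance, the same compressed form); both are assumptions for illustration, as are the names `fingerprint` and `probe` and the shape of `leaked_entries`.

```python
import hashlib


def fingerprint(data: bytes) -> tuple:
    """Return (size, hex digest of the first 1024 bytes). SHA-1 is assumed here."""
    return len(data), hashlib.sha1(data[:1024]).hexdigest()


def probe(known_file: bytes, leaked_entries: list) -> list:
    """Return the names of entries whose visible metadata matches `known_file`.

    `leaked_entries` is a list of (entry_name, size, checksum) tuples, i.e. the
    information readable from an encrypted package without the password.
    """
    target = fingerprint(known_file)
    return [name for name, size, checksum in leaked_entries
            if (size, checksum) == target]


# The attacker holds a copy of the image and sees only randomized entry names.
image = b"\xff\xd8" + b"A" * 5000
leaked = [
    ("a1b2c3", *fingerprint(image)),
    ("d4e5f6", *fingerprint(b"some other resource")),
]
print(probe(image, leaked))
```

Randomized names do not help here, because the match is made purely on size and checksum.
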
> There are three ways of getting around this problem.  (Or at least two 
> that come to mind).  One is to keep a "shadow directory" for the ZIP, that 
> contains the original names, time stamps, and sizes of the files.  Encrypt 
> this "shadow directory" when the document is encrypted.  For each 
> encrypted file, prepend it with some random bytes (not sure what is 
> optimal) in order to prevent data leakage of original size and hash of 
> first 1024 bytes.
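
The random-prefix idea could look something like the sketch below. The layout (a 2-byte length field followed by the padding), the bounds `MIN_PAD` and `MAX_PAD`, and the function names are all assumptions made for illustration; as the message says, it is not clear what amount of padding is optimal. The prefix is meant to be added before encryption and stripped after decryption.

```python
import secrets
import struct

MIN_PAD = 1024  # assumed: enough that the first 1024 bytes hold no file data
MAX_PAD = 4096  # assumed upper bound; must fit in the 2-byte length field


def add_random_prefix(data: bytes) -> bytes:
    """Prepend a random-length run of random bytes, preceded by its length."""
    pad_len = MIN_PAD + secrets.randbelow(MAX_PAD - MIN_PAD + 1)
    return struct.pack(">H", pad_len) + secrets.token_bytes(pad_len) + data


def strip_random_prefix(padded: bytes) -> bytes:
    """Inverse of add_random_prefix: drop the length field and the padding."""
    (pad_len,) = struct.unpack(">H", padded[:2])
    return padded[2 + pad_len:]


original = b"A" * 2000
padded = add_random_prefix(original)
assert strip_random_prefix(padded) == original
print(len(original), len(padded))
```

With this layout the stored size varies from one encryption to the next, and a hash over the first 1024 bytes covers only the length field and random padding.
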
> Another approach is to encode the original full path of the file, appended 
> with its timestamp, using the original derived key, base64 encode that, 
> and then write that out as the full path for the ZIP entry. That way you 
> do not need another file in the ZIP. 
> The other way is to move to a whole-package encryption method, rather than 
> trying to do this file-by-file. 
> -Rob