Subject: Re: [office] A few of specific examples]


Rob,

We already allow arbitrary extension by foreign elements and attributes 
and you offer no examples of actual "evil" use of them with ODF.

Moreover, careful readers will note that ODF 1.2 allows *arbitrary* 
extension by its new metadata mechanism.

Why? Well, because no standards committee, not even ours, can anticipate 
every need of every user. So from the very beginning ODF has allowed 
several mechanisms for arbitrary extension of the defined markup.

Florian, on our last call, had a good use case for older ODF applications 
to process later ODF documents, whose new elements and attributes are 
"foreign" to them.

BTW, ODF and markup technologies are *not* religious issues and so 
differences in markup choices are not "evil." I already have a religion, 
one that has existed a good bit longer than markup technologies and I am 
not in the market for a new one. I think it would help if we all simply 
try to do the best jobs we can and not view markup, including ODF, 
through the lens of good versus evil.

Granted, I have labored over ODF for years, both as a volunteer and more 
recently as a paid editor, and I prefer its choices over other possible 
choices. That's understandable. What I refuse to do is to brand other 
choices as "evil" or somehow beyond the pale. If our choices are the 
better ones, and I certainly think they are, then we should be able to 
demonstrate that to others. Note I said demonstrate, not simply shout it 
at others. There is a difference.

And yes, yes, others may refuse to listen or persist in disagreeing. 
That is certainly their privilege. As I have pointed out in another 
context, listening should not be equated with agreement. I can listen 
quite attentively and still disagree in a civil manner.

Hope you are having a great day!

Patrick

robert_weir@us.ibm.com wrote:
> Some specific examples of how and why arbitrary proprietary extensions 
> are evil.
>
> Two common concerns of users are the need for privacy and security.  
> The issue of personally-identifying meta-data is increasingly in the 
> news. Some products, like Microsoft Office, have a built-in operation 
> that will remove such information from a Word document.  There are 
> also third-party applications that will strip such metadata from a 
> document.
>
> So, suppose you want to write such an operation for an ODF document. 
> What do you do?  Simple enough: look to meta.xml, scrub extension 
> elements under <office:meta>, etc.  The places where metadata is 
> stored are deterministic.  The standard is clear where they are.  But 
> allow arbitrary extensions everywhere, and you have no idea where the 
> metadata is.  Your ability to write a generic tool like this is made 
> far more difficult.  You can't tell whether an extension contains 
> metadata, content, processing instructions, executable code, or whatever.
>
> Similarly, there is the need to scan a document for viruses or 
> malicious macros.  Remember all the Word viruses from a few years 
> ago?  The risk is still there.  Antivirus vendors have been somewhat 
> successful in addressing such risks with mail gateway filters which 
> act in part by examining file attachments and scanning them for risky 
> content.  As a policy some companies will disallow any external 
> document with a macro to go through their firewall.  So how would you 
> do this for an ODF document? Well, ODF says scripts go into the 
> <office:script> element.  So the simple solution is to scan for that 
> element and if it exists, to flag the document as a higher risk.  But 
> with arbitrary proprietary extensions, how do we know that they don't 
> contain executable content?  How does the virus scanner handle 
> arbitrary elements, which may contain metadata, content, processing 
> extensions, scripts or anything?  The easiest solution would be to ban 
> documents that contain extensions.  Is that what we want?
>
> Similarly, a search engine will want to find all text in a document 
> for indexing.  Reading the ODF specification it is clear what is 
> content and what is not, so a proper indexer can be written.  But with 
> arbitrary proprietary extensions, this task is impossible; I would 
> not know whether the extension elements should be indexed or not.
>
> Also consider a program that translates a document from one language to 
> another, preserving all formatting and styles.  Reading the ODF spec, 
> I can easily determine what elements are content and which are not and 
> then run machine translation on just the content.  But with arbitrary 
> proprietary extensions, I have no idea.  I risk doing a partial 
> translation, if the extension elements represent user-visible content.
>
> There is also the question of document referential integrity.  Suppose 
> I want to write a program that takes a large ODF document and splits 
> it up into chapters, one ODF document per chapter.  According to the 
> ODF standard this is easy.  I can trace the style dependencies and 
> duplicate what is needed and make several documents from a single ODF 
> document. Similarly, I could take multiple ODF documents and combine 
> them into a single document, merging the styles as needed.  But in the 
> presence of arbitrary proprietary extensions I cannot do either of 
> these operations safely, since I do not understand the semantics of 
> these extensions.
>
> Now I can imagine a well-thought-out extensibility mechanism that 
> would address the above concerns.  I'd certainly entertain any such 
> proposals. But merely saying "The X in XML stands for 
> eXtensibility" is not a considered engineering approach.   
> Extensibility requires that we think out issues such as versioning, 
> content negotiation, fall-backs, namespacing, round-tripping, as well 
> as offer clear guidelines for how extensions declare whether they 
> contain translatable text, metadata, executable code, or other 
> categories of importance.  The fail-safe approach is to remove this 
> option until such time as we can do it right.
>
> If there is sufficient interest to work on this, we could create a new 
> subcommittee on extensibility to work on developing a detailed 
> proposal in this area, obviously for consideration post ODF 1.2.
>
> -Rob
>
> ---------------------------------------------------------------------
> To unsubscribe from this mail list, you must leave the OASIS TC that
> generates this mail.  Follow this link to all your TCs in OASIS at:
> https://www.oasis-open.org/apps/org/workgroup/portal/my_workgroups.php
>
>   
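
As a rough illustration of the scanning scenario quoted above, here is a 
minimal sketch, not a definitive implementation, of a check that flags 
both <office:script> elements and elements from namespaces a scanner does 
not recognize. The function name scan_odf_xml, the allowlist 
KNOWN_EXTERNAL, and the rule "any namespace outside the ODF URN prefix or 
the allowlist counts as foreign" are assumptions made for illustration; 
the allowlist is deliberately incomplete.

```python
import xml.etree.ElementTree as ET

# Common prefix of the namespaces defined by ODF itself.
ODF_PREFIX = "urn:oasis:names:tc:opendocument:xmlns:"
OFFICE_NS = ODF_PREFIX + "office:1.0"

# Assumption: a small, non-exhaustive allowlist of external namespaces
# that ODF reuses. A real tool would need the full list from the schema.
KNOWN_EXTERNAL = {
    "http://purl.org/dc/elements/1.1/",
    "http://www.w3.org/1999/xlink",
    "http://www.w3.org/1998/Math/MathML",
}


def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag, or '' if none."""
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def scan_odf_xml(xml_text):
    """Report script elements and foreign elements in one ODF XML stream."""
    root = ET.fromstring(xml_text)
    has_script = False
    foreign = []
    for elem in root.iter():
        if not isinstance(elem.tag, str):
            continue  # skip comments or processing instructions, if present
        ns = namespace_of(elem.tag)
        if elem.tag == "{%s}script" % OFFICE_NS:
            has_script = True
        elif not ns.startswith(ODF_PREFIX) and ns not in KNOWN_EXTERNAL:
            # The scanner cannot tell what this element means.
            foreign.append(elem.tag)
    return {"has_script": has_script, "foreign_elements": foreign}
```

A gateway could then apply whatever policy it prefers to the result: 
flag on has_script, and flag, quarantine, or pass documents with a 
non-empty foreign_elements list. The sketch only locates foreign 
elements; it says nothing about what they contain.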

-- 
Patrick Durusau
patrick@durusau.net
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
