Subject: Re: [office] Re: [office-metadata] Suggested Changes on the Metadata proposal

I think you're reading too much into the IETF's definition of MAY.  It explicitly says that a vendor is permitted to omit the item, though it must accommodate itself and degrade functionality as necessary.  What is not permitted is that the application utterly crash when presented with an item it does not understand.   At least that is the way it works for the IETF standards I'm familiar with.

Although intuitively we want to say, "Preserve metadata unless the user explicitly intended otherwise," I don't see how to express this in standards terms.  We can't have conformance depend on "user intent".  And reference to a user doesn't help. Documents can be processed by automation, and I think we would be equally unhappy if metadata were arbitrarily stripped there.  In any case, I think we need to work along the lines of "shall be capable of" or "shall allow at least one mode of operation where" or something like that.   That would be testable.  

You suggested that a devious implementation might make this mode of operation hard to find in order to hurt interoperability.   But then I could also suggest a devious user who arbitrarily deletes metadata in order to hurt interoperability.  I'm not sure a document format standard can prevent either.  


marbux <marbux@gmail.com> wrote on 07/01/2007 06:31:11 PM:


> On 7/1/07, robert_weir@us.ibm.com <robert_weir@us.ibm.com> wrote:
> I suppose I should throw in my $.02.
> First, we should remember that ODF mandates behavior at several
> levels.  The schema itself encodes requirements in terms of what
> elements or attributes are optional or mandatory, what nesting is
> permitted, what restrictions there are on data types, etc.   And
> then the normative text of the standard, along with external
> normative references, make additional provisions by the use of
> "shall" and "shall not".  

> But virtually all are undercut by the following sentence in the
> conformance section:
> "There are ***no rules regarding the elements and attributes that
> actually have to be supported by conforming applications,*** except
> that applications should not use foreign elements and attributes for
> features defined in the OpenDocument schema."

> But note that in that case, the provision is only applicable to those
> who implement that feature.  A "shall" concerning the calculation of
> the SUM() spreadsheet function may be totally ignored by someone who
> is implementing a word processor only.  Finally, we have the
> conformance clause, that defines which features and additional
> constraints are required for conformance with the standard.
> Today our conformance clause designates requirements for conformant
> documents, conformant applications that read, conformant
> applications that write, and conformant applications that both read
> and write.  

> We have very few conformance *requirements,* in the sense of
> mandatory requirements. Here is the sum total:
> >>>
> Documents that conform to the OpenDocument specification may contain
> elements and attributes not specified within the OpenDocument
> schema. Such elements and attributes **must not** be part of a
> namespace that is defined within this specification and are called
> foreign elements and attributes.
> ...
> Conforming applications either **shall** read documents that are
> valid against the OpenDocument schema if all foreign elements and
> attributes are removed before validation takes place, or **shall**
> write documents that are valid against the OpenDocument schema if
> all foreign elements and attributes are removed before validation takes place.
> ...
> Foreign elements may have an office:process-content attribute
> attached that has the value true or false. If the attribute's value is
> true, or if the attribute does not exist, the element's content should
> be processed by conforming applications. Otherwise conforming applications
> should not process the element's content, but may only preserve its
> content. If the element's content should be processed, the document itself
> ***shall*** be valid against the OpenDocument schema if the unknown
> element is replaced with its content only.

> Conforming applications ***shall*** read documents containing
> processing instructions and should preserve them.
> <<<
> We should also realize that all of those "may" and "optional"
> requirements keywords changed their meaning between ODF 1.0 and 1.1.
> In ODF 1.0, they meant:
> >>>
> 5. MAY   This word, or the adjective "OPTIONAL", mean that an item
> is truly optional.  One vendor may choose to include the item
> because a particular marketplace requires it or because the vendor
> feels that it enhances the product while another vendor may omit the
> same item. An implementation which does not include a particular
> option MUST be prepared to interoperate with another implementation
> which does include the option, though perhaps with reduced
> functionality. In the same vein an implementation which does include
> a particular option MUST be prepared to interoperate with another
> implementation which does not include the option (except, of course,
> for the feature the option provides.)
> <http://www.ietf.org/rfc/rfc2119.txt>. This is the definition used
> by nearly all OASIS standards.
> <<<
> At ISO's request, that definition changed to:
> >>>
> The verbal forms shown in Table G.3 shall be used to indicate a
> course of action permissible
> within the limits of the document.
> Table G.3 — Permission
>     Verbal form | Equivalent expressions for use in exceptional cases (see
>     may         | is permitted; is allowed; is permissible
>     need not    | it is not required that; no … is required
> Do not use "possible" or "impossible" in this context.
> Do not use "can" instead of "may" in this context.
> NOTE 1
> "May" signifies permission expressed by the  document, whereas "can"
> refers to the ability of a user of the document or to a possibility
> open to him/her.
> NOTE 2
> The French verb "pouvoir" can indicate both permission and possibility.
> For clarity, the use of other expressions is advisable if otherwise
> there is a risk of misunderstanding.
> <<<
> <ch/tiss/iec/Directives-Part2-Ed4.pdf+nnex+H+of+%5BISO/IEC+Directives&hl=en&ct=clnk&cd=1&gl=us>, pg. 62.
> So in ODF 1.0 the keywords "may" and "optional" imported a
> requirement of interoperability. In ODF 1.1, that requirement
> disappeared with the stroke of a pen. My reading of the ISO
> directives suggests that we do not have the option of going back to
> the RFC 2119 definitions. But nonetheless it is my understanding
> that the TC did not study the impact of the change in requirements
> keyword definitions before making the change.
> For example, the use of the word "may" in the preservation of
> foreign elements and attributes section would at least arguably,
> under the RFC 2119 definition, **require** preservation of foreign
> elements and attributes needed for interoperability purposes whether
> or not an application supported foreign elements and attributes.
> But I think it might fly with ISO to use the RFC 2119 definition of
> "may" and "optional" in the conformance section alone and that might
> put us further down the road toward interoperability.

> As you may already know, OASIS has added a new requirement for all
> OASIS standards:
> "A specification that is approved by the TC at the Public Review
> Draft, Committee Specification or OASIS Standard level must include
> a separate section, listing a set of numbered conformance clauses,
> to which any implementation of the specification must adhere in
> order to claim conformance to the specification (or any optional
> portion thereof) "

> I think this is particularly important because procurement officers
> want to be able to simply specify that a candidate application must
> produce conformant format X. They do not want to, in effect, have to
> write their own file format specifications.

> When we make the changes required for the new OASIS rules, I suggest
> we think about conformance in general, and consider making a more
> substantial statement. For example, we could define things at a more
> granular level:  a conformant ODF spreadsheet shall support
> workbooks of at least a single sheet, with at least 100 rows and 25
> columns and at least the Group 1 spreadsheet functions.  (Just an
> example, not a real proposal).  So we have the opportunity to
> specify multiple levels of conformance, either in the main text, or
> as separate profiles.

> +1. I'd add that we should approach such issues with suspicion that
> every option is a potential interoperability breakpoint.

> To the specific question at hand, I am concerned with the loose use
> of the word "preserve."  What exactly does that mean?  For example,
> must the xml:id's of the saved document be lexically identical to
> the read document?  Or are looser versions of equivalence allowed?  
> For example, if the id originally is "foo" and then it is saved with
> the id "bar" is that permitted, provided that the structure and
> referential integrity of the id and references are maintained?  
> Remember, it will be common for an application to read an XML
> document and convert id's and links into internal runtime
> representations that are not at all similar to the XML.  
> Id/references might be converted into C-language pointer references
> between objects, etc.  Then when writing out the document, new
> unique ID's might be generated on-the-fly, perhaps in sequential
> order.  This might vary from implementation to implementation.  
> Beyond referential integrity, I don't know if there is any
> additional value in saying that a document created in KOffice must
> have identical ID labels when that document is later saved in OpenOffice.  
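
To make the "looser equivalence" question concrete, here is a minimal sketch of one possible test: two documents count as equivalent if ids sit on the same elements and every reference points at the same target, whatever the labels are. The `ref` attribute and the function names are assumptions made up for illustration and are not taken from the ODF schema, and the check assumes element order is unchanged between the two documents.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def id_structure(xml_text, ref_attr="ref"):
    """Describe ids and references by element position, ignoring the labels.

    ref_attr is a hypothetical referencing attribute used for illustration.
    """
    elements = list(ET.fromstring(xml_text).iter())
    # Map each xml:id label to the document-order position of its element.
    position_of = {
        el.get(XML_ID): i
        for i, el in enumerate(elements)
        if el.get(XML_ID) is not None
    }
    # Record each reference as (referrer position, target position).
    # A dangling reference gets a target position of None.
    links = [
        (i, position_of.get(el.get(ref_attr)))
        for i, el in enumerate(elements)
        if el.get(ref_attr) is not None
    ]
    return sorted(position_of.values()), links


def same_referential_structure(doc_a, doc_b, ref_attr="ref"):
    """True if both documents link the same elements, whatever the id labels."""
    return id_structure(doc_a, ref_attr) == id_structure(doc_b, ref_attr)


original = '<doc><p xml:id="foo">text</p><note ref="foo"/></doc>'
renamed = '<doc><p xml:id="bar">text</p><note ref="bar"/></doc>'
dangling = '<doc><p xml:id="bar">text</p><note ref="foo"/></doc>'

print(same_referential_structure(original, renamed))
print(same_referential_structure(original, dangling))
```

Under this definition the "foo" to "bar" rename passes, while renaming the id without updating the reference fails. A lexical-identity rule would reject both.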

> I do not have the technical knowledge to answer that question.
> However, I request that we approach the issue from recognition that
> a document may pass through many applications before wending its way
> back  to the originating application. From a layman's view, it would
> seem that a shifting vocabulary would interfere with
> interoperability mightily in situations where it is unknown what
> application will be the next to process a document.  

> We should also note that it is a feature of some programs, such as
> Office 2007, to have a menu item specifically for removing metadata
> from a document, for privacy and security reasons.  I don't think we
> want to prevent such an application from claiming conformance.

> Wouldn't an exception for user initiated actions cover this situation?

> So we need to be very careful how we word this.  Perhaps
> something like "Conforming applications that read and write
> documents shall be capable of "preserving" xml:id's, etc."  With the
> proviso that "preserving" needs a better definition, this ensures
> that conforming applications support preservation, while also
> allowing that not every mode of use may actually do so, such as when
> a user deletes content or metadata, etc.

> I'm not sure that "capable" helps a lot. E.g., if an application is
> capable of preserving metadata but ships with that option turned off
> and an arcane set of keystrokes to enable the option known only to
> the developers, the app is still "capable" of preserving metadata.
> Maybe call that an Easter Egg optional setting.

> While on the subject of the conformance section and requirements
> keywords, we have another problem to deal with. The Notation section
> currently reads:

> "Within this specification, the key words "shall", "shall not", "should",
> "should not" and "may" are to be interpreted as described in Annex H
> of [ISO/IEC Directives] ***if they appear in bold letters.***"

> Between ODF 1.0 and ODF 1.1, many of the keywords lost their
> boldfacing. I suspect that is because we tend to bat language back
> and forth in plain text email, which strips text attributes.

> 1. We could avoid much of that kind of problem in the future if we
> switched to keywords in all caps rather than boldface, since they
> will remain all caps in emails.

> 2. Does anyone know if there are any instances of the keywords that
> should ***not*** be boldfaced (or all caps)? If not, we have a
> simple global search and replace task. If so, we have a tedious
> review ahead of us.
