oiic-formation-discuss message
Subject: Re: [oiic-formation-discuss] My perspective
- From: robert_weir@us.ibm.com
- To: oiic-formation-discuss@lists.oasis-open.org
- Date: Mon, 30 Jun 2008 08:56:13 -0400
Thomas Zander <zander@kde.org> wrote on 06/29/2008 05:34:56 PM:
>
> To give my perspective on how to do conformity checking, I want to give my
> view on what ODF is. This may sound strange to you as we all know what ODF
> is, right? But actually I realized that a lot of people have only a very
> different and often limited set of usecases in mind when they think about ODF.
> ODF has been created for an office suite, applications that show you a text
> document or a spreadsheet and that want to save that. This is a valid
> usecase, but a simple one.
> More exciting usecases are things like:
> * website generates an ods file for download. So if you have a website that
> gives you access to all your (or your company's) contacts you can download a
> selection and use that in your spreadsheet or in your word processor for
> mailmerge.
>
> * ODF combines things like svg and mathml, which means it can be used as a
> file format for clipart. Meaning can be that your vector graphics get stored
> in ODF, but also a text-snippet or just a logo. I'll let you come up with
> usecases yourself, but there are plenty ;)
>
> * Currently the format of choice when doing rich text is html. So, if I
> copy/paste or I email, html is created. This is sub-optimal. Html is a broken
> format on many levels and has various security problems as well. Much better
> would be to use ODF as a clipboard or email format. Usecases range from
> having the ability to copy paste all text you have on your desktop (all text
> entry fields) as odf so simple annotations but also things like bold will
> survive copy paste. Sending emails as ODF xml streams is something I think
> will happen in under 5 years.
>
These are good points to be reminded of, especially
when we mention interoperability. We sometimes risk dwelling too
much on person-to-person exchange of documents, and defining interoperability
purely with respect to that use case. But as a free, open, XML-based
standard, ODF lends itself to server-based processing and other modes of
use. For server-to-server use, say indexing for a search engine,
information extraction, combining documents, report generation, etc., the
UI rendering is totally irrelevant. In these cases the most important
factors for interoperability are validity (from a schema perspective) and
the simplicity of the data and metadata models.
> This short list of different fields of use shows that there are a lot of things
> to consider when looking into the issue of interoperability. For instance, do
> we require copying text from a full-blown word processor and pasting that
> into a simple text field to preserve bold/italic data when at a later point I
> copy that same text again and paste it into my rich-text editor in my email
> application? The textfield would not be able to show this data, so copying
> it out later again seems a bit odd.
>
> So, the point I'm trying to make here is that if we want to have ODF working
> across a large range of usecases having a simple metric of rendering or of
> preserving doesn't make much sense. It would likely just hamper uptake since
> the most exciting usecases would not be able to claim ODF compliance.
>
It is a balancing act. In a sense, the ODF TC
can define conformance however it wants. We can have a very loose
definition that makes many applications conformant. Or we can have
a very strict definition that no existing ODF application can pass. I
don't think it makes sense to define conformance for ODF to be such that
only heavy-weight, traditional desktop editors can claim conformance. Doing
so would risk leaving out the most interesting and vibrant part of the
market today.
> What do I think we want to test? There are some lessons I learned from working
> on the ODF testsuite;
> * I want to have a way to test each feature in the ODF specification. To the
> level of each element and each value that you can give.
OK. We've been calling such tests "atomic
tests" since they test at the level of individual features of the
ODF standard.
> Now, what specifies a passing test should be separated into several points,
> since applications typically can pass one and fail another. I suggest these
> points to be something along the lines of
> a) Loading the test data and displaying it on screen (correctly ;)
> b) Saving the loaded document out again and not losing information.
> c) Having GUI to alter the ODF feature to all the supported values.
>
So you are starting with a document that is already
correct (well-formed, valid, etc.) and checking how an ODF application
treats it.
We've been describing such tests as falling into one
of two buckets:
1) Conformance tests would be tests that are traceable
to formal provisions of the ODF standard. So they are things that
are testable relative to a specific shall/should/may, etc., in the text
of the standard. Violations of requirements (shall's) would be errors
and violations of recommendations (should's) would be warnings.
2) Interoperability tests would contain further tests
which might not be formally traceable to a shall or a should, but are stated
as definitions in the text, or are clearly implied. This would be
a judgement call of the proposed OIIC TC. For these items we probably
do not want to score them as passing or failing, but as suggestions, based
on the consensus of the TC. To the extent ODF implementors run these
interoperability tests and adjust their implementations based on this,
then we will improve interoperability.
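To make the two buckets concrete, here is a minimal sketch of how a test harness might report failed checks: failed shall's as errors, failed should's as warnings, and failed interoperability checks as suggestions rather than pass/fail. The check names, the `Level` enum and the `report` function are all hypothetical, invented for illustration; nothing here is taken from an existing ODF test suite or from the text of the standard.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    SHALL = "shall"      # formal requirement in the standard
    SHOULD = "should"    # formal recommendation in the standard
    IMPLIED = "implied"  # definition or clear implication, a TC judgement call


# Hypothetical mapping from provision level to how a failed check is reported.
SEVERITY = {
    Level.SHALL: "error",
    Level.SHOULD: "warning",
    Level.IMPLIED: "suggestion",
}


@dataclass
class Check:
    name: str      # illustrative name only, not a real test identifier
    level: Level
    passed: bool


def report(checks):
    """Split failed checks into conformance findings and interop suggestions."""
    conformance, interop = [], []
    for c in checks:
        if c.passed:
            continue
        finding = (c.name, SEVERITY[c.level])
        if c.level is Level.IMPLIED:
            interop.append(finding)      # bucket 2: not scored as pass/fail
        else:
            conformance.append(finding)  # bucket 1: traceable to shall/should
    return conformance, interop


checks = [
    Check("example-requirement-1", Level.SHALL, passed=False),
    Check("example-recommendation-1", Level.SHOULD, passed=False),
    Check("example-implied-behaviour-1", Level.IMPLIED, passed=False),
    Check("example-requirement-2", Level.SHALL, passed=True),
]
conformance, interop = report(checks)
print(conformance)
print(interop)
```

Keeping the two result lists separate mirrors the point above: only the first list says anything about conformance, while the second is advice based on TC consensus.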
There has also been talk about an "Acid"
test for ODF, similar to the CSS2 Acid test, that would test a number of
features together and give a properly formatted image if the tests all
passed. There was a prototype posted a while ago using a spreadsheet,
where each cell tested a formula and used conditional formatting to display
a smiley face if the formulas all tested correctly. I see this as
a different way of presenting the results of a conformance or an interoperability
test, though not a different category of test.
> As an example; an implementation could be able to have a list-item type
> of "Arrow" but when it loads it, it silently converts the list type to a
> unicode character and on saving the list type is no longer arrow, but
> character. So the first test would pass, but the second would not.
>
> This brings up an important problem; when an implementation does not support a
> certain feature at all, does that mean loading a document and saving it out
> again will lose those features? I think this is an important part of our
> interoperability question.
> To answer this question we have to make an important distinction between known
> ODF features and unknown ODF features. A known feature is something that is
> detailed in the specification, but this implementation does not support. For
> instance because its text rendering engine is not powerful enough.
> Completely separate from this is unknown metadata or plain foreign tags. For
> example an ODF implementation may add some new feature that is not (yet)
> supported by ODF and it saves it in its own namespace. This new feature is
> not possible to support for most other applications, but it may save the tag
> out again.
>
This is a useful distinction. We have a name
for when an application drops data. It is called "data loss".
Although we don't call this out in the ODF standard, if an application
does not support, say, footnotes, and then silently eliminates footnotes
from documents that it edits, then it is causing data loss.
But on the other hand, in some cases you want data
loss. For example, there is a utility that scans ODF presentations
and searches for embedded bitmap images, and if they are at a very high
resolution, downsamples them to something more reasonable for screen
presentations, thus reducing the size of the ODP file. This is data
loss, since the utility has thrown away information. Similarly, MS
Office has a mode where it scans a document for user metadata (author,
minutes spent editing, etc.) and deletes that. This is data loss,
though useful and intentional.
So when defining conformance, we need to be sure that
we don't define it so that we define-away useful applications. If
something can only be judged in context by a user's intent, then it probably
doesn't belong in a conformance clause.
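As a rough illustration of the known/foreign distinction and of data loss on round trip, the sketch below compares the elements in a document's XML before and after an application has loaded and saved it, and reports which elements disappeared. It is only a sketch under simplifying assumptions: "known" is approximated as "element namespace starts with the `urn:oasis:names:tc:opendocument:xmlns:` prefix", which ignores other namespaces ODF reuses; the function names and the `urn:example:vendor` namespace are made up; and the input is a toy fragment, not a valid ODF document. Note that it only detects loss. Whether the loss is a defect or the whole point of the tool is exactly the part that depends on the user's intent.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: treat any element in a namespace with this prefix as "known" ODF,
# and everything else as "foreign". Real ODF also reuses other namespaces.
ODF_NS_PREFIX = "urn:oasis:names:tc:opendocument:xmlns:"


def element_counts(xml_text):
    """Count element tags (in '{namespace}local' form) in an XML string."""
    return Counter(el.tag for el in ET.fromstring(xml_text).iter())


def dropped_elements(before_xml, after_xml):
    """Return (known, foreign) dicts of elements lost between two versions."""
    # Counter subtraction keeps only tags whose count went down.
    lost = element_counts(before_xml) - element_counts(after_xml)
    known = {t: n for t, n in lost.items() if t.startswith("{" + ODF_NS_PREFIX)}
    foreign = {t: n for t, n in lost.items() if t not in known}
    return known, foreign


TEXT_NS = ODF_NS_PREFIX + "text:1.0"
before = (
    f'<text:p xmlns:text="{TEXT_NS}" xmlns:x="urn:example:vendor">'
    "Body<text:note>a footnote</text:note><x:widget/></text:p>"
)
after = f'<text:p xmlns:text="{TEXT_NS}">Body</text:p>'

known, foreign = dropped_elements(before, after)
print(known)    # lost elements in the ODF namespaces
print(foreign)  # lost elements in other namespaces
```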
> So, if anyone asks if an ODF application has round-trip preservation of
> properties I want the first counter question to be if this is about known or
> foreign properties ;)
> Each of those two categories should have the 3 questions (a, b & c) answered.
> This matrix of checkboxes totally makes up for feature-support.
>
> For interoperability the above will get you a long way, but there are lots of
> implementation details that may not be covered by the feature matrix. One
> good example is the basics of linebreaking. See
> http://www.kdedevelopers.org/node/2262 for some research I did on this topic
> in the past (sorry, image links broken)
> The correct typographical (in case of text) or otherwise correct displaying of
> a certain concept warrants a separate set of tests.
>
So long as that is in the "interoperability test"
bucket, then I'm OK with this. But we need to keep the "conformance
test" bucket so it only contains tests for things that are formal
requirements of the ODF standard.
>
> Conformance testing, why? And how?
>
> Up until now I have talked mostly about the concept of testing and what to
> test. I simply skipped over the 'why' question. Which may be something
> people have not come up with a good answer to yet.
> The simple answer is that testing means nothing more than the process towards
> creating a good implementation.
> The more complex answer is because it is good for interoperability; it creates
> something to aim for and it means the more experienced people get to point
> out common pitfalls to the newcomers. But end users can also find out what
> support another implementation has and that in and of itself means we enable
> market forces. The best implementation will get new users faster.
>
> With several answers to the 'why' this gives us some insight into the format.
> I heard people say we need profiles. I personally think that profiles sounds
> wrong; what I think is really important is feature-groups. Does
> implementation X support lists or tables very well? If not, I won't check it
> out. If I need to write my formula, I go for the app that supports 80% of
> the spec instead of the one with 20% support. Etc.
>
I think of a profile (or at least one kind of a profile)
as a specification that records a set of "feature-groups" that
together solve a recurring problem. For example, any web-based word
processor will have the common problem that their toolkit, the set of text
and graphical operations that they can render, is limited to that expressible
by HTML/CSS2. So when using ODF, every web-based word processor will
have similar things that they will have difficulties with. However,
we can define an "ODF/Web" profile that defines exactly what
subset of the ODF standard is losslessly mappable to the web toolkit, and
by agreeing on this profile, we can achieve much greater interoperability
between web-based word processors. At the same time we allow traditional
desktop word processors to have a "Save As ODF/Web" option.
Such a profile becomes a standard in itself, with
its own definition of conformance, and may be followed by its own test
suite.
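One way to picture a profile in this sense is as a named set of feature-groups, with conformance to the profile meaning that a document uses nothing outside that set. The sketch below assumes exactly that and nothing more. The feature-group names and the contents of `ODF_WEB_PROFILE` are invented for illustration; an actual "ODF/Web" profile would have to be defined by reference to specific parts of the ODF standard.

```python
# Hypothetical feature-group names; a real profile would cite the ODF standard.
ODF_WEB_PROFILE = frozenset(
    {"paragraphs", "lists", "tables", "hyperlinks", "images"}
)


def outside_profile(document_features, profile=ODF_WEB_PROFILE):
    """Return the feature-groups a document uses that the profile omits."""
    return sorted(set(document_features) - profile)


def conforms_to_profile(document_features, profile=ODF_WEB_PROFILE):
    """A document conforms if it uses no feature-group outside the profile."""
    return not outside_profile(document_features, profile)


desktop_doc = {"paragraphs", "tables", "footnotes", "text-on-path"}
web_doc = {"paragraphs", "lists"}

print(outside_profile(desktop_doc))   # what "Save As ODF/Web" would have to handle
print(conforms_to_profile(web_doc))
```

A desktop editor offering "Save As ODF/Web" could use a check like `outside_profile` to tell the user which features would not survive, instead of dropping them silently.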
> After having gone over what to test, why to test and how to present the
> findings there is the silly question of how to do it. I think this is more
> something for future mails on the actually created TC, but I can at least
> point out my experience.
> The easiest way is to create a set of little documents, but I am not 100%
> convinced it works (without modifications). The reason it didn't work is
> because we ended up with different people interpreting the tests on screen
> differently. For example loading a doc which turned on a feature didn't load
> correctly in implementation A, one author came and said the test should be
> marked as passed because all the user had to do is go to the menu and
> manually turn on that feature. Naturally I disagreed, it was the loading I
> tested, and that didn't work.
>
> Another approach I'm working on now seems to work pretty well, but is not
> really easy to test. I explain it here:
> http://labs.trolltech.com/blogs/2008/06/11/testing-typography/
> Basically, you need a common set of documents, as before, but you additionally
> need a set of automated tests to judge the outcome. It has to be automated
> so it can be run every week (or day!) instead of once and then never again.
> The biggest problem with this is that for each implementation you need
> money to implement it. And someone that knows what he is talking about to
> actually approve the tests.
>
> So, bottom line, there is no silver bullet and any progress in this area is
> welcome. I've been working for some two years on making sure the KOffice2
> implementation will be the best ODF implementation there is, and I do realize
> that conformance testing is an essential part of this process.
>
> Thanks for reading till the end, flames and thank-you notes welcome :)
> --
Thanks for your thoughts.