
cgmo-webcgm message


Subject: RE: [cgmo-webcgm] questions about the Defect Report

[...Replying to Forrest and TC list...]

At 08:31 AM 5/31/2007 -0500, Forrest Carpenter wrote:
It would seem simpler, easier, and faster to change the CGM standard to allow v4 application structure elements to occur in the TOS and define the restrictions we need in the Web2.1 profile.

The challenge, as I see it, is to define the fixes to the CGM standard in such a way that they will pass rigorous scrutiny by SC24.  By "rigorous" I mean formally sound, as opposed to simply implementable.

Those 6 examples show the difficulty with the formal specification, because of the odd way that CGM:1987 was designed (RT, AT, etc).

For example, allowing APS elements to "occur in the TOS" does not cover the case of tagging the initial substring of a RT..AT..AT.. sequence, because then the beginAPS occurs in POS and the endAPS occurs in TOS.  It is this kind of nastiness of formal specification, even in one of our mainline use cases, that I fear SC24 might dislike.

One possibility is to prohibit those edge cases -- tagging the initial fragment or the final fragment.  The effect of tagging initial and final fragment could still be achieved as follows: 

Initial fragment
RT (x,y) (extent) (0 chars) (notfinal)
   AT (n chars) (notfinal)
AT (m chars) (final)

Final fragment
RT (x,y) (extent) (n chars) (notfinal)
  AT (m chars) (notfinal)
AT (0 chars) (final)

Essentially, you can only tag a non-final AT fragment.  In both these cases, <bas> occurs in TOS, and </bas> returns you to TOS.  (We still have the issue that AT occurs in SOS, but I think that might be tractable.) 
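
The zero-length padding above can be sketched in a few lines of Python.  This is only an illustration under assumptions: the Frag record and the pad_for_tagging helper are hypothetical names, a fragment carries just a kind, a string, and a final flag (the RT position and extent are omitted), and the span to be tagged is given as an inclusive index range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frag:
    kind: str    # "RT" or "AT" (hypothetical model, not a CGM encoding)
    text: str
    final: bool


def pad_for_tagging(frags, first, last):
    """Pad a compound-text run so frags[first..last] are all non-final ATs.

    Returns the padded list plus the shifted (first, last) indices.
    """
    out = list(frags)
    if last == len(out) - 1:
        # Tagged span reaches the final fragment: demote it to not-final
        # and let an empty "AT (0 chars) (final)" close the text.
        tail = out[-1]
        out[-1] = Frag(tail.kind, tail.text, False)
        out.append(Frag("AT", "", True))
    if first == 0:
        # Tagged span starts at the RT: move its string into an AT and
        # let an empty "RT (0 chars) (notfinal)" open the text.  In a real
        # metafile the new RT would keep the original (x,y) and extent.
        head = out[0]
        out[0] = Frag("AT", head.text, head.final)
        out.insert(0, Frag("RT", "", False))
        first, last = first + 1, last + 1
    return out, first, last


run = [Frag("RT", "abc", False), Frag("AT", "def", True)]
padded_initial = pad_for_tagging(run, 0, 0)
padded_final = pad_for_tagging(run, 1, 1)
```

With this model, tagging the first fragment yields RT(0 chars), AT("abc"), AT("def", final), and tagging the last fragment yields RT("abc"), AT("def"), AT(0 chars, final), matching the two layouts shown above.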

I think this could be relatively straightforward to define in the formalisms, the State Table and the EBNF.  In other words, the SC24 folks might be satisfied with it.

Yes, it is a little annoying, but the root cause (probably unavoidable) is the non-OO way that CGM:1987 defined compound text.  (The proposal emulates a more OO compound text element, with "RT (0 chars) (notfinal)" supplying the <begin-compound-text> when needed and the "AT (0 chars) (final)" supplying the <end...> when needed.)

Would vendors be agreeable to build their products that way, if it were specified that way in a defect correction (and incorporated into WebCGM 2.1)?


From: Lofton Henderson [mailto:lofton@rockynet.com]
Sent: Wednesday, May 30, 2007 5:05 PM
To: cgmo-webcgm@lists.oasis-open.org
Subject: RE: [cgmo-webcgm] questions about the Defect Report


Thanks for the feedback Forrest.

I have been doing some more detailed technical stewing over this.  I suggest everyone have a careful read & think.  This might be a topic for next Wed telecon.

At 08:53 AM 5/30/2007 -0500, Forrest Carpenter wrote:

I agree with Lofton, a bas should only be allowed within not-final text if
the eas is contained within the not-final text or immediately following the
final text. An eas should only be allowed within not-final text if the bas
is within the not-final text or immediately preceding the RT that begins the

We all have a clear notion of what we want to achieve -- to simply APS-tag a partial text string, subject to restrictions like those, with no "other-graphical-stuff" allowed within the defined APS.

To simplify the state table this restriction could be defined in the
Web2.1 profile.

Here is what I'm struggling with.  Altho' we can define the concept very simply and directly (with words, at least), in order for it to fly within SC24 as a defect correction, I suspect that we will need to define it precisely in terms of CGM:1999's formalisms:  the normative State Model (sec.6, tbl.8), and the normative EBNF of annex H.

Here is what I'm struggling with now.  I took the six examples that we agree are desirable, that have strong use-case support, and that should have been allowed in CGM:1999 (V4).   Below I changed <eas> to </bas>, for clearer illustration. 

Below each progression, I tried to identify the state transitions that occur after each element.  There are places where I just can't figure out what to do.  For example...

AT(notfin) can only occur in TOS, and doesn't cause a state transition.  And </bas> would be expected to "pop" the state that was in effect at <bas>.  See below examples.  (You need to look at the following in mono-spaced type, which is how I have formatted it.)

..<bas> RT(notfin) ... AT(notfin) ... AT(fin) </bas>......

..<bas> RT(notfin) ... </bas> AT(notfin) ... AT(fin)

..<bas> RT(notfin) ... AT(notfin) </bas> ... AT(fin)

..RT(notfin) ... <bas> AT(notfin) </bas> ... AT(fin)

..RT(notfin) ... <bas> AT(notfin) ... AT(fin) </bas>

..RT(notfin) ... AT(notfin) ... <bas> AT(fin) </bas>

We're facing the problem that, conceptually, we have sequences such as:

<bas> <RT> </bas> </RT>

This in turn could be seen as deriving from the fact that V1 CGM defined things like RT and AT in an awkward, non-OO way.  What it really should have done (in 1987?!) is define a proper Compound Text structure:

<begin-Compound-Text  (x,y) (extent)>
  <text-fragment (string)>
  <text-fragment (string)>
  <text-fragment (string)>

If it were so defined, then all of the partial-text tagging could be cleanly defined (and the state model rules easily written down), e.g....

<begin-Compound-Text  (x,y) extent>

It is a vexing problem -- how to express what we know should happen (that we can easily describe with words) in the formalisms of the ISO CGM document.  But IMO we must sort it out, if we are to succeed with a SC24 defect correction.

I think we will have better luck with the EBNF formalism (even tho' we still have to resolve the State Model conundrum).  Currently the EBNF says...

<BAS>  <picture element>* </BAS>

This could become...

<BAS>  ( <picture element>*  | <partial-text-element> )  </BAS>

and then we could proceed like this...

<partial-text-element> ::= <initial> | <middle> | <terminal>
<initial> ::= <introducer> <not-final-fragment>+
<middle> ::= <not-final-fragment>+
<terminal> ::= <not-final-fragment>* <final-fragment>

I think we can continue to refine that down to acceptable terminal productions, well defined, etc.  (Altho' I haven't done it yet.)
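
As a rough check of the draft productions, they can be expressed as regular expressions over one-letter element codes.  This is a sketch only, and the terminals are assumptions, since the productions above are not yet refined that far: <introducer> is taken to be RT(notfinal), <not-final-fragment> to be AT(notfinal), and <final-fragment> to be AT(final).  The names encode and classify are hypothetical.

```python
import re

# Assumed terminals: R = RT(notfinal), a = AT(notfinal), A = AT(final).
PRODUCTIONS = {
    "initial": re.compile(r"Ra+"),    # <introducer> <not-final-fragment>+
    "middle": re.compile(r"a+"),      # <not-final-fragment>+
    "terminal": re.compile(r"a*A"),   # <not-final-fragment>* <final-fragment>
}

CODES = {("RT", False): "R", ("AT", False): "a", ("AT", True): "A"}


def encode(elements):
    """Map (kind, final) pairs to one-letter codes; unknown pairs become '?'."""
    return "".join(CODES.get(e, "?") for e in elements)


def classify(elements):
    """Return the name of the production the sequence matches, or None."""
    s = encode(elements)
    for name, pattern in PRODUCTIONS.items():
        if pattern.fullmatch(s):
            return name
    return None
```

For example, classify([("RT", False), ("AT", False)]) gives "initial" and classify([("AT", False), ("AT", True)]) gives "terminal".  Under these assumed terminals a lone RT(notfinal) matches nothing, because <initial> requires at least one not-final fragment after the introducer.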

Anyone have thoughts or contributions on this, or ideas how to deal with it?  Is an ISO defect correction (for partial-text tagging) actually going to be our best strategy to get the functionality into 2.1?

(Btw ... playing devil's advocate, I could also imagine ISO folks asking, "...why not also allow partial-closed-figure tagging, and partial-tile-array tagging?"  Answer:  we have no strong use case like partial-text tagging.)


-----Original Message-----
From: Lofton Henderson [mailto:lofton@rockynet.com]
Sent: Tuesday, May 29, 2007 6:36 PM
To: cgmo-webcgm@lists.oasis-open.org
Subject: [cgmo-webcgm] questions about the Defect Report


Okay, here are some gritty technical details and issues about the potential
Defect Report for partial-text/APS.  Feedback invited.

Preliminarily, part 1 of CGM:1999 will be affected in these places:

** State Table, table 8, beginning on p.104
** Annex H, formal grammar of v4 metafiles
** (maybe) 7.6.4 - 7.6.6, descriptions of Text, RT, AT

Let me note that there is already an error in Table 8, p.105:  Append Text
is not allowed in SOS.  So according to that error...

legal:  <bas>  RT (final)  <eas>
illegal:  <bas>  RT (notfinal) ... AT (final) <eas>

[Notation:  <bas> is Begin APS Structure, <eas> is End APS Structure, RT is
Restricted Text element, etc]

That was certainly unintended.

It looks like it could be tedious to work Table 8 and Annex H to say
exactly what we wanted.  First we have to specify exactly what we wanted...

Conceptually, the following cases clearly should be legal (as well as
similar ones substituting Text instead of RT) in V4 CGM:1999.  I.e., the
valuable use case that V4 unintentionally threw out is to allow an
"attributed partial text string" to be an Application Structure.

<bas>  RT(notfinal) ... AT(notfinal) ... AT(final) <eas>
<bas>  RT(notfinal) ... <eas> AT(notfinal) ... AT(final)
<bas>  RT(notfinal) ... AT(notfinal) <eas> ... AT(final)
RT(notfinal) ... <bas> AT(notfinal) <eas> ... AT(final)
RT(notfinal) ... <bas> AT(notfinal)  ... AT(final) <eas>
RT(notfinal) ... AT(notfinal)  ... <bas> AT(final) <eas>

What about cases like the following?

<bas>  [some-other-graphical-stuff]  ... RT(notfinal) ... <eas>
AT(notfinal) ... AT(final)
RT(notfinal) ... <bas> AT(notfinal)  ... AT(final) ...
[some-other-graphical-stuff] <eas>

It is my sense that these are ugly without commensurate value.  So should
they be excluded?

Making explicit the inclusion/exclusion of those ugly cases is what will
make it tedious to detail the fixes to Annex H, especially.  I think the
fixes will be easier to specify cleanly if we *exclude* the ugly cases.
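
To make the "exclude the ugly cases" rule concrete, here is a minimal Python sketch of a checker.  It is an illustration under assumptions, not a rendering of Table 8 or Annex H: each element is reduced to a hypothetical one-letter code, the "..." gaps in the examples are ignored, only one APS per sequence is considered, and the helper name aps_is_clean is invented for this example.

```python
import re

# Assumed one-letter codes: R = RT(notfinal), a = AT(notfinal),
# A = AT(final), x = any other graphical element,
# < = Begin APS (<bas>), > = End APS (<eas>).
# Inside the APS we accept only a contiguous piece of one text run.
INSIDE_OK = re.compile(r"R?a*A?")


def aps_is_clean(seq: str) -> bool:
    """True if the single APS in seq wraps only text fragments."""
    start, end = seq.find("<"), seq.find(">")
    if start < 0 or end < start:
        return False
    inside = seq[start + 1:end]
    return bool(inside) and INSIDE_OK.fullmatch(inside) is not None


six_cases = ["<RaA>", "<R>aA", "<Ra>A", "R<a>A", "R<aA>", "Ra<A>"]
ugly_cases = ["<xR>aA", "R<aAx>"]
```

With these encodings, all six desirable cases pass the check and both ugly cases fail it, because each ugly case has an "x" between the Begin APS and End APS.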


