cgmo-webcgm message

Subject: Re: [cgmo-webcgm] Fwd: WebCGM 2.0 and ISO/IEC 2022 switching


Here's my take on the question, embedded below.  I'd like to hear from other TC members, if you have an opinion (and be sure to copy Jeff).

At 09:45 AM 3/25/2006 -0700, Lofton Henderson wrote:

Date: Fri, 24 Mar 2006 15:23:45 -0800
From: Jeff Wolkenhauer <jeffrey.wolkenhauer@ugs.com>

=======================================
Hello group,

I am hoping I can get some clarification regarding the use of ISO/IEC 2022 switching within non-graphical strings in WebCGM 2.0.

First, my understanding from the PPF is that
  1) ISO/IEC 2022 switching is allowed within SF strings.
  2) The only allowed encodings are ISO Latin1, UTF-8, and UTF-16.
So a CGM generator may switch between any of those three encodings.

Is this correct?

I ask because the phrase in the section T.14.5 of the PPF:
  "Otherwise, the use of ISO 2022 switching
   is prohibited in non-graphical text string."
confused me at first, but its placement in the text and the other references to allowing 2022 switching imply that the phrase means, "Use of character sets other than ISO Latin1, UTF-8, and UTF-16 is prohibited for ISO 2022 switching in non-graphical text strings".

T.14.5 is here:
http://docs.oasis-open.org/webcgm/v2.0/WebCGM20-Profile.html#webcgm_4_3

Quoting the text:
[[[
 The permitted character sets for non-graphical text are ISO Latin 1 (LHS & RHS), and UNICODE UTF-8, and Unicode UTF-16. Only one of these three shall be used throughout any particular WebCGM metafile instance. According to the CGM standard, the default SF character set, at the beginning of the 'metafile id' parameter of the BEGIN METAFILE element is ISO Latin 1. If the metafile is to use UTF-8 for SF parameters, then the following 4-octet ISO 2022 sequence shall occur as the first 4 octets of the 'metafile id' parameter: ...etc... 
]]]

I read this to mean that ISO 2022 switching can *only* happen once, at the beginning of the Begin Metafile id string, and that the selected SF character set then must pertain for the whole metafile.  (This is the way I have read WebCGM 1.0 as well.)


If the answer to my first question is "yes", my second question relates to the scope of the 2022 switch.  Is the character set invoked only for that string (my hope) or for all subsequent SF strings in the rest of the metafile?

I believe that the SF character set is established for all SF strings in the metafile.


For example, if the default ISO Latin is in effect, and a screen tip is to be output in Japanese using UTF-8, I would prepend <ESC 2/5 2/15 4/9> to the string.  But I am hoping that subsequent SF strings can be specified in Latin again - especially since UTF-8 has no return.

I see your use case, but in my view, it is not supported in 2.0, nor in 1.0.

Having said that, I do see one small loophole in the wording of T.14.5.  It says that only one of the three sets is permissible within a metafile.  As worded, it might be arguable that it in fact would not prohibit the repeating of the ISO2022 string **for that set** at other places in the metafile, in addition to the beginning of the id string of BegMet.  But that would be a fairly useless thing to do -- re-declare the same set that's already in use.

It is my belief that CGMO intended that the only occurrence of ISO2022 switching would be at the very beginning of the metafile, at the start of the BegMet string.
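
To make that reading concrete, here is a rough Python sketch of the check it implies. It is not taken from the profile or from any CGM tool. The function name find_illegal_switches and the idea of passing the SF strings as a list of byte strings in metafile order (metafile id first) are assumptions for illustration only. It also assumes the strings are ISO Latin 1 or UTF-8, where an octet of 0x1B can only be ESC; UTF-16 would need a different scan. The only switch it accepts is the UTF-8 one quoted in Jeff's message.

```python
ESC = 0x1B

# Octets for <ESC 2/5 2/15 4/9>, the UTF-8 switch quoted in Jeff's message.
UTF8_SWITCH = bytes([0x1B, 0x25, 0x2F, 0x49])


def find_illegal_switches(sf_strings):
    """Return (string_index, byte_offset) for every ESC octet that is not
    part of a leading UTF-8 switch in the first string (the metafile id).

    Hypothetical helper: assumes Latin 1 or UTF-8 octets, not UTF-16.
    """
    problems = []
    for index, octets in enumerate(sf_strings):
        start = 0
        # Only the metafile id (index 0) may begin with the switch.
        if index == 0 and octets.startswith(UTF8_SWITCH):
            start = len(UTF8_SWITCH)
        for offset in range(start, len(octets)):
            if octets[offset] == ESC:
                problems.append((index, offset))
    return problems


greeting = "\u3053\u3093\u306b\u3061\u306f".encode("utf-8")

# Switch once, at the start of the metafile id: nothing flagged.
print(find_illegal_switches([UTF8_SWITCH + b"my metafile", greeting]))

# Switch in a later SF string (the screentip use case): flagged.
print(find_illegal_switches([b"my metafile", UTF8_SWITCH + greeting]))
```

Under this reading the second case, which is Jeff's screentip scenario, is the one that gets reported.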

Jeff, if you are unable to use the "Join this TC" button on the TC home page, I'll forward any reply of yours to the list, until we sort it out.

Regards,
-Lofton.


In clear text:
BegAPS "Eng_Ja_Howdy" "grobject" StList;
APSAttr 'name' "14 1 'Japanese Greetings'";
APSAttr "screentip" "14 1 '<ESC 2/5 2/15 4/9>...UTF-8 for JP-Greetings'"; %
%
APSAttr "region" "11 1 1            %-- typ:idx, cnt:1, val:1(rect)   --%
                  16 4              %-- typ:vdc, cnt:4, val:(4-pts)   --%
                  520 859 708 914"; %-- Points defining a rect        --%
BegAPSBody;
    RestrText 187 55 (520 859) FINAL
      "Hi";
EndAPS;

So APSAttr _type_ "screentip" is ISO Latin, the screentip itself is UTF-8, and the following APSAttr type "region" is ISO Latin.
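
For anyone not used to the column/row notation, the placeholder <ESC 2/5 2/15 4/9> in the screentip stands for four octets, where each c/r pair is the octet c*16 + r. A small Python sketch of that conversion follows. The helper name colrow_to_bytes is made up for illustration, and the sample greeting text is only a stand-in for the elided string in the example above.

```python
def colrow_to_bytes(notation):
    """Convert ISO 2022 column/row notation such as 'ESC 2/5 2/15 4/9'
    into the octets it denotes. Hypothetical helper for illustration."""
    out = bytearray()
    for token in notation.split():
        if token == "ESC":
            out.append(0x1B)  # ESC is position 1/11
        else:
            col, row = token.split("/")
            out.append(int(col) * 16 + int(row))
    return bytes(out)


switch = colrow_to_bytes("ESC 2/5 2/15 4/9")
print(switch.hex(" "))  # 1b 25 2f 49

# Stand-in screentip text: the switch followed by UTF-8 encoded Japanese.
screentip = switch + "\u3053\u3093\u306b\u3061\u306f".encode("utf-8")
print(len(screentip))
```

This only shows what octets the example would put in the string; whether such a mid-metafile switch is permitted is the question being asked.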

Thanks very much for your help with these two questions.

Best Regards,
Jeff Wolkenhauer

UGS The PLM Company        (503.466.9401)
                                       jeffrey.wolkenhauer@ugs.com
                                       15455 NW Greenbrier Pkwy, Suite 210 * Beaverton, OR 97006
--

