
Subject: Re: [office] Unicode relative spaces -- Was,proposal for new position and space attributes for the list level


I am not entirely sure as to what is being proposed here. First of all, 
there is no problem with using the unicode *-space characters in 
OpenDocument text content. It is of course the responsibility of the 
application displaying the document to render those correctly.

At what places in ODF exactly would you want to support horizontal 
measurements based on the font-size and what would be the 
reference-size? Are you going to implement this or is anyone else going to?

Best regards,

marbux wrote:
> On 2/19/07, Oliver-Rainer Wittmann - Software Engineer - Sun
> Microsystems <Oliver-Rainer.Wittmann@sun.com> wrote:
>> Ok. Thus, we arrive at the conclusion that we have more than one
>> view/opinion on my proposal - that's somehow natural. But, it doesn't
>> seem that anything is unclear in my proposal.
>> Let's vote on it.
> Might we delay a bit on voting? This also has ramifications for ODF
> 1.1 section 4.1.1 (Headings).
> <http://develop.opendocumentfellowship.org/spec/?page=4#4.1>. Also,
> I've been working on a proposal that might impact how horizontal
> spaces are handled in lists. I had hoped to put in another day on it,
> but y'all have forced my hand. :-)
> The ODF specification currently lacks support for the Unicode em-based
> relative typographic spaces. Adding such support to the specification
> is one of my goals. At the very least, I hope we might avoid defining
> lists in a way that interferes with later adoption of such support.
> ODF 1.1 section 16.1 currently provides in relevant part:
> ==============
> length
> A (positive or negative) physical length, consisting of magnitude and
> unit, in  conformance with §5.9.11 of [XSL:FO]. Supported units are
> „cm", „mm", „in", „pt" and „pc". Applications *shall* support all
> these units. Applications *may* also support "px" (pixel). Where the
> description of an attribute explicitly states that pixel lengths are
> supported, applications  should support them.
> Examples for valid lengths are "2.54cm" and "1in".
> ===============
> So we apparently have only absolute and margin percentage positions
> for horizontal measurements currently supported in the spec. That
> seems to me as though it could lead to issues in transformations.
> XSL:FO section calls for use of the em-based system as the
> unit of measurement for relative units, but despite the citation to
> that section in the ODF spec, ODF does not reflect XSL:FO in that
> regard. See <http://www.w3.org/TR/2006/PR-xsl11-20061006/#d0e5490>.
> That is a major wart in ODF from my perspective as a former
> typographer before a mid-life career change. The em-based scalable
> typographical widths are more than 500 years old and are incorporated
> in virtually all digital type faces in any human language.
> The em-based relative units of measurement are defined in the Unicode
> standard, <http://www.unicode.org/charts/PDF/U2000.pdf>, pg. 167. (But
> ignore the em and en quads at least for now; they are, I believe,
> obsolete.) There is a better compilation of them that adds some listed
> in other places here,
> <http://www.cs.tut.fi/~jkorpela/chars/spaces.html>. But as I
> understand it, the em, en, and the various thin spaces, although
> identified as Unicode characters, have to be implemented at the
> application level; they are not included as characters in digital
> typefaces in general usage since they are non-visible.
> Adding tags for the typographic spaces to the ODF and ODF applications
> repertoire should enhance ODF's compatibility not only with XSL:FO,
> but also with CSS. Use of the typographic spaces is recommended by the
> W3C Web Content Accessibility Guidelines Working Group in their best
> practices guide for CSS stylesheets.
> <http://www.w3.org/WAI/GL/css2em.htm>. That page recommends that type
> sizes and horizontal spaces be expressed in CSS stylesheets in ems
> (scalable) rather than using percentages or absolute measurements
> unless there is a specific requirement for non-scalable widths.
> Microsoft's take on implementing these spaces is here.
> <http://www.microsoft.com/typography/developers/fdsspec/spaces.htm>.
> The Unicode em, en, and thin space are also supported in HTML.
> <!ENTITY ensp    CDATA "&#8194;" -- en space, U+2002 ISOpub -->
> <!ENTITY emsp    CDATA "&#8195;" -- em space, U+2003 ISOpub -->
> <!ENTITY thinsp  CDATA "&#8201;" -- thin space, U+2009 ISOpub -->
> <http://www.w3.org/TR/html401/sgml/entities.html#h-24.4>.
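
The three entity declarations quoted above can be checked against Python's standard-library entity table. This is only a small sanity check of the code points and says nothing about ODF itself; the function name `code_points` is made up for the illustration.

```python
import html

# Named HTML entities for the em-based spaces, as quoted from the
# HTML 4.01 entity table in the message.
ENTITIES = ["&ensp;", "&emsp;", "&thinsp;"]


def code_points(entities):
    """Return {entity: 'U+XXXX'} using the standard-library entity table."""
    return {e: "U+%04X" % ord(html.unescape(e)) for e in entities}


if __name__ == "__main__":
    for entity, cp in code_points(ENTITIES).items():
        print(entity, cp)
```
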
> Here is a bit of explanation about the em-based typographic spaces in
> case you are not familiar with them. They are basically tools for
> horizontal alignment of visible characters. I was a typographer for
> some 20 years before a career change to the law so this is second
> nature to me. But the em system is simple to learn and work with.
> The em space is the square of the typeface's point size, exclusive of
> any variation in vertical leading or linespacing.  So the em in eight
> point type is eight points wide and the em in 24 point type is 24
> points wide. See XSL:FO: section 5.9.72,
> <http://www.w3.org/TR/xsl/#relative.lengths>.
> The en space is half the width of the em, so in eight point type would
> be 4 points wide and in 24 point type would be 12 points wide. The
> various thin space widths are calculated as fractions of the em,
> generally corresponding to the width of various punctuation marks. The
> Unicode spaces also include the "punctuation space," which corresponds
> to the width of the more common punctuation marks.
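
The em and en arithmetic described above can be written down as a short sketch. It assumes only what the message states (an em equals the point size, an en is half an em); the function names are hypothetical, and the thin-space fractions are left out because the message does not fix them.

```python
from fractions import Fraction


def em_width(point_size):
    """An em is as wide as the type's point size (in points)."""
    return Fraction(point_size)


def en_width(point_size):
    """An en is half the width of the em."""
    return em_width(point_size) / 2


if __name__ == "__main__":
    for size in (8, 24):
        print(f"{size} pt type: em = {em_width(size)} pt, en = {en_width(size)} pt")
```
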
> The typographic spaces scale when type sizes are changed, for example,
> when a person with visual disabilities uses an enlarged typeface for
> viewing an online web page. The em system should fit well with a rich
> file format designed with transformations of text to other formats in
> mind, since the spaces scale automatically if there is a change in the
> type's point size; i.e., there is no need to revisit horizontal white
> space specified in ems during transformations, unlike white space
> identified by fixed measurements.
> The first big trick to understanding the em system of measurements is
> to keep in mind that in type faces with variable widths, not all
> characters are variable in width. Many characters' widths are tied in
> a standardized way to the widths of the em, the en, and the various
> thin spaces. For example, in Latin 1 typefaces, the em dash and em
> leader are both one em wide. Numbers, currency symbols, the slashes,
> the plus and minus signs, the en dash, the left and right double
> quotes are all one en wide. The Unicode punctuation space character
> corresponds to the width of the typeface's period, comma, semicolon,
> exclamation point, and hyphen. The width of the em is unaffected by
> variations in the type face, e.g., the em stays 8 points wide in both
> 8 point Stymie light and 8 point Stymie extra bold.
> (There at least used to be some variability in typefaces on the width
> of the hyphen, with some typefaces making them 1/5 em wide and some
> 1/6 em wide. I haven't looked yet to see if that has been standardized
> more recently, although I suspect it has; witness HTML's support for
> only one size of thin space. The equivalencies I list above are
> examples only; the list is not comprehensive.)
> The second big trick to understanding the system is to recognize that
> the typographical spaces were expressly designed as a method of
> horizontally aligning text. Before the advent of digital tabs, tables,
> and the like, they were the standard method of aligning columnar
> matter within consecutive lines of justified type, except in the
> typewriter world where tab stops were used.
> So, for example, if you are setting 8 point type using an en-width
> bullet padded on the right with an en space, the indent for subsequent
> lines in the paragraph would be one em. (I'm ignoring the left
> indentation applied to all list items.) If the number were followed by
> a period, the subsequent lines would be one em space plus a
> punctuation space. If the type is reformatted as 12 point, the
> relative measurements still apply without resetting
> <text:space-before> or <text:space-after>.
> A more complex horizontal alignment issue is presented by multiple
> columnar matter. E.g.,
> [TextA]  $ 1,037       E    824
> [TextB]       17             15
> [TextC]       --             --
> [TextD]   18,046         14,074
> (The "E" is my substitute for the Euro currency symbol and I've made
> no attempt to provide actual currency conversion values.) Assuming the
> following tags:
> <justify-to-fill> = point where variable space is inserted to fill the line
> <emsp> = em space width
> <ensp> = en space width
> <punctsp> = thin space corresponding to width of a period (and comma) [1]
> <emdash> = a dash character one em wide
> The following example is how alignment of columnar matter had to be
> done in the hot type days of typography. Working with the above
> example of columnar matter, marking up to left align the left column
> and right align the center and right columns (and ignoring paragraph
> ending tags) we would have:
> [TextA]<justify-to-fill>$<emsp>1,037<emsp><emsp><emsp>E<emsp><punctsp>824
> [TextB]<justify-to-fill>17<emsp><emsp><emsp><emsp><punctsp>15
> [TextC]<justify-to-fill><emdash><emsp><emsp><emsp><emsp><punctsp><emdash>
> [TextD]<justify-to-fill>18,046<emsp><emsp><emsp><ensp>14,074
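
One way to read the markup above is as shorthand for existing Unicode characters. The sketch below assumes that reading; the tag names are the hypothetical ones from the message, not ODF elements, and <justify-to-fill> is left untouched because Unicode has no character for it.

```python
# Hypothetical tag names from the message, mapped to Unicode characters.
TAG_TO_CHAR = {
    "<emsp>": "\u2003",     # EM SPACE
    "<ensp>": "\u2002",     # EN SPACE
    "<punctsp>": "\u2008",  # PUNCTUATION SPACE
    "<emdash>": "\u2014",   # EM DASH
}


def expand_tags(line):
    """Replace each space/dash tag with its Unicode character.

    <justify-to-fill> has no Unicode equivalent and stays in place.
    """
    for tag, char in TAG_TO_CHAR.items():
        line = line.replace(tag, char)
    return line


if __name__ == "__main__":
    row = "[TextB]<justify-to-fill>17<emsp><emsp><emsp><emsp><punctsp>15"
    print(repr(expand_tags(row)))
```
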
> So with that loosening up of the mind muscles, :-) we can now turn to
> the problem of the numbered paragraph in ODF. For indented paragraphs
> the following marked up text would produce uniformly indented right
> aligned paragraph numbers and a left aligned first text character in
> each paragraph whether text is set to left justified or full
> justified:
> ======================
> <emsp><emsp>1.<emsp>Fourscore and seven years ago, our forefathers
> brought forth on this continent a new nation, conceived in liberty,
> and dedicated to the proposition that all ideals tend to degenerate in
> practice.
> ...
> <emsp><ensp>15.<emsp>Now is the time for all good men to come to the
> aid of the party of their choice.
> ...
> <emsp>142.<emsp>Ninety-nine bottles of beer on the wall, ninety-nine
> bottles of beer. Take one down and pass it around, ninety-eight
> bottles of beer on the wall.
> ======================
> And that relative spacing would survive cross-application
> transformation to other type sizes or faces.
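
The padding in the numbered-paragraph example follows a simple rule if every digit is one en wide, as the message states for Latin typefaces. A rough sketch under that assumption follows; `number_prefix` and its parameters are hypothetical names, and two ens are merged into one em to match the markup shown.

```python
EM = "<emsp>"
EN = "<ensp>"


def number_prefix(number, widest, base_ems=1):
    """Build the leading spaces so `number` right-aligns with `widest`.

    Assumes every digit is one en wide; each missing digit is padded
    with an en space, and pairs of ens are written as one em.
    """
    missing_ens = len(str(widest)) - len(str(number))
    if missing_ens < 0:
        raise ValueError("number is wider than widest")
    ems, ens = divmod(missing_ens, 2)
    return EM * (base_ems + ems) + EN * ens + f"{number}." + EM


if __name__ == "__main__":
    for n in (1, 15, 142):
        print(number_prefix(n, widest=142))
```
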
> Now consider what happens if tab stops are set by em-width positions
> rather than by absolute measurements or margin percentages.  Suddenly,
> you have lists whose indentations for subsequent lines survive
> transformation to other type sizes as well, without an algorithm for
> tweaking the tab settings.
> Beyond list issues, support for the Unicode typographic spaces in ODF
> applications would bring ODF into conformance with the major
> publishing stylesheets that require their use, e.g., in paragraph
> indentations, and separation of the em dash from surrounding text by
> hair thin spaces.
> It would also allow applications to offer more flexibility to users.
> For example, an "insert leaders to fill line" feature. Currently in
> most (all?) word processors, there is no way to create lines with dot
> leaders aligned and separated by a user-specified interval other than
> by setting aligning tab stops. E.g., if a user wanted to create
> fully justified lines such as the following (but with the dot leaders,
> currency symbol, and the right column properly aligned horizontally):
> Bolts .  .  .  .  .  .  .  .  . $     1.09
> Bolt cutters   .  .  .  .  .  .      15.99
> Surgical bolt removers  .  .  .  10,999.99
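
An "insert leaders to fill line" feature of the kind described could place dots on a fixed grid so that they line up from one line to the next. The sketch below is a monospaced approximation in character columns, not a description of any existing word processor; `leader_line`, `width` and `interval` are hypothetical names, and the currency symbol is omitted for brevity.

```python
def leader_line(label, amount, width=42, interval=3):
    """Fill the gap between label and amount with dots on a fixed grid.

    Dots are placed only at columns divisible by `interval`, so they
    line up vertically across lines of the same width.
    """
    gap_start = len(label) + 1          # keep one space after the label
    gap_end = width - len(amount) - 1   # keep one space before the amount
    if gap_end < gap_start:
        raise ValueError("label and amount do not fit in width")
    chars = [" "] * width
    chars[:len(label)] = label                # left-align the label
    chars[width - len(amount):] = amount      # right-align the amount
    for col in range(gap_start, gap_end):
        if col % interval == 0:
            chars[col] = "."
    return "".join(chars)


if __name__ == "__main__":
    for label, amount in [("Bolts", "1.09"),
                          ("Bolt cutters", "15.99"),
                          ("Surgical bolt removers", "10,999.99")]:
        print(leader_line(label, amount))
```

In proportional type the same grid would be expressed in fractions of an em rather than character columns.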
> Where it gets nasty in current word processors with no work-around is
> in automatically generated page indexes. Users are simply offered no
> option to specify leaders separated by uniform spaces with the dot
> leaders aligned. Instead, they get only an ugly, solid mass of dot
> leaders that visually overpower text, unseparated by spaces. E.g.,
> I.   Summary .............................................................    1
> II.  Ecma 376 Is Less Than Completely Open ...............................   17
>     A. About those 'compatible' but unspecified binary formats ..........   19
>     B. Miscellaneous Ecma 376 warts ..................................... 5082
> III. Poppa Sang Bass; Momma Played Fiddle ................................ 6040
> The difference is visually profound. What **can not** currently be
> done automatically is what is generally recognized as good
> typographical layout. What **can** currently be done automatically
> fails the final exam in Typographic Layout 101.
> I can not point to an application developer who currently would
> support the Unicode relative typographic spaces. But I hope that we
> might lay the groundwork for the future by adding tags for those
> spaces. Or we might at least try to avoid specification of lists that
> would interfere with such tags' later adoption. I'll leave it to those
> with greater understanding of the lists issues to determine whether
> either of the suggested approaches would create such a barrier.
> Also, I'll point out that implementing them in applications with users
> able to insert the various relative spaces manually could move word
> processors quite a bit closer to desktop publishing solutions'
> capabilities. And to me, it makes more sense to specify horizontal
> widths in relative units rather than absolute measurements.
> =======================================================
> More resources on the Unicode em-based relative spaces:
> <http://www.alanwood.net/unicode/general_punctuation.html>
> <http://www.cs.tut.fi/~jkorpela/chars/spaces.html>
> Unicode typographical spaces discussed:
> <http://www.unicode.org/versions/Unicode4.0.0/ch06.pdf>, pp. 154-155.
> =======================================================
> [1] In my opinion, the ODF specification also needs a
> <text:space-to-fill> tag to indicate the insertion point for space
> needed to justify a line left and right. Where two or more tags occur
> per line, the implementing application should divide the space to fill
> equally in the specified position. For example, take the common book
> header line that includes a page number, the chapter title, and the
> book title:
> ======================
> The One That Got Away          Trolling for Tags             17
> ======================
> The markup would be:
> ======================
> The One That Got Away<text:space-to-fill>Trolling for Tags<text:space-to-fill>17
> ======================
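
The equal division of fill space proposed in the footnote can be sketched as follows. This is a monospaced approximation in which widths are character counts; `fill_line` is a hypothetical name, and giving any indivisible remainder to the leftmost gaps is an assumption, since the footnote does not say how a remainder should be handled.

```python
def fill_line(segments, width):
    """Join segments, splitting the leftover space equally between them.

    Widths are character counts. Any remainder that cannot be divided
    evenly goes to the leftmost gaps, one column each (an assumption).
    """
    gaps = len(segments) - 1
    if gaps < 1:
        raise ValueError("need at least two segments")
    free = width - sum(len(s) for s in segments)
    if free < 0:
        raise ValueError("segments are wider than the line")
    share, extra = divmod(free, gaps)
    parts = []
    for i, segment in enumerate(segments[:-1]):
        parts.append(segment)
        parts.append(" " * (share + (1 if i < extra else 0)))
    parts.append(segments[-1])
    return "".join(parts)


if __name__ == "__main__":
    header = ["The One That Got Away", "Trolling for Tags", "17"]
    print(fill_line(header, width=64))
```
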
> WordPerfect has supported an equivalent tag at least since WP v. 5.1.
> I think the kind of contortions it presently takes for users to create
> an equivalent effect using ODF are well captured by the KWord
> documentation, section 11, which itemizes 35 steps for users to set up
> such a header for alternating pages with page numbers that remain on
> the outside margin. An Insert > Space to Fill menu option that inserts
> the proposed <text:space-to-fill> tag would dramatically simplify the
> process for users.
> Best regards,
> Marbux

Lars Oppermann <lars.oppermann@sun.com>               Sun Microsystems
Software Engineer                                         Nagelsweg 55
Phone: +49 40 23646 959                         20097 Hamburg, Germany
Fax:   +49 40 23646 550                  http://www.sun.com/staroffice
