OASIS Mailing List Archives

docbook-tc message



Subject: Re: [docbook-tc] CALS+HTML Table Model


On Thu, Jan 30, 2003 at 11:02:12AM -0600, Paul Grosso wrote:
> Unfortunately, I will have to miss tomorrow's rescheduled
> docbook tc call due to a family emergency.
> 
> The key thing on the agenda for which I might have input
> is the question of tables.  As Norm's noted in the agenda,
> I've provided my input.  I still think the merged model is
> a good plan, and given that I will have to miss the telcon,
> I'm taking the opportunity to recap my thoughts here.
> 
> While it is true that including a "merged table model" in the
> DTD would mean that some semantically invalid tables would pass
> DTD validation, this argument doesn't hold weight for me.  There 
> are lots of semantically invalid tables the DTD allows right now 
> if you consider all the semantic constraints on the various 
> elements and attribute values.  For example, it's easy to say 
> you have a 4 column table and then put in 5 entries.  And it's 
> easy to give the colwidth a completely invalid value.  We all 
> know that the DTD cannot guarantee a valid table already.  And 
> it is clear that the most practical way to create a table is 
> to use a tool that goes way beyond DTD constraints to ensure 
> the creation of valid tables.
> 
> The key reason for allowing a document to mix tables is
> that there are tools that create valid HTML tables and valid
> CALS tables so a user is not unlikely to have some of each.
> It seems we would be doing the user community a service if
> we allow them to include the tables that they already have in 
> their DocBook documents.
> 
> It is true that HTML tables whose table cell contents include
> HTML element markup couldn't be incorporated directly into a
> DocBook document without some modification.  However, I would
> guess that over 80% of all such tables do not contain internal
> markup, and for the 20% that do, it is much simpler to change
> a few <p>'s to <para>'s or whatever than it is to convert the
> entire table structure from HTML to CALS.
> 
> In summary, I think including a merged table model provides
> more user benefits than disadvantages, and I think we would 
> be doing the DocBook user community a service to do this.

Well, I was a bit skeptical that only 20% of HTML tables
contain other markup.  So I ran a little Perl program over
all the HTML doc files installed under /usr/share/doc on my
Linux system.  Here are the numbers.

I found 7,116 HTML files that contained 32,443 tables.
2,531 of those tables (8%) contained <H1> through <H5>
headings, so I eliminated those on the assumption they were
page layout tables and would not be converted to DocBook.

Of the remaining 29,912 tables, 28,517 (95%) had
HTML markup other than table tags.  Here are the
most popular tags:

90,655  <A>
13,133  <IMG>
11,433  <B>
10,103  <SMALL>
 9,645  <STRONG>
 9,362  <FONT>
 8,912  <UNDERLINE>
 5,568  <BR>
 3,205  <DIV>
 2,608  <I>

I'll admit that this kind of sampling can't possibly
be applied to all situations, but it does give an indication
that HTML markup is pretty common in tables.
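
For anyone who wants to repeat this kind of count on their own
system, here is a rough sketch in Python.  It is not the Perl
program used for the numbers above, just an approximation of
the same steps.  The names survey and survey_tree, the list of
"table tags", and the simple regex matching (which does not
handle nested tables properly) are all assumptions made for
illustration, so the counts it produces may differ somewhat.

```python
import re
from collections import Counter
from pathlib import Path

# Non-greedy match of one <table>...</table> span. Nested tables are not
# handled properly (assumption: acceptable for a rough count).
TABLE_RE = re.compile(r"<table\b.*?</table\s*>", re.IGNORECASE | re.DOTALL)
# Opening tags only, e.g. <A HREF="..."> yields "A"; closing tags are skipped.
TAG_RE = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b")
# Tables containing <H1> through <H5> are treated as page layout tables.
HEADING_RE = re.compile(r"<h[1-5]\b", re.IGNORECASE)
# Tags considered part of the table structure itself (assumed list).
TABLE_TAGS = {"TABLE", "CAPTION", "COLGROUP", "COL", "THEAD",
              "TBODY", "TFOOT", "TR", "TH", "TD"}


def survey(html):
    """Return (total, layout, with_markup, tag_counts) for one HTML string."""
    total = layout = with_markup = 0
    tags = Counter()
    for match in TABLE_RE.finditer(html):
        total += 1
        table = match.group(0)
        if HEADING_RE.search(table):
            layout += 1          # assumed to be a page layout table
            continue
        found = [t.upper() for t in TAG_RE.findall(table)
                 if t.upper() not in TABLE_TAGS]
        if found:
            with_markup += 1
            tags.update(found)
    return total, layout, with_markup, tags


def survey_tree(root):
    """Apply survey() to every *.htm / *.html file under root and sum up."""
    total = layout = with_markup = 0
    tags = Counter()
    for path in Path(root).rglob("*.htm*"):
        if not path.is_file():
            continue
        t, l, w, c = survey(path.read_text(errors="replace"))
        total += t
        layout += l
        with_markup += w
        tags.update(c)
    return total, layout, with_markup, tags
```

Calling survey_tree("/usr/share/doc") returns the total number
of tables, the number set aside as layout tables, the number of
remaining tables with non-table markup, and a Counter whose
most_common(10) method lists the most popular tags.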

-- 

Bob Stayton                                 400 Encinal Street
Publications Architect                      Santa Cruz, CA  95060
Technical Publications                      voice: (831) 427-7796
The SCO Group                               fax:   (831) 429-1887
                                            email: bobs@sco.com

