
office message


Subject: Index Marks

Hello all,

one of my action items was to look at the different types of index 
marks, and to see whether they could be combined.

There are four types of index marks in the base spec:
1) bibliography
2) toc (table-of-content)
3) alphabetical-index
4) user-defined

To make things short, user-defined indices are a kind of souped-up table 
of content, but the others are different in functionality.

Bibliography marks are conceptually different from all other index 
marks: They contain data for a bibliographic reference. They don't mark 
arbitrary text, but rather display a suitable (and configurable) 
identifier for their bibliographic reference. In many ways, they are 
more like fields, with the specific function that an index over all 
'fields' of this type can be constructed.

A table-of-content lists content elements in document order. This 
usually contains all headers in the document, but it can also (or 
instead) include regions of text marked for toc-inclusion. This is what 
toc-marks are used for. The index contains toc elements in document 
order, and there is a one-to-one relationship between toc elements and 
toc entries.
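
As a rough illustration of the document-order rule, here is a minimal 
Python sketch. The TocMark class, its position field and build_toc are 
hypothetical names chosen for this example; none of them come from the 
spec.

```python
from dataclasses import dataclass


@dataclass
class TocMark:
    position: int       # hypothetical: offset of the mark in the document
    text: str           # text shown in the toc entry
    outline_level: int  # level at which the entry appears in the index


def build_toc(marks):
    """Return exactly one (level, text) entry per mark, in document order."""
    ordered = sorted(marks, key=lambda m: m.position)
    return [(m.outline_level, m.text) for m in ordered]


marks = [
    TocMark(position=120, text="Details", outline_level=2),
    TocMark(position=10, text="Introduction", outline_level=1),
]
toc = build_toc(marks)
```

Each mark yields exactly one entry, so the toc always has as many 
entries as there are toc elements.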

An alphabetical index lists content elements in alphabetical order. 
Several such elements can be grouped together. A group of elements can 
have a main entry, and group members can be discerned by secondary 
keywords. To support languages where alphabetical sorting is impractical 
(due to rather large alphabets), the index values and keywords can be 
given in a phonetic spelling, which is then used to sort the entries.
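
The grouping and phonetic sorting described above could look roughly 
like the following Python sketch. All names (AlphaMark, page, 
build_alphabetical_index) are hypothetical, only one keyword level is 
modelled, and listing the main entry first among identical entries is 
an assumption about presentation, not something the spec prescribes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlphaMark:
    value: str                            # index text
    page: int                             # hypothetical: where the mark occurs
    key1: Optional[str] = None            # keyword used for grouping
    value_phonetic: Optional[str] = None  # phonetic spelling of value
    key1_phonetic: Optional[str] = None   # phonetic spelling of key1
    main_entry: bool = False


def build_alphabetical_index(marks):
    """Merge identical entries and sort them, preferring phonetic spellings."""
    entries = {}
    for m in marks:
        if m.key1:
            top = m.key1_phonetic or m.key1
            sub = m.value_phonetic or m.value
        else:
            top = m.value_phonetic or m.value
            sub = ""
        order = (top.casefold(), sub.casefold())
        entry = entries.setdefault((m.key1, m.value), {"order": order, "pages": []})
        if m.main_entry:
            entry["pages"].insert(0, m.page)  # assumption: main entry goes first
        else:
            entry["pages"].append(m.page)
    ranked = sorted(entries.items(), key=lambda kv: kv[1]["order"])
    return [(key1, value, e["pages"]) for (key1, value), e in ranked]


marks = [
    AlphaMark("zebra", 7, key1="animals"),
    AlphaMark("index", 3),
    AlphaMark("index", 9, main_entry=True),
    AlphaMark("東京", 4, value_phonetic="toukyou"),
    AlphaMark("ant", 2, key1="animals"),
]
index = build_alphabetical_index(marks)
```

Here the entry without a usable alphabetical form is placed according 
to its phonetic spelling, and the two "index" marks collapse into one 
entry.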

A user-defined index is mostly like a table-of-content, except there can 
be several of them (with distinct markers), and the indices (not the 
index marks) have some options that aren't traditionally used with tocs. 
Essentially, one could consider a toc as a special user-defined index.

Common to TOC/user/alphabetical marks:
- empty stand-alone elements, or -start/-end pairs
- text:id             [match -start and -end elements]
- text:string-value   [for empty elements, if index text differs from marked text]
   text:string-value-phonetic  [string-value, for phonetic sorting]

Specific attributes for TOC/user/alphabetical:

TOC:
- text:outline-level  [the level at which this entry appears in the index]

User-defined:
- text:outline-level  [like toc]
- text:index-name     [there can be several user-defined indices]

Alphabetical:
- text:key1           [keyword]
   text:key1-phonetic  [key1, for phonetic sorting]
- text:key2           [supplementary keyword]
   text:key2-phonetic  [key2, for phonetic sorting]
- text:main-entry     [one of several identical entries can be declared
                        main entry]
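
To make the overlap between the three mark types easier to see, the 
attribute lists above can be written down as a small Python data model. 
This is only a sketch: the class names and all default values are 
assumptions made for illustration, and only the attribute names come 
from the lists above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexMark:
    """Attributes common to toc, user-defined and alphabetical marks."""
    id: Optional[str] = None                     # text:id, matches -start/-end
    string_value: Optional[str] = None           # text:string-value
    string_value_phonetic: Optional[str] = None  # text:string-value-phonetic


@dataclass
class TocMark(IndexMark):
    outline_level: int = 1                       # text:outline-level


@dataclass
class UserIndexMark(IndexMark):
    outline_level: int = 1                       # text:outline-level
    index_name: str = ""                         # text:index-name


@dataclass
class AlphabeticalIndexMark(IndexMark):
    key1: Optional[str] = None                   # text:key1
    key1_phonetic: Optional[str] = None          # text:key1-phonetic
    key2: Optional[str] = None                   # text:key2
    key2_phonetic: Optional[str] = None          # text:key2-phonetic
    main_entry: bool = False                     # text:main-entry


mark = UserIndexMark(string_value="Figure 1", outline_level=2, index_name="figures")
```

In this model the toc mark and the user-defined mark differ by a single 
field, while the alphabetical mark shares nothing beyond the common base.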

So, combine or not combine? Well, one could combine them 'by force', 
i.e. make a single element with a type attribute and then make the other 
attributes type-dependent. This doesn't really make sense to me. What 
could make sense is to combine the toc mark and the user-defined mark. 
The only problem I have with that is that a table-of-content is 
something I would consider to have semantics, while a user-defined index 
doesn't. So the trade-off appears to be a somewhat more elegant format 
definition vs. somewhat more semantic content.
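
One way to picture the toc/user-defined merge is the following Python 
sketch, in which a toc mark is simply a mark whose index name has a 
reserved value. The CombinedMark class and the reserved name "toc" are 
hypothetical and not taken from the spec.

```python
from dataclasses import dataclass

TOC_INDEX_NAME = "toc"  # hypothetical reserved name, not taken from the spec


@dataclass
class CombinedMark:
    string_value: str
    outline_level: int = 1
    index_name: str = TOC_INDEX_NAME  # defaults to the table of content


def marks_for_index(marks, index_name):
    """Select the marks that feed one particular index."""
    return [m for m in marks if m.index_name == index_name]


marks = [
    CombinedMark("Introduction"),
    CombinedMark("Figure 1", index_name="figures"),
    CombinedMark("Details", outline_level=2),
]
toc_marks = marks_for_index(marks, TOC_INDEX_NAME)
figure_marks = marks_for_index(marks, "figures")
```

The format gets simpler, but the toc semantics are reduced to a naming 
convention, which is the loss of semantic content mentioned above.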


