Yes, with one day remaining, we can't complete this analysis and design a solution
without delaying the issuance of the reviewable materials for DITA 1.0.
Perhaps we could amplify in the language reference that the
<keywords> context is for semantic words that are helpful in knowing what
the topic is about. Any keyword-like markup in the content context should be
consistently used, either by providing distinctive elements or special
attributes, so that it will later be possible to determine from the markup
whether the semantic term being marked up should be (treated as though it is)
repeated in the <keywords> context.
This would permit us to acknowledge the existence of strategies that copy semantic
words into the <keywords> context and other similar contexts without
picking a specific strategy at this time.
The way the text below is written makes it sound as though some such copying might
be supported as a DITA feature in the future.
Best wishes,
Bruce Esrig
======
The <keywords> element contains a list of expressions (using indexterm or
keyword markup) that can be used by a search engine to select the topic.
Some DITA environments
include mechanisms that supplement the <keywords> element by copying
information from keyword-like elements in the body. Forward-looking
environments, whether or not they currently have this mechanism, may wish
to specify particular elements or special attributes that explicitly indicate
when such copying would be appropriate.
When DITA topics are output to XHTML, any
<keyword> or <indexterm> elements in the <keywords> element are placed in the Web page metadata. In
addition, any index terms in this context are also used for supported index processing (for example,
for print versions).
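A minimal sketch of the XHTML behavior described in the paragraph above, not the actual DITA toolset transform. The names keywords_meta, own_text, and SAMPLE_TOPIC are hypothetical, and flattening nested index terms into separate entries is an assumption made for illustration:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Abridged, hypothetical topic used only to exercise the sketch.
SAMPLE_TOPIC = """<concept id="snowShovel-concept">
  <title>Snow shovel</title>
  <prolog><metadata><keywords>
    <keyword>snow</keyword>
    <keyword>tool</keyword>
    <indexterm>tools<indexterm>snow shovel</indexterm></indexterm>
  </keywords></metadata></prolog>
  <conbody><p>A snow shovel clears snow.</p></conbody>
</concept>"""


def own_text(elem):
    """Return the element's leading text, whitespace-normalized (child text excluded)."""
    return " ".join((elem.text or "").split())


def keywords_meta(topic_xml):
    """Build an XHTML keywords <meta> tag from the topic's <keywords> element."""
    root = ET.fromstring(topic_xml)
    terms = []
    for keywords in root.iter("keywords"):
        for elem in keywords.iter():
            # Assumption: nested index terms are flattened into separate entries.
            if elem.tag in ("keyword", "indexterm"):
                text = own_text(elem)
                if text and text not in terms:
                    terms.append(text)
    return '<meta name="keywords" content=%s />' % quoteattr(", ".join(terms))


print(keywords_meta(SAMPLE_TOPIC))
```

Only elements inside <keywords> are read here; keyword-like elements in the body are deliberately left alone, which matches the text as written rather than the copying mechanisms it mentions.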
Hi, Esteemed DITA Committee-folk:
The discussion here (good thread)
seems to point up the problem as being that the <keyword> element in the
prolog really only captures the reference half of the semantic words. There is
also the conceptual half of the semantic words, which DITA marks up within
discourse using the <term> element. So, a possible conclusion from the
thread would seem to be that we should add the <term> element to the
prolog.
That way, you can supply semantic words either inline within
discourse or, if they apply to the topic as a whole, in the prolog. If you
need to indicate that a semantic word comes from a specific vocabulary, you
can create a specialization of the <term> or <keyword> element and
use that specialized element everywhere to represent the same semantic. For
instance, an author can identify a specialized <retailterm> or
<sqlword> as such in either discourse or the prolog.
In other words, we continue in the direction of providing elements based on
semantics rather than use. After all, an application could legitimately choose to
display a <keyword> or <term> from the prolog in formatted output.
For instance, some uses of DITA might benefit from emitting the semantic words
in a running head to flag the topic.
Facing the grim light of day,
however, I don't think we want to hold up DITA 1.0 over adding the
<term> element to the prolog -- after all, there will always be more
enhancements to make. We've deferred a number of enhancement proposals for the
sake of getting the initial release out for the benefit of the community. In
fact, while they're on the table, both <term> and <keyword> might
benefit from an href attribute so they can point at the authoritative
definition for the semantic word (or perhaps for the semantic vocabulary?),
but consideration of that enhancement is best deferred for post-DITA 1.0.
Hoping that's useful,
Erik Hennum
ehennum@us.ibm.com
Dana Spradley <dana.spradley@oracle.com>
03/09/2005 04:43 PM
I agree wholeheartedly, Bruce.
As here: we've fallen
into ambiguity on several fronts that resulted in some confusion.
I think we need to restrict <keyword> to the DocBook sense: a word that,
if you searched on it, you'd be happy to find this topic. To be used either
inline if you like - and subclassed if you like that - or as metadata in the
<keywords> element.
As for indicating that something is a keyword
in the technical sense in a programming language - and not a keyword of this
topic itself - so that it can be formatted differently on output or otherwise
processed semantically for some reason, <kwd> would seem to be the
likeliest candidate - with "Compare" references between it and <keyword>
a necessity.
--Dana
Esrig, Bruce (Bruce) wrote:
1. Yes, just so. We have wider and wider scopes of application for the markup
language. Syntax diagrams; language that appears in programs; language, controls,
and other screen phenomena that appear in user interfaces; and conceptual language
that governs how our audiences think about the work that they perform.
2. We have many ways of marking up identifiers and other expressions. There are
significant differences in purpose among them, and in some places, those purposes
interact. So we need to discuss those different purposes explicitly and define and
organize the markup in a way that reflects those different purposes.
If we do not explicitly design around the purposes, we instead reason about the
effect that we can achieve with each element. The clashes among various
combinations of effects are leading to the clashes about how to use the markup,
and what to allow and disallow.
3. As a result of the examination done in item 2, we would have specific elements,
as we do now, plus a design that explains why those elements cover our needs, and
that systematizes any future structural specializations and any proposals for
similar new elements.
Best wishes,
Bruce
-----Original Message-----
From: Dana Spradley [mailto:dana.spradley@oracle.com]
Sent: Wednesday, March 09, 2005 5:52 PM
To: Esrig, Bruce (Bruce)
Cc: Don Day; Paul Prescod; dita@lists.oasis-open.org; JoAnn Hackos
Subject: Re: [dita] Keywords in DITA (example)
Hi Bruce--
Okay, I think I understand a little better now.
What you are basically asking for is that the scope of the
<kwd> element be expanded beyond <syntaxdiagram> elements,
so that it can be used to trigger different formatting of programming
keywords on output - and so that writers aren't tempted to use
<keyword> just to mark something as a programming
keyword.
Or am I still missing something?
--Dana
Esrig, Bruce (Bruce) wrote:
Hi Dana,
The distinctions were based
on a use case argument that I might not have made
explicit.
Content-related functions to be supported:
- make an identifier stand out (userinput)
- indicate that an identifier has an associated definition (term)
- indicate that an identifier has a pre-determined, fixed interpretation (wintitle)
Description-related functions to be supported:
- provide good search targets that permit a topic to be found (keyword)
- provide good index entries that permit a topic to be found (indexterm)
Applications:
- general descriptive text about products, services, and processes
- documentation of concept-rich information such as software
In the example, in the concept "Snow shovel", the
descriptive keyword "tool" does not appear in the content but
would make a good search target. Reversing the situation: the
content term "scoop" does appear in the content but would not
make a good search target.
Even in documenting a
programming language or a software application, there is a need
for markup for content keywords such as "then" or "OK" which
are usually not the target of a search or an index lookup. If
these get swept up and made into descriptive keywords by an
automatic process, then the value of the results is
reduced.
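A rough sketch of the distinction drawn above, assuming a processor that keeps description-related and content-related elements in separate sets. The set names, the marked_text helper, and the abridged SAMPLE_CONCEPT are hypothetical; the grouping follows the two function lists in this message, not any DITA specification:

```python
import xml.etree.ElementTree as ET

# Assumed grouping, taken from the two function lists in this message.
DESCRIPTION_ELEMENTS = {"keyword", "indexterm"}
CONTENT_ELEMENTS = {"userinput", "term", "wintitle"}

# Abridged, hypothetical version of the "Snow shovel" concept.
SAMPLE_CONCEPT = """<concept id="snowShovel-concept">
  <title>Snow shovel</title>
  <prolog><metadata><keywords>
    <keyword>snow</keyword>
    <keyword>tool</keyword>
  </keywords></metadata></prolog>
  <conbody><p>A good snow shovel has a straight, wide <term>scoop</term>
    and a strong <term>handle</term>.</p></conbody>
</concept>"""


def marked_text(topic_xml, element_names):
    """Return the sorted, de-duplicated text of every element named in element_names."""
    root = ET.fromstring(topic_xml)
    found = set()
    for elem in root.iter():
        if elem.tag in element_names:
            # Collapse line breaks and runs of spaces inside the marked-up text.
            text = " ".join("".join(elem.itertext()).split())
            if text:
                found.add(text)
    return sorted(found)


search_targets = marked_text(SAMPLE_CONCEPT, DESCRIPTION_ELEMENTS)
content_identifiers = marked_text(SAMPLE_CONCEPT, CONTENT_ELEMENTS)
print(search_targets)
print(content_identifiers)
```

With the two sets kept apart, "tool" is offered as a search target even though it never appears in the body, while "scoop" stays a content identifier and is not swept into the search terms.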
Best wishes,
Bruce
-----Original Message-----
From: Dana Spradley [mailto:dana.spradley@oracle.com]
Sent: Wednesday, March 09, 2005 3:41 PM
To: JoAnn Hackos
Cc: Esrig, Bruce (Bruce); Don Day; Paul Prescod; dita@lists.oasis-open.org
Subject: Re: [dita] Keywords in DITA (example)
According to the spec, JoAnn - which I also
am fairly new to - a <term> will link to its definition
in the glossary in some future DITA development. (I just
looked it up myself)
Similarly, in line with Don's comments
I don't see why inline <keyword> elements shouldn't be
mined to populate the web page metadata on output, in some
future development of the DITA toolset - if desired. Not
everyone wants to make a second labor of putting them in the
<keywords> element.
I think the confusion that
has sparked this discussion comes from insisting too narrowly
on the difference between a keyword in the technical
programming sense ("keyword-of-content") and a keyword in the
metadata sense ("keyword-of-description"). While a difference
certainly exists - it evaporates fairly quickly when we're
talking about *documenting* a programming language.
For then, the keywords of the language are the keywords of your
descriptions and the keywords you want in your metadata,
right? So if someone is searching for information on how to
use this keyword in a programming sense, they'll find the
pages that describe it.
The definition of <keyword>
in the spec, though, should be tuned up a bit, to avoid (or
deal with) this potential confusion.
--Dana
JoAnn Hackos wrote:
I don't understand the <term> element. Can that now be used? Is there processing
available?
JoAnn
JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
303-232-7586
joann.hackos@comtech-serv.com
http://www.comtech-serv.com
------------------------------------------------------------------------
From: Esrig, Bruce (Bruce) [mailto:esrig@lucent.com]
Sent: Wednesday, March 09, 2005 12:40 PM
To: JoAnn Hackos; Don Day; Paul Prescod
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] Keywords in DITA (example)
Yes, a good example would help ... anyone want to buy a snow shovel (see example)?
Here goes,
Bruce
=============
First the hierarchy.
Keyword archetype.
- keyword-as-description
  o existing keyword element, in the sense defined by HTML/Docbook
  o existing indexterm element
- keyword-as-content
  o wintitle
  o widgettype
  o widgetname
Now some markup.
<task id="snowShovelInventory-task" xml:lang="en-us">
  <title>Checking the inventory of snow shovels</title>
  <taskbody>
    <context><p>Check whether snow shovels are available before running out
    to buy one.</p></context>
    <steps>
      <step><cmd>Access the <wintitle>Inventory Window</wintitle>.</cmd></step>
      <step><cmd>Enter <userinput>snow shovel</userinput> in the
      <widgetname>Item</widgetname> field and click on
      <widgetname>Update</widgetname>.</cmd></step>
      <step><cmd>Look at the <widgetname>Count</widgetname> field to find out
      how many snow shovels are in stock.</cmd></step>
    </steps>
  </taskbody>
  <related-links>
    <link href="../concepts/snowShovel.xml" format="xml" type="concept">
      <linktext>Snow shovel</linktext></link>
    <link href="../reference/inventoryWindow.xml" format="xml" type="reference">
      <linktext>Inventory window</linktext></link>
  </related-links>
</task>
<concept id="snowShovel-concept" xml:lang="en-us">
  <title>Snow shovel</title>
  <prolog><metadata>
    <keywords>
      <keyword>snow</keyword>
      <keyword>tool</keyword>
      <indexterm>tools
        <indexterm>snow shovel</indexterm></indexterm>
    </keywords></metadata></prolog>
  <conbody>
    <p>A <indexterm>snow shovel</indexterm> snow shovel is used to clear the
    driveway and sidewalk of snow in the winter. A good snow shovel has a
    straight, wide <term>scoop</term> and a strong <term>handle</term>. To help
    snow come off the scoop, spray the scoop with cooking spray.</p>
  </conbody>
  <related-links>
    <link href="../tasks/snowShovelInventory.xml" format="xml" type="task">
      <linktext>Snow shovel inventory</linktext></link>
  </related-links>
</concept>
<reference id="inventoryWindow-reference" xml:lang="en-us">
  <title>Inventory window</title>
  <prolog><metadata>
    <keywords>
      <indexterm>windows
        <indexterm>Inventory Window</indexterm></indexterm>
      <indexterm>fields
        <indexterm>Item</indexterm></indexterm>
      <indexterm>fields
        <indexterm>Count</indexterm></indexterm>
    </keywords></metadata></prolog>
  <refbody><refsyn>
    <p>The <wintitle>Inventory Window</wintitle> provides a count of items
    given the name of the item. The <widgetname>Item</widgetname> field
    contains the name of an item. The <widgetname>Count</widgetname> field
    states how many items with that name are in stock according to the
    inventory records.</p>
  </refsyn></refbody>
</reference>
-----Original Message-----
From: JoAnn Hackos [mailto:joann.hackos@comtech-serv.com]
Sent: Wednesday, March 09, 2005 10:45 AM
To: Esrig, Bruce (Bruce); Don Day; Paul Prescod
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] Keywords in DITA
I wonder if Bruce could provide an
example of the distinction between keyword as
description and keyword as content. I'm not
certain I understand how they are being distinguished
from this explanation.
JoAnn
JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
303-232-7586
joann.hackos@comtech-serv.com
http://www.comtech-serv.com
------------------------------------------------------------------------
From: Esrig, Bruce (Bruce) [mailto:esrig@lucent.com]
Sent: Wednesday, March 09, 2005 1:13 AM
To: 'Don Day'; Paul Prescod
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] Keywords in DITA
A future release of the architecture
should provide language to distinguish between
two specializations of the archetype: a
description (keyword-as-description) and an
object (keyword-of-content). This would
provide a clear distinction that would cue
authors and processing about the difference in
purpose. Not all field names
or function names make good search terms, so
including all such identifiers among the candidate
targets for search impairs the specificity
that users want when they search. This means
there is a benefit to being able to mark up
identifiers for special presentation in output
(keyword-of-content) without including them
among search terms.
Indexterm is really a special case of
keyword-as-description. Keyword-as-description
could be permitted in content to permit
authors to identify text that is to be used as a search
target. It is often convenient to mark up
identifying text on first occurrence.
The clash comes when an author wants both
usages simultaneously: keyword-as-description
to indicate a search target and
keyword-of-content because the identifier is a special
identifier in the content. The
keyword-of-content usage must take precedence.
In order to accommodate the keyword-as-description usage,
the author could choose to write some
descriptive text to hold the
keyword-as-description usage, or else place a
keyword-as-description entry in a metadata context.
Although it is tempting to use the
unspecialized markup in these cases, there is
still the question of whether to trigger an index entry,
so a third specialization of the archetype
(keyword-desc-and-content) may be
needed.
Best wishes,
Bruce Esrig
-----Original Message-----
From: Don Day [mailto:dond@us.ibm.com]
Sent: Tuesday, March 08, 2005 11:13 PM
To: Paul Prescod
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] Keywords in DITA
I buy it in
the strict sense, Paul, but life can be so darned
non-linear. How about this
scenario:
As a content
owner, I created a domain for marking up both
widgettype and widgetname words in my product
descriptions, both specialized
from keyword. Authors have generally used
these elements to tag names and types throughout
the content. Later, I run a
consolidation tool against my content to
retrieve all elements based on keyword, create a
single copy of each unique
element/value, and put these into the keywords
metadata of the topic as a pre-processed
pool that I intend to use as
search keys. Domain substitution means that the
keywords element can contain keyword as
well as the elements specialized
from it--widgetname and widgettype. Although your
definitions might differentiate the name
as being "API-like" and the type
as metadata, yet both are here, based on the same
element, in both content and metadata
contexts. From my point of view
as a user, there is no need for too fine-grained a
definitional distinction because my domain
specialization and my subsequent
use of the elements in both contexts effectively
makes the distinction moot--the
specialized elements are
describing my product semantically and are providing
the consistent search/relevance
behavior I desired.
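A minimal sketch of the consolidation step in this scenario, not an existing DITA tool. It assumes the element names specialized from keyword are known up front (KEYWORD_BASED); a real processor would more likely inspect each element's DITA class attribute. The names consolidate_keywords, value_of, and SAMPLE_TASK are hypothetical:

```python
import xml.etree.ElementTree as ET

# Assumption: keyword plus the two specializations named in the scenario.
KEYWORD_BASED = {"keyword", "widgetname", "widgettype"}

# Abridged, hypothetical task topic used only to exercise the sketch.
SAMPLE_TASK = """<task id="snowShovelInventory-task">
  <title>Checking the inventory of snow shovels</title>
  <taskbody><steps>
    <step><cmd>Enter <userinput>snow shovel</userinput> in the
      <widgetname>Item</widgetname> field and click on
      <widgetname>Update</widgetname>.</cmd></step>
    <step><cmd>Look at the <widgetname>Count</widgetname> field, then
      clear the <widgetname>Item</widgetname> field.</cmd></step>
  </steps></taskbody>
</task>"""


def value_of(elem):
    """Return the element's full text content with whitespace collapsed."""
    return " ".join("".join(elem.itertext()).split())


def consolidate_keywords(topic_xml):
    """Copy one instance of each unique keyword-based element/value into <keywords>."""
    root = ET.fromstring(topic_xml)
    body = next((child for child in root if child.tag.endswith("body")), None)
    if body is None:
        return root
    # Find or create prolog/metadata/keywords, keeping the prolog ahead of the body.
    prolog = root.find("prolog")
    if prolog is None:
        prolog = ET.Element("prolog")
        root.insert(list(root).index(body), prolog)
    metadata = prolog.find("metadata")
    if metadata is None:
        metadata = ET.SubElement(prolog, "metadata")
    keywords = metadata.find("keywords")
    if keywords is None:
        keywords = ET.SubElement(metadata, "keywords")
    # Entries already present in the metadata are not duplicated.
    seen = {(entry.tag, value_of(entry)) for entry in keywords}
    for elem in body.iter():
        if elem.tag not in KEYWORD_BASED:
            continue
        key = (elem.tag, value_of(elem))
        if key[1] and key not in seen:
            seen.add(key)
            ET.SubElement(keywords, elem.tag).text = key[1]
    return root


consolidated = consolidate_keywords(SAMPLE_TASK)
pool = [(e.tag, e.text) for e in consolidated.find("prolog/metadata/keywords")]
print(pool)
```

Because the copies keep their specialized element names, the pool in the metadata still records whether a value was tagged as a widgetname or a widgettype, which is the property the scenario relies on.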
My real
world experience bets that most authors will be
inconsistent about what they mark up as
keyword in the metadata vs in
the content. Thus jaded, I'm back to the
suggestion of keeping the description high level.
keyword is just an
archetype--the significant distinctions come when it
is specialized to clearly indicate what it
is for.
Regards,
--
Don Day <dond@us.ibm.com>
Chair, OASIS DITA Technical Committee
IBM Lead DITA Architect
11501 Burnet Rd., MS 9037D018, Austin TX 78758
Ph. 512-838-8550 (T/L 678-8550)
"Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?"
--T.S. Eliot
"Paul Prescod" <paul.prescod@blastradius.com>
03/08/2005 07:45 PM
To: "JoAnn Hackos" <joann.hackos@comtech-serv.com>, Don Day/Austin/IBM@IBMUS
cc: <dita@lists.oasis-open.org>
Subject: RE: [dita] Keywords in DITA
Okay, an emerging
consensus seems to be that <keyword> in
<keywords> means <keyword> in the
HTML/Docbook sense
(http://www.docbook.org/tdg/en/html/keyword.html). It is typically
hidden from the user as metadata and embedded in the
HTML meta tag.
<keyword> in other contexts is more like a word
from an API or
language.
Should we
just document it that way? If so, I can suggest some
wordings.
Paul Prescod