dita message
Subject: RE: [dita] Proposed revision for the keyword definition (was Keywords in DITA)
- From: Michael Priestley <mpriestl@ca.ibm.com>
- To: "Paul Prescod" <paul.prescod@blastradius.com>
- Date: Mon, 14 Mar 2005 20:29:37 -0500
I agree that we should vote, but I will
add a few more cents to the discussion. I will expand on a very important
difference between <keywords> and <keyword> in DITA.
The semantic "keywords for the
document" is associated with the <keywords> element in the prolog,
and is clearly documented there. The <keywords> element in the prolog
can contain <keyword>, <apiname>, <indexterm>, and many
other things. Erik has suggested it should also be allowed to contain <term>,
and I buy that for post-1.0.
<keyword> in DITA has absolutely
no semantic significance. It just marks a noun or noun phrase that is of
some, as yet unknown, special significance. The specializations give it
that significance. In the absence of a specialization, it just marks some
kind of significant word. It is the word or noun-phrase equivalent of <ph>
for sentences, or <section> for divisions within a topic body, or
<topic> for entire topics. See the topic "Limits of specialization",
subtopic "Specialize from generic elements", in the architectural
spec.
I can wish that we had chosen a different
label for the element, but it's too late for that. If someone marks up
<keyword>assert</keyword> they will not get monospace; they
might have better luck with <codeph>, <parmname>, or <cmdname>,
all of which have an associated semantic that <keyword> in DITA,
despite your expectations, simply does not.
With respect to <apiname>, I do
not believe we need to offer guidance as to whether an <apiname>
is a keyword for the topic as a whole. If it is repeated in the <keywords>
element in the prolog, then it is a key word for the topic; if it is not,
then it isn't. This is an author's decision to make, and has nothing to
do with the inherent semantic of whether or not the element is an <apiname>.
It would not make sense, for example,
to say that a programming topic that mentions fifteen APIs must have all
or none of them as keywords for the topic as a whole. It is much more likely
that only some of them are core, while others are incidental, perhaps even
occurring in the context of a code snippet without being discussed at all.
Being an <apiname> is an inherent
property, while being a key word for the topic is a contextual property:
hence the need to place the <apiname> in a special context (the <keywords>
element) when it does act as a key word for the entire topic.
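The distinction can be sketched in markup. This is a minimal illustration, assuming a hypothetical topic and hypothetical API names (openFile, closeFile); only the API repeated in the prolog is asserted as a key word for the topic as a whole:

```xml
<topic id="file-handling">
  <title>Opening files</title>
  <prolog>
    <metadata>
      <keywords>
        <!-- Contextual property: openFile is a key word for the whole topic -->
        <apiname>openFile</apiname>
      </keywords>
    </metadata>
  </prolog>
  <body>
    <!-- Inherent property: both are API names wherever they occur -->
    <p>Call <apiname>openFile</apiname> before reading. Release the handle
    with <apiname>closeFile</apiname> when you are done.</p>
  </body>
</topic>
```

Here closeFile is still an <apiname>, but because the author did not repeat it in <keywords>, nothing marks it as a key word for the topic.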
Michael Priestley
mpriestl@ca.ibm.com
"Paul Prescod"
<paul.prescod@blastradius.com>
03/14/2005 08:06 PM
To: Michael Priestley/Toronto/IBM@IBMCA, "Dana Spradley" <dana.spradley@oracle.com>
cc: "Dana Spradley" <dana.spradley@oracle.com>, <dita@lists.oasis-open.org>, "Don Day" <dond@us.ibm.com>, "Erik Hennum" <ehennum@us.ibm.com>, "JoAnn Hackos" <joann.hackos@comtech-serv.com>, "Rob Frankland" <robf@rascalsoftware.com>
Subject: RE: [dita] Proposed revision for the keyword definition (was Keywords in DITA)
I'm still here and I think that
our problem in converging on a solution is caused by the fact that a) there
isn't really consensus on the goal (some want the meaning of keyword to
be invariant and others want it to have two meanings) and b) the spec is
too close to being finished to clean up the inheritance structure so the
solution must live with the inheritance structure as-is.
Michael, your text below implies
that it is the specializations that will have "extended processing".
But I think that most people expect (no matter which definition of keyword
is in use) that <keyword>s themselves will typically
have special processing. When "assert" is a keyword (inline)
you probably expect to format it in a monospace font. When "installation"
is a keyword (at the topic level) then you expect search-oriented processing.
"You
might mention an <apiname> or <wintitle> in discourse without
it being a core keyword for the topic as a whole."
This is a core modelling question.
I think that we need to offer guidance as to whether the apiname IS or
IS NOT a keyword for the topic as a whole. People are writing code based
on the specs and need to know what to do.
Although we are all very friendly
and polite I think that there are two camps. One wants the HTML meaning
of keyword to override everywhere. The other wants the two meanings to
co-exist in some form (i.e. to say that a keyword in a topic body does
not necessarily apply to the whole topic). Given the time restrictions,
we should probably just vote on selecting one of two proposed wordings.
From: Michael Priestley [mailto:mpriestl@ca.ibm.com]
Sent: Monday, March 14, 2005 4:33 PM
To: Dana Spradley
Cc: Dana Spradley; dita@lists.oasis-open.org; Don Day; Erik Hennum;
JoAnn Hackos; Paul Prescod; Rob Frankland
Subject: Re: [dita] Proposed revision for the keyword definition (was
Keywords in DITA)
I think that may be a little stronger than it can afford to be. Keep in
mind that <apiname> and <wintitle> elements are both specializations
of <keyword>, so everywhere you refer to an <apiname> or a
<wintitle> it would also be correct to just use <keyword>.
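The specialization relationship is visible in the class attribute that the DITA DTDs default onto each element. A sketch, assuming the class values used by the programming and user interface domains (the attributes are normally supplied by the DTD rather than typed by authors, and the element content here is hypothetical):

```xml
<!-- Base element -->
<keyword class="- topic/keyword ">save</keyword>

<!-- Domain specializations: each class value names topic/keyword as its ancestor -->
<apiname class="+ topic/keyword pr-d/apiname ">saveDocument</apiname>
<wintitle class="+ topic/keyword ui-d/wintitle ">Save As</wintitle>
```

A processor with no rule for pr-d/apiname or ui-d/wintitle can fall back to whatever it does for topic/keyword.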
You might mention an <apiname> or <wintitle> in discourse without
it being a core keyword for the topic as a whole. I think this gets at
what Paul was saying about different explanations being required when the
markup occurs in the <keywords> section (where they are keywords
for the whole doc) versus in body content (where they are simply key words
of some variety, without being too specific about exactly why they're important).
How about:
A <keyword> is any word or noun phrase that has special significance,
for example because it is a key word for the topic, or because it is
a part of the user interface or of a programming language. Do not use <keyword>
when a more appropriate domain-specific equivalent is available, for example
<wintitle> or <apiname>.
Specialized elements derived from <keyword> may also have extended
processing, such as different formatting or automatic indexing. If the
keyref attribute is used, or some other method of key-based lookup based
on the value of the element itself, then the keyword can be turned into
a hyperlink on output (not currently supported).
When DITA topics are output to XHTML, any <keyword>
elements in the <keywords> element are placed in the Web page metadata.
I'm avoiding talking about them being used for indexing/search when they
occur in the body of the text, because they may be keywords in the DITA
sense without being key words for the topic as a whole.
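As a sketch of that last point, assuming a hypothetical installation topic and assuming the processor emits a meta element named "keywords" (the exact output varies by tool):

```xml
<!-- DITA source (fragment) -->
<prolog>
  <metadata>
    <keywords>
      <keyword>installation</keyword>
    </keywords>
  </metadata>
</prolog>
<body>
  <p>Run <keyword>setup</keyword> to begin the installation.</p>
</body>

<!-- Possible XHTML output: only the prolog keyword reaches the page metadata -->
<head>
  <meta name="keywords" content="installation" />
</head>
```

The inline <keyword>setup</keyword> marks a significant word in the DITA sense, but under this wording it is not promoted to the page metadata.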
Michael Priestley
mpriestl@ca.ibm.com
Dana Spradley <dana.spradley@oracle.com>
03/14/2005 06:34 PM
To: Dana Spradley <dana.spradley@oracle.com>
cc: Erik Hennum <ehennum@us.ibm.com>, dita@lists.oasis-open.org, Don Day <dond@us.ibm.com>, JoAnn Hackos <joann.hackos@comtech-serv.com>, Paul Prescod <paul.prescod@blastradius.com>, Rob Frankland <robf@rascalsoftware.com>
Subject: [dita] Proposed revision for the keyword definition (was Keywords in DITA)
Okay, maybe Paul's gone home already - here's my proposal for a replacement
definition for keyword in v1.0 of the language spec, my changes/additions
in green:
keyword
A <keyword> is a word that encapsulates the significance of a topic.
If you searched on a keyword
and retrieved the topic it appears in, you would be happy with the result.
Being so central to the meaning
of a topic, keywords typically appear inline in the body of the text. They
can either be marked-up as such
in place, or reproduced in the <keywords> element directly as metadata.
Compare indexterm, kwd.
Specialized elements derived from <keyword> may also have extended
processing, such as different
formatting or automatic indexing. If the keyref attribute is used, the
keyword can be turned into a
hyperlink on output (not currently supported).
When DITA topics are output to XHTML, any <keyword> or <indexterm>
elements in the <keywords>
element are placed in the Web page metadata. In addition, any index terms
in this context are also used
for supported index processing (for example, for print versions). Any
<keyword> or <indexterm> elements
appearing in the body of the text are also reproduced, uniquely, in the
Web page metadata (not currently
supported).
For ease of reference, here's what's there currently (changed section in
red):
keyword
The <keyword> element identifies a keyword or token, such as a single
value from an enumerated list,
the name of a command or parameter, or a lookup key for a message (contrast
with term).
Specialized elements derived from <keyword> may also have extended
processing, such as different
formatting or automatic indexing. If the keyref attribute is used, the
keyword can be turned into a
hyperlink on output (not currently supported).
When DITA topics are output to XHTML, any <keyword> or <indexterm>
elements in the <keywords>
element are placed in the Web page metadata. In addition, any index terms
in this context are also used
for supported index processing (for example, for print versions).
--Dana
Dana Spradley wrote:
I agree, Erik - which is why I'd like to see the <keyword> definition
changed in the draft spec to reflect one single meaning - the meaning implied
by the meaning of <keywords>, and not a meaning that duplicates the
meaning of <kwd> - rather than two divergent ones.
After that, I also agree that a useful enhancement down the road would
be replacing <keywords> with something like <topicwords> or,
say, <semes> or <coverTerms> or <conceptualBreadcrumbs>
or <searchHooks> or <topicHandles> or something, and including
all metadata elements that seek to encapsulate the overall gist of the
topic into it, if they aren't put inline in the body of the text instead.
I'm new to the TC, and hesitate to propose a replacement definition myself.
Paul Prescod, do you feel like risking a definition in advance of tomorrow's
meeting? Your offer to try your hand did, after all, begin this subthread.
--Dana
Erik Hennum wrote:
Hi, Dana, JoAnn, and Rob:
I'd like to submit some reservations about defining elements based on the
expected output instead of the content semantics:
- An HTML generator could legitimately choose to populate
the keyword metadata with semantic words that are delimited by <term>
and <keyword> elements within the text.
- A PDF generator could legitimately choose to display semantic
words associated with the topic as a whole in the page header.
- A specialization designer should be able to specialize
an element once to indicate a particular vocabulary (for instance, <chemicalterm>
or <programword>) regardless of whether an instance of the vocabulary
appears in the text or is associated with the topic as a whole. That's
possible only if the base element can appear in both contexts. Here's an
example:
<metadata>
  <keywords>
    <chemicalterm>molecule</chemicalterm>
    <programword>element</programword>
  </keywords>
</metadata>
...
<p>You list each <chemicalterm>atom</chemicalterm> in
the <programword>array</programword>....</p>
If anything, I'd suggest that the culprit in creating confusing expectations
might be the <keywords> element. A metadata <topicwords> element
that can contain <keyword>, <term>, or <indexterm> might
be better, but I wouldn't expect serious consideration of that thought
until after DITA 1.0.
The need for enhancement never ends, and if we try to squeeze in this one,
I'm sure lots of others will come out of the woodwork (such as the deferred
<data> element).
Hoping that's useful,
Erik Hennum
ehennum@us.ibm.com
Dana Spradley <dana.spradley@oracle.com>
erratum: "otherwise you're sending mixed messages..."
In fact, it seems to me this whole discussion was provoked by a bad definition
for <keyword> in the language spec, which defined it as a keyword
in the technical programming sense, while from the <keywords> definition
you would have expected it to be defined as in DocBook.
We could solve the entire issue by just revising that <keyword> definition
to be what <keywords> expects.
If people feel there is a need for keywords in the technical sense to migrate
beyond the confines of syntax diagrams, then that's a separate issue for
the folks working on the Programming Domain vis-a-vis the <kwd> element
- which *is* defined as a keyword in the technical programming sense.
--Dana
Dana Spradley wrote:
I agree on the DocBook part, but disagree on the "<keyword>
in other contexts is more like a word from an API or language" clause.
The <kwd> element exists for that.
<keyword> outside <keywords> should have the same meaning it
does within - outwise you're sending mixed messages to authors, and mixing
up the use.
JoAnn Hackos wrote:
Sounds like a good solution.
JoAnn
JoAnn T. Hackos, PhD
President
Comtech Services, Inc.
710 Kipling Street, Suite 400
Denver, CO 80215
303-232-7586
joann.hackos@comtech-serv.com
http://www.comtech-serv.com
From: Rob Frankland [mailto:robf@rascalsoftware.com]
Sent: Monday, March 14, 2005 9:29 AM
To: 'Paul Prescod'; JoAnn Hackos; 'Don Day'
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] Keywords in DITA
I agree, having followed this thread. Your suggested solution covers both
use cases. I believe the largest number of users will want the HTML/Docbook
usage and this enables the programmer writers to meet their needs as well.
Rob
From: Paul Prescod [mailto:paul.prescod@blastradius.com]
Sent: Tuesday, March 08, 2005 5:46 PM
To: JoAnn Hackos; Don Day
Cc: dita@lists.oasis-open.org
Subject: RE: [dita] Keywords in DITA
Okay, an emerging consensus seems to be that <keyword> in <keywords>
means <keyword> in the HTML/Docbook sense
(http://www.docbook.org/tdg/en/html/keyword.html).
It is typically hidden from the user as metadata and embedded in the
HTML meta tag.
<keyword> in other contexts is more like a word from an API or language.
Should we just document it that way? If so, I can suggest some wordings.
Paul Prescod