Subject: RE: [dita] hyphens and file names

On Mon, 21 Feb 2005, Esrig, Bruce (Bruce) wrote:

> Robin,
> Thanks for the pointers. The naming conventions thread is interesting.
> The more recent document, OASIS Artifact Naming Guidelines, uses hyphens
> for major divisions, and underscores for minor divisions. If camelCase
> is acceptable for minor divisions, then underscores would be unnecessary.

Excellent point, headed in the same direction apparently as Dana
Spradley's message [1].

The most recent (?) public version of the OASIS Artifact Naming Guidelines
[2] does not expose the set of formal requirements being implemented in
the OASIS specification, so it's not obvious why the EBNF prescribes 
LOALPHA and DIGIT as the only allowable characters in all contexts governed by
the specification.  I can think of contexts (especially in cases of
filenames and URLs) where one might want to allow upper-case characters
and camel case.
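
To make the contrast concrete, here is a minimal sketch in Python. The two patterns are hypothetical stand-ins, not the EBNF from the draft guidelines: one admits only lower-case letters and digits within each hyphen-separated part of a name, the other also admits upper-case letters so that camel case becomes possible.

```python
import re

# Hypothetical patterns for illustration only; they are NOT the EBNF
# from the OASIS Artifact Naming Guidelines draft.
# Each name is one or more hyphen-separated parts.
LOWER_ONLY = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
CAMEL_OK = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def allowed(name: str, pattern: re.Pattern) -> bool:
    """Return True if the whole name matches the given pattern."""
    return pattern.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("naming-guidelines", "namingGuidelines-draft", "naming_guidelines"):
        print(candidate, allowed(candidate, LOWER_ONLY), allowed(candidate, CAMEL_OK))
```

Under these assumed patterns, a camel-case name is rejected by the lower-case-only rule but accepted by the relaxed one, and a name containing an underscore is rejected by both.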

I noted in a recent article about the Naming and Design Rules
(NDR) documents published by OASIS, UN/CEFACT, and Navy CIO [3]
that all three NDRs in fact prescribe the use of camel case in
connection with naming components; this would seem potentially
relevant to naming OASIS "artifacts."   The UBL NDR specifically
mentions concerns for "readability" and "semantic clarity" as
justification for use of camel case to mark juncture for
closed compounds. [4]

> The earlier document, Proposed Rules for OASIS Document File Naming
> (ed. Eve Maler), is much less enthusiastic about underscores, leaving
> them as an option but not a recommended option. In RFC 2119 language,
> I'd use "should not" or "not recommended" with regard to underscores.

Yep. I've already stated my agreement with Deborah Aleyne Lapeyre's
arguments against underscores in URLs, but I realize there may be
legitimate difference of opinion, depending upon the underlying

> Quotation from Proposed Rules ...
> > Hyphens are recommended between words within the description and
> > extended description portions, though underscores may be used.
> > Hyphens are preferred because they are easier to see in displayed
> > URIs and easier to type.

That's from Eve Maler's early draft document, I think, not from the OASIS
document of October 25th...

> Instead, how about ...
> > Hyphens are recommended between words within the description and
> > extended description portions. For minor punctuation, meaning
> > punctuation that is more closely binding than hyphens, camelCase is
> > recommended and underscores are not recommended.
> > [?? rationale ??:] Compared with underscores, hyphens are easier to
> > see in displayed URIs and are easier to type. camelCase is normalized
> > away in some contexts, so when it is used, names should be chosen to
> > be unambiguous even when reduced to a single case.

If the draft OASIS document is put up for public review, that will be
a key opportunity for you and others on this list to provide input
(and argumentation) of this kind.
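
One reading of the "unambiguous even when reduced to a single case" condition in the proposed wording above is that no two names in a set may differ only by case. A minimal Python sketch of such a check follows; the function name and the sample names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that become identical once case is normalized away.

    Returns a dict mapping each lower-cased form to the list of distinct
    original spellings, keeping only forms with more than one spelling.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return {folded: spellings
            for folded, spellings in groups.items()
            if len(set(spellings)) > 1}


if __name__ == "__main__":
    # Hypothetical file-name stems, for illustration only.
    print(case_collisions(["taskList", "tasklist", "concept"]))
```

A set of names that yields an empty result from this check would remain distinguishable in a context that folds everything to lower case.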

I must clarify in this context that despite my email address domain
'oasis-open.org', I am not commenting as an official representative
of OASIS administration.

Mary McRae is the OASIS contact person for this list, and questions
about OASIS norms should be directed to her.

Mary's coordinates:
   Mary P McRae
   Manager of TC Administration
   email: mary.mcrae@oasis-open.org  
   web: www.oasis-open.org <http://www.oasis-open.org/>  
   phone: 603.232.9090
   cell: 603.557.7985

- Robin Cover

> Best wishes,
> Bruce Esrig
> ===
> Postscript: Kavi is a marvelous facility, but I was thinking something much more mundane. Just a misunderstanding between people: "Could you please post this?" "How's this?" "Oh, why use underscores in the URL?"

[1] http://lists.oasis-open.org/archives/dita/200502/msg00033.html
[2] http://lists.oasis-open.org/archives/chairs/200410/msg00056.html
[3] http://xml.coverpages.org/ni2005-01-31-a.html
 "XML Naming and Design Rules Specifications Published by OASIS,
  UN/CEFACT, and Navy CIO"

[4] OASIS UBL NDR document (excerpts):


xsd - represents W3C XML Schema Definition Language.
If a concept, the words will be in upper camel case,
and if a construct, they will be in lower camel case.

XML is case sensitive. Consistency in the use of case
for a specific XML component (element, attribute, type)
is essential to ensure every occurrence of a component
is treated as the same. This is especially true in a
business-based data-centric environment such as
what is being addressed by UBL. Additionally, the use
of visualization mechanisms such as capitalization
techniques assist in ease of readability and ensure
consistency in application and semantic clarity. The
ebXML architecture document specifies a standard use
of upper and lower camel case for expressing XML
elements and attributes respectively.  UBL will adhere
to the ebXML standard. Specifically, UBL element and
type names will be in UpperCamelCase (UCC).

[GNR8] The UpperCamelCase (UCC) convention MUST be
used for naming elements and types.

UBL attribute names will be in lowerCamelCase (LCC).
[GNR9] The lowerCamelCase (LCC) convention MUST be
used for naming attributes.
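
As a rough illustration of what rules such as [GNR8] and [GNR9] ask for, the following Python sketch tests whether a name starts with the expected case and contains only letters and digits. It is a simplification under my own assumptions, not the UBL NDR's full definition of UCC and LCC, and the sample names are hypothetical.

```python
import re

# Simplified, assumed shapes for the two conventions:
# UpperCamelCase starts with an upper-case letter, lowerCamelCase with a
# lower-case letter; neither contains hyphens, underscores, or periods.
UCC = re.compile(r"[A-Z][A-Za-z0-9]*")
LCC = re.compile(r"[a-z][A-Za-z0-9]*")


def is_ucc(name: str) -> bool:
    """True if the name looks like UpperCamelCase (elements and types)."""
    return UCC.fullmatch(name) is not None


def is_lcc(name: str) -> bool:
    """True if the name looks like lowerCamelCase (attributes)."""
    return LCC.fullmatch(name) is not None


if __name__ == "__main__":
    for sample in ("InvoiceLine", "invoiceLine", "currencyID", "currency_id"):
        print(sample, is_ucc(sample), is_lcc(sample))
```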

> -----Original Message-----
> From: Robin Cover [mailto:robin@oasis-open.org]
> Sent: Friday, February 18, 2005 5:49 PM
> To: Esrig, Bruce (Bruce)
> Cc: 'Deborah Aleyne Lapeyre'; dita@lists.oasis-open.org
> Subject: RE: [dita] hyphens and file names
> On Fri, 18 Feb 2005, Esrig, Bruce (Bruce) wrote:
> > > 5. And then there is the issue of file names that turn
> > into parts of URNs, don't get me started!
> > 
> >
> > This one was the last straw for me, and the reason I raised the
> > question. I sent someone a request for a URL with a filename in
> > it, and it came out with underscores. It has happened again since.
> > Here's a chance for the DITA TC to do something absurdly small    
> > yet exemplary ... standardize on hyphens only in identifiers.
> Bruce, have a look at the earlier (October 25) draft of the 
> "OASIS Artifact Naming Guidelines".  I don't know what you mean
> by "and it came out with underscores" but if you're talking
> about Kavi or the TC Process tools, I'd bet that these draft
> naming guidelines are the culprit, ultimately.  URLs:
> http://lists.oasis-open.org/archives/chairs/200410/msg00056.html
> http://lists.oasis-open.org/archives/chairs/200410/doc00003.doc (main doc)
> http://lists.oasis-open.org/archives/chairs/200410/doc00002.doc (diff)
> Since this draft has not yet been put out for formal review
> by the membership (only informally to the Chairs), I would hope there's
> a good chance to have some broader input on various topics,
> including the notion of converting hyphens to underscores in the
> construction of a URL for a URN-based artifact name.
> I'm unfamiliar with all the computing history around hyphens
> and underscores as name characters (I do recall SGML), but in the
> modern context where filenames get used in URLs, the matter
> of underscore "disappearing" (as Deborah puts it) feels like
> a real problem.  When viewing an address in a typical browser,
> or when sending the HTML document to the printer, a
> troubling ambiguity is introduced: from the visual alone, we don't know
> whether the character is SPACE or UNDERSCORE.
> There was a famous article about a train wreck and (non) use
> of SGML. I can envision a similar article about the time someone
> lost a big contract because they had only 20 seconds to
> communicate on a cell phone (nearly dead battery), and
> told a business partner to look at *this vital information* at
> "this URL..."  He reads the URL from the paper printout and
> says: 'http COLON [hmmm]-SPACE-[hmmm]-SPACE-[hmmm]-SPACE-[hmmm]-SPACE-
> [hmmm]-SPACE-[hmmm]-SPACE (etc)'... <phone dies> but the URL entered as
> such failed, and so did the bid for the contract. The Web lookup
> fails, and the poor contract writer assumes he made a mistake
> writing down the URL. But he didn't: the real URL did not actually
> contain any SPACE characters.
> ;-)
> -rcc
> > 
> > > all the rage in the URN and Unix communities
> > 
> > One reason to avoid hyphens in Unix was to avoid confusion with the flag convention ("cmd -flags").
> > 
> > Second, spaces were token delimiters on the command line, so file names couldn't contain spaces, and underscores were the most unobtrusive alternative. At that time, underlining of running text was a novel and underutilized feature that was only supported on certain terminals, so there was no conflict with underscores. And of course, URLs didn't exist yet.
> > 
> > Rubato,
> > 
> > Bruce
> > 
> > -----Original Message-----
> > From: Deborah Aleyne Lapeyre [mailto:dalapeyre@mulberrytech.com]
> > Sent: Friday, February 18, 2005 3:18 PM
> > To: dita@lists.oasis-open.org
> > Cc: Deborah Aleyne Lapeyre; dita@lists.oasis-open.org
> > Subject: RE: [dita] hyphens and file names
> > 
> > 
> > I know they are all the rage in the URN and Unix
> > communities, but I have never liked underscores
> > in file names. In vaguely decreasing order of
> > importance, I dislike underscores because:
> > 
> > 1. Lack of distinctness and clarity
> > 
> > My primary objection to underscore is that in file names
> > it is frequently unclear. When you underline text, a space
> > becomes indistinguishable from an underscore. The
> > underscores can't really be seen.
> > 
> >    a. Many older websites indicate links by turning the text
> >       blue and underlining it. The underscores vanish.
> > 
> >    b. Some editing packages default to underlines to show
> >       change (effectivity) or badly spelled words.
> >       Yes, you could change the default but many don't.
> > 
> > 2. Hard to see - Underscores are (to my eyes) hard to see,
> > especially in print with close leading.
> > 
> > 3. Confusion with commands. I avoid periods in element names
> > because they look like classes and mess with the Java folks'
> > heads. Similarly "HTTP_Get" and its ilk are commands in
> > my world. Similarly, underscore is a command character
> > in LaTeX.
> > 
> > 4. Underscore has a history that hyphens lack.
> > 
> > Underscore has been used historically as an "I can't cope"
> > character. Certain older versions of both IE and some Adobe products
> > inserted underscores when they could not cope with a character
> > in a filename
> >           deb.taz.zip ==> deb_tar.zip
> > for example, so many people (and some software) treat
> > underscore in certain locations in a file or path name
> > as errors or artifacts.
> > 
> > Software like IDEAs converted all characters (such as #,%,
> > {,},(,), etc.) to underscore in filenames.
> > 
> > 4. Accessibility -
> > On most modern keyboards, underscore is a shift-click and
> > hyphen is a single click. Folks using their feet or a pen
> > in their teeth avoid shifting and all double motions.
> > (Yes, you can program them to be single, so this may be
> > a wash.)
> > 
> > Pronouncing software folks used to dislike underscore
> > (don't know if this one has been solved more recently)
> > because:
> > a) There were too many different ways to pronounce it.
> > b) There was no one-syllable way to pronounce it (whereas
> >     hyphen can use "dash").
> > c) It was tricky (slower) to indicate in speech the difference
> >     between a character underscore that stands alone (or
> >     does it underline a space?) and an underscored
> >     character or word.
> > 
> > 5. And then there is the issue of file names that turn
> > into parts of URNs, don't get me started!
> > 
> > --Debbie
> > 
> > 

