OASIS Mailing List Archives



dita message


Subject: Re: [dita] Summary of Main Points Re: Metadata and DITA for Next TC Meeting

This might be at a tangent to the metadata debate, but I hope will be useful.

A few years ago, I had a contract with Smartlogic, and wrote the User Manual for their then product, Ontology Manager. One issue I noticed is that information scientists appear to be quite bad at defining their own industry terms. One person's taxonomy is another's thesaurus is another's ontology. For this reason, I wrote a brief primer to define these terms, at least for the benefit of the manual. I could let folks have a copy, if that would be useful? Please contact me off list.

My next contract was with Elekta. They use Simplified Technical English (STE) due to the volume of translation. STE uses a dictionary to significantly restrict the words used in the documentation. The aerospace industry uses STE for safety reasons.

Recently, I pondered what happens when a website serves up a corpus written with STE. What happens to search results, for instance? I have not seen any research into this. However, it stands to reason that the results would be either excellent or dismal. If the user happens to search for an STE preferred term, then the results will be excellent. If they happen to search for an STE non-preferred term, then the results will be dismal.
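To illustrate the point, here is a minimal Python sketch of a naive whole-word search over a tiny corpus written with a restricted vocabulary. The corpus sentences and the term pair "start"/"begin" are assumptions made up for this example. They are not taken from the actual STE dictionary, and a real search engine would be more forgiving than this.

```python
# Toy corpus written with a restricted vocabulary (illustrative sentences only).
CORPUS = {
    "doc1": "Start the pump before you open the valve.",
    "doc2": "Stop the pump if the pressure is too high.",
}


def search(query: str) -> list:
    """Return the ids of documents that contain the query as a whole word."""
    wanted = query.lower()
    hits = []
    for doc_id, text in CORPUS.items():
        # Split on whitespace, then strip simple punctuation and lower-case.
        words = [word.strip(".,").lower() for word in text.split()]
        if wanted in words:
            hits.append(doc_id)
    return hits


print(search("start"))  # the (assumed) preferred term matches doc1
print(search("begin"))  # the (assumed) non-preferred term matches nothing
```

Because the writers never use the non-preferred word, a literal search for it has nothing to match, which is the "dismal" case.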

A prime application of ontologies is to create a website index that significantly improves search results. Therefore, a company that uses STE might well find itself with the need to also use an ontology.

Dictionaries define words whilst taxonomies, thesauri and ontologies classify them. So, there are fundamental differences. However, both dictionaries and ontologies use preferred and non-preferred terms. So there is also a strong link between them.

They also share one significant issue: they both require continual maintenance. Unfortunately, some companies adopt STE or an ontology search index, and assume it is a one-off exercise to create the dictionary or ontology. It is definitely not! Language is not static, and nor are company products and services. I was told that an ontology that is three months out of date is virtually useless, and that the company might as well start over.

So, if a company does require both an STE dictionary and an ontology, then it makes sense to combine them in some way and halve the maintenance overhead. Search terms harvested from the website via the search engine into the ontology could also provide useful feedback to the STE dictionary.
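As a rough sketch of what "combining them" might look like, the Python below keeps one vocabulary of preferred terms with their non-preferred variants, uses it to rewrite search terms, and counts any term it does not recognise so that it can be reviewed later. All of the names here (VOCABULARY, normalize, unknown_terms) and the term lists are hypothetical. This is only one possible arrangement, not a description of any existing product.

```python
from collections import Counter
from typing import Optional

# Hypothetical combined vocabulary: each preferred term lists its
# non-preferred variants. The entries are illustrative only.
VOCABULARY = {
    "start": {"begin", "commence", "initiate"},
    "stop": {"halt", "cease"},
}

# Reverse lookup from each non-preferred variant to its preferred term.
TO_PREFERRED = {
    variant: preferred
    for preferred, variants in VOCABULARY.items()
    for variant in variants
}

# Search terms that the vocabulary does not know yet, with counts.
unknown_terms = Counter()


def normalize(term: str) -> Optional[str]:
    """Map a search term to its preferred term, or record it as unknown."""
    lowered = term.lower()
    if lowered in VOCABULARY:
        return lowered
    if lowered in TO_PREFERRED:
        return TO_PREFERRED[lowered]
    unknown_terms[lowered] += 1  # harvested as feedback for the maintainers
    return None


print(normalize("Begin"))   # rewritten to the preferred term
print(normalize("launch"))  # not in the vocabulary, so it is harvested
print(unknown_terms.most_common())
```

The same table serves both purposes: writers can check it for the preferred term, the search layer can use it to rewrite queries, and the harvested unknown terms show where the vocabulary has fallen behind.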

Now, OK, it might be a huge ask for DITA 2.0 to support dictionaries, taxonomies, thesauri AND ontologies. But the point is that any development that supports one of these should not adversely affect any future development of any other.

Many thanks,

Have gone through a series of emails and TC meeting minutes on the subject of our recent discussions around metadata and DITA, and to summarize things:

Key Issues/Observations Identified on Metadata Usage with DITA: 
  • Perceived limitations to how DITA can work with external taxonomy standards 
  • A preference in the community for using attributes rather than elements
  • Current inability to use a URI in an attribute
  • subjectScheme is designed for use with taxonomies, but is deficient as currently implemented (comment from Kris that subjectScheme was underspecified in DITA 1.2, and that backwards-compatibility issues limited what could be done in DITA 1.3)

Current Suggestions for DITA 2.0:
• Extend SubjectScheme so that it is possible to state that “this is my enumeration value, different from my key name” (Eliot Kimber). This could be done by adding a new enumeration-value element for use within the subjectdef element to store a unique ID value alongside the key and readable value (Joe Pairman)
• @props whose value allows URIs; maybe a specialization-based attribute whose value is a URI (Eliot Kimber); alternatively, create a new, global metadata-specific attribute (“@metadata”? “@taxonomy”?) that could take on this role (Joe Pairman)
• Create a semantic mapping mechanism to pair the names of DITA elements (specialized or not) with data in an equivalent, external standard or mechanism (Joe Pairman)
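
To make the first two suggestions a little more concrete, here is a minimal Python sketch of a subject definition that carries a key name, a separate enumeration value (here a URI), and a readable label. The class, field names, function names and the example.org URI are all assumptions for illustration. None of this is existing DITA markup or an agreed design.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class SubjectDef:
    key: str                # key name used inside the map
    enumeration_value: str  # unique value, distinct from the key (hypothetical)
    label: str              # human-readable value


def is_absolute_uri(value: str) -> bool:
    """Loose check that the value has a scheme and something after it."""
    parts = urlparse(value)
    return bool(parts.scheme) and bool(parts.netloc or parts.path)


# Illustrative subject list; the URI is a placeholder.
SUBJECTS = [
    SubjectDef(
        key="os-linux",
        enumeration_value="http://example.org/taxonomy/os/linux",
        label="Linux",
    ),
]


def resolve(attribute_value: str) -> Optional[SubjectDef]:
    """Find the subject whose enumeration value equals the attribute value."""
    for subject in SUBJECTS:
        if subject.enumeration_value == attribute_value:
            return subject
    return None


match = resolve("http://example.org/taxonomy/os/linux")
print(match.key, match.label)
print(is_absolute_uri("os-linux"))  # a plain key name is not a URI
```

The point of the separation is that an attribute value could match an external taxonomy identifier directly, while the key name stays free for use inside the DITA map.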

Where is this request coming from?
Some DITA practitioners at recent DITA Listening sessions are asking for "better metadata support" within DITA. Reasons are scattered, but include requests for a more "associative metadata model in order to apply it in bulk after the content has been published" (using third-party tools). 

A Possible Role of RDFa?
At the TC Meeting of November 14, RDFa was suggested. While it was agreed that it could play a role, it was also generally agreed that a) it should not be incorporated into core DITA, but instead provided as a specialization, and b) RDFa usage is on the decline. If there was sufficient interest, a Working Group could be struck to devise a specialization. (This was not a specific motion, and it has not come to pass.)

Detailed Timeline (courtesy of Joe Pairman):


Keith Schengili-Roberts
Market Researcher and DITA Evangelist
825 Querbes, Suite 200, Montréal, Québec, Canada, H2V 3X1
tel  + 1 514 279-4942  /  toll free + 1 877 279-4942 
