TermBase eXchange Link (TBX Link) Specification

TermBase eXchange Link (TBX Link) 1.0 Specification

Initial Draft 0.1

This version:
http://www.xml-intl.com/docs/specification/TBXLink.html
Editors:
Alan K. Melby <akm@byu.edu>
Andrzej Zydroń <azydron@xml-intl.com>
Copyright © The Localization Industry Standards Association [LISA] 2004/5. All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to LISA.

The limited permissions granted above are perpetual and will not be revoked by LISA or its successors or assigns.


Abstract

This document defines the LISA TermBase eXchange Link (TBX Link) Specification. The purpose of this vocabulary is to define a link between a term that is embedded in an XML document and its entry in a corresponding TermBase eXchange (TBX) format document or repository.

Status of this Document

This document constitutes an initial draft for discussion.

This document and the information contained herein is provided on an "AS IS" basis and LISA DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Table of Contents

1. Introduction
2. Key Concepts
2.1. TBX Document
2.2. Termbase
3. General Structure
3.1. The Main TBX Link Element
3.2. The Term Element
4. Detailed Specifications
4.1. TBX Link Namespace Declaration
4.2. Elements
4.2.1. TBX Link
4.2.2. Term
4.3. Attributes
4.3.1. TBX Link Attributes

Appendices

A. TBX Link Tree Structure
B. TBX Link Document Type Definition and Schema
C. References
D. Glossary

1. Introduction

TBX Link is a namespace-based XML notation that enables identified terms within an XML document to be linked to a specific TermBase eXchange (TBX) format XML document.

The purpose of the TBX Link specification is to provide a rigorous notation for linking terms embedded in an XML document to their entries in a TBX document or a TBX database repository.

2. Key Concepts

The use of TBX Link is predicated upon the existence of a TBX Document or database repository that contains the TBX term entries that are being linked to. TBX Link allows individual terms to be linked to such a repository.

2.1. TBX Document

The TBX document is the object that contains the terms that are linked to by the 'termid' attributes of individual terms.

2.2. Termbase

The Termbase identifies the main TBX document or repository that the individual terms link to. It allows a TBX Link compliant application to resolve the actual identifier and location of the TBX dataset.

3. General Structure

TBX Link provides a simple namespace-based XML notation for linking terms to a TBX document.

The TBX Link document model hierarchical structure comprises the following elements:

tbx
This is the top level element for TBX Link.
term
The individual term elements.

An example of TBX Link usage:

<?xml version='1.0' encoding='UTF-8'?>
<doc xmlns:tbx="urn:lisa-tbxlink-tags">
    <tbx:tbx termbase="http://purl.org/xml-intl.com/tbx-link:8700" version="1.0"
             date="2004-12-18T13:06:52Z" tool-name="XYZ Term Finder" tool-version="1.32" language="en-US">
        <p>An example paragraph with an embedded
            <tbx:term termid="a125fg" termbase="http://purl.org/xml-intl.com/tbx.xml">term</tbx:term>
            that is linked to a non-default TBX repository.
        </p>
        <p>A second
            <tbx:term termid="fde12a">example</tbx:term>
            that uses the default TBX repository as specified in the "termbase"
            attribute of the main "tbx:tbx" element.
        </p>
    </tbx:tbx>
</doc>

3.1. The Main TBX Link Element

The <tbx> element is the top level of the TBX Link hierarchy. It signals the start of the TBX Link namespace DOM tree. It contains zero or more <term> elements which, as in the example above, may be nested inside elements of the host document.

3.2. The Term Element

The <term> element is used to link the term it encloses to its entry in the TBX repository.
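
The default-and-override behaviour of the termbase attribute can be sketched in Python. This is a minimal illustration and not part of the specification: the function name resolve_links and the embedded sample (a shortened copy of the example in section 3) are assumptions made for the sketch, and only the standard library xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET

TBX_NS = "urn:lisa-tbxlink-tags"  # TBX Link namespace URI

# Shortened copy of the usage example (illustrative only).
SAMPLE = """<doc xmlns:tbx="urn:lisa-tbxlink-tags">
  <tbx:tbx termbase="http://purl.org/xml-intl.com/tbx-link:8700" version="1.0"
           date="2004-12-18T13:06:52Z" tool-name="XYZ Term Finder"
           tool-version="1.32" language="en-US">
    <p>An example paragraph with an embedded
      <tbx:term termid="a125fg" termbase="http://purl.org/xml-intl.com/tbx.xml">term</tbx:term>
      that is linked to a non-default TBX repository.</p>
    <p>A second <tbx:term termid="fde12a">example</tbx:term>
      that uses the default TBX repository.</p>
  </tbx:tbx>
</doc>"""


def resolve_links(xml_text):
    """Return one dict per <term>, with the termbase it effectively links to."""
    root = ET.fromstring(xml_text)
    links = []
    for tbx in root.iter(f"{{{TBX_NS}}}tbx"):
        default_termbase = tbx.get("termbase")
        for term in tbx.iter(f"{{{TBX_NS}}}term"):
            links.append({
                "termid": term.get("termid"),
                # A termbase on the term overrides the default on <tbx>.
                "termbase": term.get("termbase", default_termbase),
                "text": "".join(term.itertext()).strip(),
            })
    return links


for link in resolve_links(SAMPLE):
    print(link["termid"], "->", link["termbase"])
```

The first term keeps its own termbase value, while the second falls back to the value declared on the enclosing <tbx> element.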

4. Detailed Specifications

4.1. TBX Link Namespace Declaration

The TBX Link document structure is designed to exist as a namespace so that it can be embedded into any document.

The TBX Link namespace declaration will have the following form:

  xmlns:tbx="urn:lisa-tbxlink-tags"
  

All TBX Link elements will normally be prefixed with the TBX Link namespace prefix tbx:.
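
Because [XML Names] binds elements to the namespace URI rather than to the prefix, a consuming application would typically match on the URI. The following Python sketch illustrates this under stated assumptions: the helper count_terms and the three sample strings are hypothetical, and the required attributes of <tbx> are omitted from the samples for brevity.

```python
import xml.etree.ElementTree as ET

TBX_NS = "urn:lisa-tbxlink-tags"  # TBX Link namespace URI


def count_terms(xml_text):
    """Count <term> elements bound to the TBX Link namespace URI."""
    root = ET.fromstring(xml_text)
    return len(list(root.iter(f"{{{TBX_NS}}}term")))


# The usual "tbx" prefix.
WITH_TBX_PREFIX = (
    '<doc xmlns:tbx="urn:lisa-tbxlink-tags">'
    '<tbx:tbx><tbx:term termid="a1">x</tbx:term></tbx:tbx></doc>'
)
# A different prefix bound to the same URI names the same elements.
WITH_OTHER_PREFIX = (
    '<doc xmlns:t="urn:lisa-tbxlink-tags">'
    '<t:tbx><t:term termid="a1">x</t:term></t:tbx></doc>'
)
# No namespace at all: these are not TBX Link elements.
UNQUALIFIED = '<doc><tbx><term termid="a1">x</term></tbx></doc>'

print(count_terms(WITH_TBX_PREFIX))    # 1
print(count_terms(WITH_OTHER_PREFIX))  # 1
print(count_terms(UNQUALIFIED))        # 0
```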

4.2. Elements

Elements <tbx>, <term>.

4.2.1. TBX Link

The main TBX Link element has the following format:

<tbx>

TBX Link Element - The <tbx> element.

Required attributes:

termbase - the Termbase identifier for the default TBX Repository that is being linked to.

version - the TBX Link version identifier, fixed at "1.0".

date - the date that the TBX Link namespace was created for this document.

language - the language for the terms being linked to.

tool-name - the tool used to identify the TBX Link terms.

tool-version - the version identifier of the tool used to identify the TBX Link terms.

Optional attributes:

None.

Contents:

Zero or more <term> elements.
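
The attribute requirements listed above can be checked mechanically. The Python sketch below is one possible approach rather than a normative validator: the function name check_tbx_element, the message wording, and the example.org termbase URL are assumptions introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Required attributes of <tbx> as listed in this section.
REQUIRED_TBX_ATTRS = ("termbase", "version", "date", "language",
                      "tool-name", "tool-version")


def check_tbx_element(elem):
    """Return a list of problems found on one <tbx> element (empty if none)."""
    problems = [f"missing required attribute: {name}"
                for name in REQUIRED_TBX_ATTRS if elem.get(name) is None]
    version = elem.get("version")
    if version is not None and version != "1.0":
        problems.append(f"version must be 1.0, found {version}")
    return problems


# Hypothetical test elements; the termbase URL is a placeholder.
GOOD = ET.fromstring(
    '<tbx:tbx xmlns:tbx="urn:lisa-tbxlink-tags" termbase="http://example.org/tbx" '
    'version="1.0" date="2004-12-18T13:06:52Z" language="en-US" '
    'tool-name="XYZ Term Finder" tool-version="1.32"/>')
BAD = ET.fromstring(
    '<tbx:tbx xmlns:tbx="urn:lisa-tbxlink-tags" termbase="http://example.org/tbx" '
    'version="2.0"/>')

print(check_tbx_element(GOOD))  # []
for problem in check_tbx_element(BAD):
    print(problem)
```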

4.2.2. Term

The Term element has the following format:

<term>

The TBX Link Term Element.

Required attributes:

termid - The term identifier in the TBX repository.

Optional attributes:

termbase - the Termbase identifier for a non-default TBX Repository that is being linked to.

date - the date that the term element was created.

language - the language of the term.

Contents:

The PCDATA contents of the term.
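
A comparable check for <term> elements might look as follows. This is a sketch under stated assumptions: the name check_term_element is hypothetical, and treating attributes other than those listed above as errors is a design choice of the sketch, since this section does not say how unlisted attributes should be handled.

```python
import xml.etree.ElementTree as ET

# Attributes this section lists for <term>: termid is required,
# the other three are optional.
ALLOWED_TERM_ATTRS = {"termid", "termbase", "date", "language"}


def check_term_element(elem):
    """Return a list of problems found on one <term> element (empty if none)."""
    problems = []
    if elem.get("termid") is None:
        problems.append("missing required attribute: termid")
    # Assumption of this sketch: unlisted attributes are reported.
    for name in sorted(set(elem.attrib) - ALLOWED_TERM_ATTRS):
        problems.append(f"unexpected attribute: {name}")
    # PCDATA-only content means the element has no child elements.
    if len(elem) > 0:
        problems.append("content must be character data only")
    return problems


NS = 'xmlns:tbx="urn:lisa-tbxlink-tags"'
ok = ET.fromstring(f'<tbx:term {NS} termid="fde12a" language="en-US">example</tbx:term>')
no_id = ET.fromstring(f'<tbx:term {NS}>example</tbx:term>')
nested = ET.fromstring(f'<tbx:term {NS} termid="a125fg"><b>term</b></tbx:term>')

print(check_term_element(ok))      # []
print(check_term_element(no_id))   # ['missing required attribute: termid']
print(check_term_element(nested))  # ['content must be character data only']
```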

4.3. Attributes

This section lists the attributes used in the TBX Link elements. An attribute is never specified more than once for each element.

TBX Link attributes: date, language, version, termbase, termid, tool-name, tool-version.

4.3.1. TBX Link Attributes

date

Date - The date attribute indicates when a given element was created or modified.

Value description:

Date in [ISO 8601] format. The recommended pattern to use is: CCYY-MM-DDThh:mm:ssZ
where CCYY is the year (4 digits), MM is the month (2 digits), DD is the day (2 digits), hh is the hour (2 digits), mm is the minutes (2 digits), ss is the seconds (2 digits), and Z indicates that the time is UTC. For example:

date="2002-01-25T21:06:00Z"
is January 25, 2002 at 9:06pm GMT
is January 25, 2002 at 2:06pm US Mountain Time
is January 26, 2002 at 6:06am Japan time

Default value:

Undefined.

Used in:

<tbx>, <term>
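
The recommended date pattern and the time zone conversions in the example can be reproduced with Python's standard datetime module. This is a minimal sketch: the function name parse_tbx_date is hypothetical, and the fixed offsets of UTC-7 for US Mountain Standard Time and UTC+9 for Japan are assumptions that hold for the January date used here.

```python
import re
from datetime import datetime, timedelta, timezone

# Strict shape of the recommended pattern CCYY-MM-DDThh:mm:ssZ.
DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")


def parse_tbx_date(value):
    """Parse a date attribute value into a timezone-aware UTC datetime.

    Raises ValueError if the value does not follow the recommended pattern
    or names an impossible date or time.
    """
    if DATE_SHAPE.fullmatch(value) is None:
        raise ValueError(f"not in CCYY-MM-DDThh:mm:ssZ form: {value!r}")
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


stamp = parse_tbx_date("2002-01-25T21:06:00Z")

# Fixed offsets assumed for this sketch (no daylight saving in January).
MOUNTAIN_STANDARD = timezone(timedelta(hours=-7))
JAPAN_STANDARD = timezone(timedelta(hours=9))

mountain = stamp.astimezone(MOUNTAIN_STANDARD)
japan = stamp.astimezone(JAPAN_STANDARD)

print(mountain.isoformat())  # 2002-01-25T14:06:00-07:00
print(japan.isoformat())     # 2002-01-26T06:06:00+09:00
```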

language

language - The language for the main tbx or individual term elements.

Value description:

A language code as described in [RFC 3066]. For more information see the section on xml:lang in the XML specification and its erratum E11, which replaces RFC 1766 with RFC 3066.

Default value:

Undefined.

Used in:

<tbx>, <term>
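
A purely syntactic check of language values against the [RFC 3066] tag grammar (a primary subtag of one to eight letters, followed by hyphen-separated subtags of one to eight letters or digits) can be sketched as below. The function name is_rfc3066_tag is hypothetical, and the sketch does not verify that the subtags are registered ISO 639 or ISO 3166 codes.

```python
import re

# RFC 3066 grammar: Primary-subtag *( "-" Subtag )
#   Primary-subtag = 1*8ALPHA
#   Subtag         = 1*8(ALPHA / DIGIT)
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def is_rfc3066_tag(value):
    """Return True if value is syntactically a valid RFC 3066 language tag."""
    return RFC3066_TAG.fullmatch(value) is not None


print(is_rfc3066_tag("en-US"))  # True
print(is_rfc3066_tag("fr"))     # True
print(is_rfc3066_tag("en_US"))  # False: subtags are separated by hyphens
```

Underscore-separated forms such as en_US, common as locale names, do not match this grammar and would need to be written as en-US.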

version

Version - The current TBX Link version number.

Value description:

The TBX Link version number of this document.

Fixed value:

1.0

Used in:

<tbx>.

termbase

Name - The identifier for the TBX repository. This should be in the form of a URL or some other system identifier that allows for the automatic resolution of the TBX repository.

Value description:

A URL or other system identifier that locates the TBX repository.

Default value:

Undefined

Used in:

<tbx>, <term>.

termid

Name - The identifier for the term in the TBX repository.

Value description:

The term key in the TBX repository.

Default value:

Undefined

Used in:

<term>.

tool-name

Name - The identifier of the tool used to insert the TBX Link elements.

Value description:

The name of the TBX Link tool.

Default value:

Undefined

Used in:

<tbx>.

tool-version

Name - The version identifier of the tool used to insert the TBX Link elements.

Value description:

The version identifier of the TBX Link tool.

Default value:

Undefined

Used in:

<tbx>.

A. TBX Link Tree Structure

The following figure shows the possible structure as a tree. Each element is followed by notation indicating its possible occurrence according to the corresponding legend.

(legend: 1 = one
         + = one or more
         ? = zero or one
         * = zero, one or more)

<tbx>1
|
+--- <term>*

B. TBX Link Document Type Definition and Schema

C. References

Normative

[IANA Charsets]
IANA Names for Character Sets. IANA (Internet Assigned Numbers Authority), Aug 2001.
[ISO 639]
Codes for the Representation of Names of Languages. ISO (International Organization for Standardization), Nov 2001.
[ISO 3166]
Codes for the representation of names of countries and their subdivisions. ISO (International Organization for Standardization), Jun 2000.
[ISO 8601]
Representation of dates and times. ISO (International Organization for Standardization), Dec 2000.
[RFC 3066]
RFC 3066 Tags for the Identification of Languages. IETF (Internet Engineering Task Force), Jan 2001.
[TBX 1.0]
TBX 1.0 Specification. LISA (Localization Industry Standards Association), May 2002.
[XML 1.0]
Extensible Markup Language (XML) 1.0 Second Edition. W3C (World Wide Web Consortium), Oct 2000.
[XML Names]
Namespaces in XML. W3C (World Wide Web Consortium), Jan 1999.

Non-Normative

[ISO]
International Organization for Standardization Web site.
[LISA]
Localization Industry Standards Association Web site.
[OSCAR]
OSCAR (Open Standards for Container/Content Allowing Re-use) Web site.
[OASIS]
Organization for the Advancement of Structured Information Standards Web site.
[Unicode]
Unicode Consortium Web site.
[W3C]
World Wide Web Consortium Web site.

D. Glossary

DTD
An XML document can have an associated Document Type Definition (DTD) that specifies the rules for the structure of the document. Several industries have standardized on various DTDs for the different types of documents that they share.
OSCAR
LISA special interest group (Open Standards for Container/Content Allowing Re-use).
UTC
UTC stands for Coordinated Universal Time.
XML
eXtensible Markup Language.