

Subject: Call for Comment: proposed Charter for Lexicographic Infrastructure Data Model and API (LEXIDMA) TC


To OASIS Members:

A draft TC charter has been submitted to establish the Lexicographic Infrastructure Data Model and API (LEXIDMA) TC. In accordance with section 2.2 of the OASIS TC Process Policy (https://www.oasis-open.org/policies-guidelines/tc-process#formation), the proposed charter is hereby submitted for comment. The comment period shall remain open until 23:59 GMT on 05 November 2019.

OASIS maintains a mailing list for the purpose of submitting comments on proposed charters. Any OASIS member may post to this list by sending email to oasis-charter-discuss@lists.oasis-open.org. All messages will be publicly archived at http://lists.oasis-open.org/archives/oasis-charter-discuss/. Members who wish to receive emails must join the group by selecting "join group" on the group home page: http://www.oasis-open.org/apps/org/workgroup/oasis-charter-discuss/. Employees of organizational members do not require primary representative approval to subscribe to the oasis-charter-discuss e-mail list.

This call for comment is also available as a Google Doc. See https://tinyurl.com/y65v8ok7. Comments and suggestions may be left on that document.

Comments received will be reviewed by the proposers and a log of the comments and their resolution will be posted to oasis-charter-discuss mailing list before the telephone call with the convener.

A telephone conference will be held among the Convener, the OASIS TC Administrator, and those proposers who wish to attend no more than four days after the comment period closes. The announcement and call-in information will be noted on the OASIS Charter Discuss Group Calendar.

We encourage member comment and ask that you note the name of the proposed TC (LEXIDMA) in the subject line of your email message.

If you wish to be listed as a co-proposer in the Call for Participation, please contact the convener, David Filip, Trinity College Dublin (ADAPT), david.filip@adaptcentre.ie, no later than 15 November 2019 (the call for participation). For representatives of OASIS organizational members, a statement of support from their Primary Representative will be required.

---

Section 1: TC Charter

(1)(a) Name of the TC

Lexicographic Infrastructure Data Model and API (LEXIDMA)

(1)(b) Statement of Purpose

This committee's high-level purpose is to create an open, standards-based framework for internationally interoperable lexicographic work. This TC will describe and define standard, serialization-independent interchange objects based predominantly on the state of the art in the lexicographic industry. Defining specific serializations, transaction models, standard interfaces, and web services based on the defined objects and object models is also in scope as far as it facilitates the high-level purpose set out here. This TC aims to develop the lexicographic infrastructure as part of a broader ecosystem of standards employed in Natural Language Processing (NLP), language services, and the Semantic Web.

Business Benefits

The key business benefit of the LEXIDMA deliverables is a simple, modular, and easy-to-adopt data model that will be attractive to all lexicographic industry actors, across companies and academia as well as geographic locations. Adoption of that model will facilitate exchange of lexicographic and linguistic corpus data globally and also enable effective exchange with adjacent industries such as language services, terminology management, or technical writing. Semantic interoperability of lexicographic data should help the global lexicographic industry to surpass its current model of creating and curating lexicographic deliverables (most prominently multi- and monolingual dictionaries) and corpora in linguistically and geographically demarcated silos, and to create a truly global market for lexicographic data exchange across and among languages and locales.

(1)(c) Scope

The following items belong to the Scope of Work and are expected to be refined as the TC gains additional insights into evolving and culturally diverse lexicographic use cases. Members will gather insights and requirements from consultations with the wider community of industry stakeholders, annual symposia, questionnaires, etc. and use these insights to produce concrete technical deliverables.

i) Define and maintain a serialization-independent Data Model for globally applicable use cases in lexicography.

ii) Define and maintain XML, JSON, RDF, and other serializations, as industry or academic needs arise, of the said lexicographic data model.

iii) Define specific standard Application Interfaces (API) and abstract service architectures for various serializations of the lexicographic data model in concert with other related standards and formats (such as TEI, LMF, RDF, JSON-LD, XLIFF, ITS, TBX, etc.) and prominent data models in adjacent industries and verticals, such as terminology management, translation services, web publishing, etc.

iv) Define and describe lossless or nearly lossless mappings between the lexicographic data model, including its native normative serializations (developed by this committee), and other common industry and academic serializations such as, prominently, Ontolex-Lemon and TEI Lex-0. Define those mappings both in an abstract way and for specific serializations as the need arises.

v) Define and describe informative best practices and abstract service architecture recommendations with regard to usage of the LEXIDMA TC normative deliverables in the lexicographic industry and adjacent industries such as terminology management, translation services, web publishing, etc.

(1)(d) Deliverables

The following are high priority technical goals that should be addressed by development of one or more deliverables on OASIS standards track or as committee notes within 24 months from TC initiation:

i) Serialization-independent Data Model for Lexicography (DMLex)

ii) XML serialization of DMLex

iii) JSON serialization of DMLex

iv) RDF serialization of DMLex

v) Informative Ontolex-Lemon mapping

vi) Informative TEI Lex-0 mapping

Work on the following may start while the above high-priority deliverables are being addressed, or later, depending on the general sense of urgency for them within the lexicographic industry:

vii) Reference architecture

viii) APIs with various bindings

(1)(e) IPR Mode

This TC will operate under the Non-Assertion IPR mode as defined in the OASIS Intellectual Property Rights (IPR) Policy.

(1)(f) Audience

The expected audience for the work of the LEXIDMA TC includes but is not limited to:

* Lexicographers
* Terminologists
* Multilingual content and software architects and strategists, multilingual content publishers
* NLP services architects and developers
* Owners and managers of lexicographic content
* Software providers for lexicography, corpus management, etc. including producers of language technology components
* Technical communicators employing lexicographic tools or linguistic corpora in the process of multilingual publishing of their content
* Translation service providers and freelance translators who need to use lexicographic tools or products in order to deliver their services

(1)(g) Language

English (UK spelling)

Section 2: Additional Information

(2)(a) Identification of Similar Work

Ontolex-Lemon https://www.w3.org/2016/05/ontolex/

TEI Lex-0 https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html

ISO/TC 37/SC 4 Language resource management - Lexical markup framework (LMF) [multipart]

LEXIDMA TC aims to establish informal liaisons with Ontolex-Lemon and TEI Lex-0 communities as well as formal liaisons with ISO/TC 37 and its subcommittees, in particular SC 2, SC 3, SC 4, and SC 5.

ISO fast tracking through TC 37 or one of the TC 37 SCs will be considered.

(2)(b) First TC Meeting

The first TC meeting is planned as a webconference to be held on 16 December 2019 at 16:00 UTC. The GoToMeeting webconferencing facility will be provided by IJS.

(2)(c) Ongoing Meeting Schedule

The TC aims to hold monthly webconferences to be hosted by IJS on their GoToMeeting facility. Meeting frequency will be adjusted when deliverables go for public reviews, OASIS approval, etc. A limited number of face-to-face meetings is likely, for example to resolve public review comments. Any such meetings will be announced well in advance to allow the membership to make travel plans.

(2)(d) TC Proposers

Simon Krek, Jozef Stefan Institute (IJS), simon.krek@ijs.si

Tomaz Erjavec, Jozef Stefan Institute (IJS), tomaz.erjavec@ijs.si

Iztok Kosem, Jozef Stefan Institute (IJS), iztok.kosem@ijs.si

Milos Jakubicek, Lexical Computing, milos.jakubicek@sketchengine.co.uk

Ilan Kernerman, K Dictionaries, ilan@kdictionaries.com

David Filip, Trinity College Dublin (ADAPT), david.filip@adaptcentre.ie

(2)(e) Primary Representatives' Statements of Support

I, Simon Krek (simon.krek@ijs.si), as OASIS primary representative for Jozef Stefan Institute, confirm our support for the proposed LEXIDMA TC charter and endorse our participants listed above.

I, Dave Lewis (dave.lewis@adaptcentre.ie), as OASIS primary representative for Trinity College Dublin (ADAPT), confirm our support for the proposed LEXIDMA TC charter and endorse our participants listed above.

(2)(f) TC Convener

David Filip, Trinity College Dublin (ADAPT), david.filip@adaptcentre.ie

(2)(g) OASIS Member Section

N/A

(2)(h) Anticipated Contributions

ELEXIS Consortium (elex.is) plans to submit their lexicography exchange data model as the initial input for the DMLex deliverable.

(2)(i) FAQ Document

None

(2)(j) Work Product Titles and Acronyms

Data Model for Lexicography (DMLex), Version 1.0
--

/chet
----------------
Chet Ensign
Chief Technical Community Steward
OASIS: Advancing open source & open standards for the information society
http://www.oasis-open.org

Mobile: +1 201-341-1393

