OASIS Mailing List Archives

lexidma message



Subject: Re: [lexidma] Module-by-module proposal


Hi everyone,

Here is an updated version of my proposal, with all the feedback from this e-mail thread worked in.

M.
# DMLex
This is a recommended object model (i.e. not a serialization, not a data exchange format) for newly created, born-digital dictionaries (reimagined here as "lexicographic resources") which are simultaneously human-oriented and machine-understandable.


## DMLex Core
The DMLex Core is for monolingual lexical resources, where headwords, definitions, examples etc. are all in one and the same language.

### `LexicographicResource` object type
A data set which can be viewed and used by humans as a dictionary and - simultaneously - ingested, processed and understood by software agents as a machine-readable database. Terminological note: *lexicographic* resource, not *lexical*.

Properties:
- `language` (optional, IETF language code)
	- The language of headwords, definitions, examples.	
- `transcriptionScheme` (optional, reference to some external authority - which?)
	- The scheme (e.g. IPA) in which the `transcription` property of `Pronunciation` objects is given.

Children:
- `Entry` (one or more)

### `Entry` object type
A part of a lexicographic resource which contains information related to exactly one headword.

Child of:
- `LexicographicResource`

Properties:
- `headword` (non-empty string)
	- The headword can be a single word, a multi-word expression, or any expression in the source language which is being described by the entry in the lexicographic resource.
- `homographNumber` (number, optional)

Children:
- `PartOfSpeech` (zero or more)
- `Pronunciation` (zero or more)
- `InflectedForm` (zero or more)
- `Sense` (zero or more)

### `Sense` object type
A part of an entry which groups together information relating to one of the (possibly multiple) meanings (or meaning potentials) of the entry's headword.

Child of:
- `Entry`

Properties:
- `listingOrder`
	- Can be implicit from the serialization.
- `indicator` (optional, non-empty string)
	- A short statement that indicates the meaning of a sense and permits its differentiation from other senses in the entry.
- `definition` (optional, non-empty string)
	- A long statement that describes and/or explains the meaning of a sense.

Children:
- `Usage` (zero or more)
- `Example` (zero or more)

### The difference between entries and senses

An **entry** is a container for "formal" properties of the headword such as orthography, morphology, syntax and pronunciation. A **sense** is a container for those of the headword's properties which are statements about semantics and pragmatics.
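
To make the split concrete, here is a minimal Python sketch of the Core `Entry` and `Sense` types. The proposal is an object model and not a serialization, so the use of dataclasses, the snake_case field names, the flattening of `PartOfSpeech` and `Example` children into plain strings, and the choice to raise `ValueError` on an empty headword are all assumptions of this sketch, not part of DMLex.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    # Statements about semantics and pragmatics live on the sense.
    indicator: Optional[str] = None
    definition: Optional[str] = None
    examples: List[str] = field(default_factory=list)  # simplified: Example.text only


@dataclass
class Entry:
    # "Formal" properties (orthography, morphology, pronunciation) live on the entry.
    headword: str
    homograph_number: Optional[int] = None
    parts_of_speech: List[str] = field(default_factory=list)  # simplified: PartOfSpeech.value only
    senses: List[Sense] = field(default_factory=list)

    def __post_init__(self) -> None:
        # `headword` is specified as a non-empty string.
        if not self.headword:
            raise ValueError("headword must be a non-empty string")


# Illustrative data only.
entry = Entry(
    headword="bank",
    homograph_number=1,
    parts_of_speech=["noun"],
    senses=[
        Sense(indicator="finance", definition="An institution that holds and lends money."),
        Sense(indicator="river", definition="The land alongside a river."),
    ],
)
```

In this sketch the `listingOrder` of each sense is implicit in its position in the `senses` list, which is one way of reading "can be implicit from the serialization".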

### `PartOfSpeech` object type
Any of the word classes to which a lexical item may be assigned, e.g. noun, verb, adjective, etc.

Child of:
- `Entry`

Properties:
- `value` (non-empty string)
	- Can be constrained by the DMLex Controlled Vocabularies Module.

### `Usage` object type
An indication of some restriction on the use of the lexical item. The restriction can be pragmatic (time, region, register), semantic (domain, semantic type) or formal ('no plural').

Child of:
- `Sense`
- `Pronunciation`
- `InflectedForm`

Properties:
- `value` (non-empty string)
	- Can be constrained by the DMLex Controlled Vocabularies Module.
	- Its type (e.g. whether register, temporal, geographic etc.) can be specified by the DMLex Controlled Vocabularies Module.

### `Pronunciation` object type
Information about the pronunciation of its parent.

Child of:
- `Entry`
- `InflectedForm`

Properties (at least one):
- `transcription` (non-empty string)
- `recording` (string: name or URL of a sound file)

Children:
- `Usage` (zero or more)
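
The "at least one" constraint on the properties of `Pronunciation` could be enforced at construction time. The following is a sketch under assumptions: the dataclass representation, the `ValueError`, and the helper `is_valid_pronunciation` are hypothetical and not defined by the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pronunciation:
    transcription: Optional[str] = None
    recording: Optional[str] = None  # name or URL of a sound file

    def __post_init__(self) -> None:
        # "Properties (at least one)": reject an object with neither property set.
        if not self.transcription and not self.recording:
            raise ValueError("Pronunciation needs a transcription, a recording, or both")


def is_valid_pronunciation(transcription: Optional[str], recording: Optional[str]) -> bool:
    """Return True if the arguments would make a valid Pronunciation (hypothetical helper)."""
    try:
        Pronunciation(transcription, recording)
    except ValueError:
        return False
    return True
```

Empty strings are treated like missing values here, since `transcription` is specified as a non-empty string.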

### `InflectedForm` object type
A form of the inflectional paradigm of its parent.

Child of:
- `Entry`

Properties:
- `label` (non-empty string) e.g. 'plural'
	- Can be constrained by the DMLex Controlled Vocabularies Module.
- `value` (non-empty string)

Children:
- `Usage` (zero or more)
- `Pronunciation` (zero or more)
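
One practical use of `InflectedForm` children is to let a lookup find an entry by any of its forms. The sketch below assumes a dataclass representation and a hypothetical helper `searchable_forms`; the `Pronunciation` children are reduced to a list of transcription strings for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InflectedForm:
    label: str  # e.g. "plural"
    value: str  # e.g. "mice"
    transcriptions: List[str] = field(default_factory=list)  # simplified Pronunciation children


@dataclass
class Entry:
    headword: str
    inflected_forms: List[InflectedForm] = field(default_factory=list)


def searchable_forms(entry: Entry) -> List[str]:
    """Headword first, then inflected forms in listing order, without duplicates."""
    forms = [entry.headword]
    for form in entry.inflected_forms:
        if form.value not in forms:
            forms.append(form.value)
    return forms


# Illustrative data only.
mouse = Entry("mouse", [InflectedForm("plural", "mice", ["maɪs"])])
```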

### `Example` object type
An instance of a lexical item's usage in a specific sense.

Child of:
- `Sense`

Properties:
- `text` (non-empty string)


## DMLex Bilingual Module
Extends DMLex Core to support the encoding of bilingual lexicographic resources.

### Extensions to `LexicographicResource` object type
Additional properties:
- `translationLanguage` (optional, IETF language code)
- `transcriptionScheme` (optional, reference to some external authority - which?)
	- The scheme (e.g. IPA) in which the `transcription` property of `TranslationPronunciation` objects is given.

### Extensions to `Sense` object type
Additional children:
- `HeadwordTranslation` (zero or more) 

### `HeadwordTranslation` object type
The translation equivalent of the headword in one of its senses.

Child of:
- `Sense`

Properties:
- `text` (non-empty string)
	- Can be a single word, a multi-word expression, or indeed any expression in the target language.

Children:
- `TranslationPartOfSpeech` (zero or more)
- `TranslationUsage` (zero or more)
- `TranslationPronunciation` (zero or more)
- `TranslationInflectedForm` (zero or more)

### `TranslationPartOfSpeech` object type
Any of the word classes to which the translation may be assigned, e.g. noun, verb, adjective, etc.

Child of:
- `HeadwordTranslation`

Properties:
- `value` (non-empty string)
	- Can be constrained by the DMLex Controlled Vocabularies Module.

### `TranslationUsage` object type
An indication of some restriction on the use of its parent. The restriction can be pragmatic (time, region, register), semantic (domain, semantic type) or formal ('no plural').

Child of:
- `HeadwordTranslation`
- `TranslationPronunciation`
- `TranslationInflectedForm`

Properties:
- `value` (non-empty string)
	- Can be constrained by the DMLex Controlled Vocabularies Module.
	- Its type (e.g. whether register, temporal, geographic etc.) can be specified by the DMLex Controlled Vocabularies Module.

### `TranslationPronunciation` object type
Information about the pronunciation of its parent.

Child of:
- `HeadwordTranslation`
- `TranslationInflectedForm`

Properties (at least one):
- `transcription` (non-empty string)
- `recording` (string: name or URL of a sound file)

Children:
- `TranslationUsage` (zero or more)

### `TranslationInflectedForm` object type
A form of the inflectional paradigm of its parent.

Child of:
- `HeadwordTranslation`

Properties:
- `label` (non-empty string) e.g. 'plural'
	- Can be constrained by the DMLex Controlled Vocabularies Module.
- `value` (non-empty string)

Children:
- `TranslationUsage` (zero or more)
- `TranslationPronunciation` (zero or more)

### Extensions to `Example` object type
Additional children:
- `ExampleTranslation` (zero or more)

### `ExampleTranslation` object type
The translation of an example.

Child of:
- `Example`

Properties:
- `text` (non-empty string)
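
The following sketch shows where the Bilingual Module's objects attach: `HeadwordTranslation` under `Sense`, and `ExampleTranslation` under `Example`. It is a simplified illustration, not a normative mapping: the dataclasses, the snake_case names, and the reduction of `TranslationPartOfSpeech` to plain strings are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExampleTranslation:
    text: str


@dataclass
class Example:
    text: str
    translations: List[ExampleTranslation] = field(default_factory=list)


@dataclass
class HeadwordTranslation:
    text: str
    parts_of_speech: List[str] = field(default_factory=list)  # TranslationPartOfSpeech values


@dataclass
class Sense:
    indicator: str
    headword_translations: List[HeadwordTranslation] = field(default_factory=list)
    examples: List[Example] = field(default_factory=list)


# One English-to-German sense of "bank" (illustrative data only).
sense = Sense(
    indicator="finance",
    headword_translations=[HeadwordTranslation("Bank", ["noun"])],
    examples=[
        Example(
            "I went to the bank.",
            [ExampleTranslation("Ich bin zur Bank gegangen.")],
        )
    ],
)
```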


## DMLex Entry Structuring Module

### `SenseGroup` relation type
Represents the fact that a group of senses (all belonging to the same entry) should be grouped when presented to a human user. Typically, when an entry has a large number of senses, it is a convenience to the human user to group them into a smaller number of groups by some broad criterion, such as by part of speech or by semantic similarity.

Participants:
- `Sense` (two or more)

Properties:
- `indicator` (optional, non-empty string)
	- A short statement that indicates the broad meaning that unites the senses in this group and permits their differentiation from other senses in the entry.

### `Subsense` relation type
Represents the fact that one sense (the subordinate sense) should be treated as a subsense of another sense (the superordinate sense). Both senses belong to the same entry.

Participants:
- the superordinate `Sense` (exactly one)
- the subordinate `Sense` (exactly one)
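
Because `Subsense` is a relation between two senses and not a containment, the senses of an entry can stay in a flat list while the nesting is derived when needed. The sketch below assumes that senses carry string IDs (the proposal does not say how participants are referenced) and handles only one level of nesting; `nest_senses` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

# Each Subsense relation is a (superordinate, subordinate) pair of sense IDs.
# Sense IDs are an assumption of this sketch.
Subsense = Tuple[str, str]


def nest_senses(sense_ids: List[str], relations: List[Subsense]) -> Dict[str, List[str]]:
    """Map each top-level sense ID to the IDs of its direct subsenses."""
    subordinate_ids = {sub for _, sub in relations}
    # A sense is top-level if no relation names it as the subordinate participant.
    tree: Dict[str, List[str]] = {
        sid: [] for sid in sense_ids if sid not in subordinate_ids
    }
    for superordinate, subordinate in relations:
        if superordinate in tree:
            tree[superordinate].append(subordinate)
    return tree


# Illustrative data only.
senses = ["run-1", "run-1a", "run-1b", "run-2"]
relations = [("run-1", "run-1a"), ("run-1", "run-1b")]
nested = nest_senses(senses, relations)
```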

### `Subentry` relation type
Represents the fact that one entry (= the subordinate entry) should be treated as a subentry inside the sense (= the superordinate sense) of another entry.

Participants:
- the superordinate `Sense` (exactly one)
- the subordinate `Entry` (exactly one)

Properties:
- `listingOrder`
	- Can be implicit from the serialization.


## DMLex Crossreferencing Module

### `Variant` relation type
Represents the fact that two or more entries are understood by the lexicographer as variants (for example masculine and feminine counterparts, spelling variants).

Participants:
- `Entry` (two or more)

### `Opposition` relation type
Represents the fact that two senses (typically - but not necessarily - belonging to two different entries) have opposite meanings. This includes antonyms, converses and so on.

Participants:
- `Sense` (exactly two)

### `Similarity` relation type
Represents the fact that two or more senses (typically - but not necessarily - belonging to two different entries) have the same or similar meanings. This includes synonyms, near synonyms, immediate hypernyms/hyponyms and so on.

Participants:
- `Sense` (two or more)

### `Pertainment` relation type
Represents the fact that two or more senses (typically - but not necessarily - belonging to two different entries) are related to each other, in ways other than opposition and similarity.

Participants:
- `Sense` (two or more)
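
The participant counts stated for the four cross-referencing relation types can be checked mechanically. This is a minimal sketch: the `Relation` class, the string IDs for entries and senses, and the `CARDINALITY` table are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Participant-count rules as stated for each relation type.
CARDINALITY: Dict[str, Callable[[int], bool]] = {
    "Variant": lambda n: n >= 2,      # Entry (two or more)
    "Opposition": lambda n: n == 2,   # Sense (exactly two)
    "Similarity": lambda n: n >= 2,   # Sense (two or more)
    "Pertainment": lambda n: n >= 2,  # Sense (two or more)
}


@dataclass(frozen=True)
class Relation:
    type: str
    # Entry IDs for Variant, sense IDs otherwise (IDs are hypothetical).
    participants: Tuple[str, ...]

    def is_valid(self) -> bool:
        """True if the type is known and the participant count is allowed."""
        check = CARDINALITY.get(self.type)
        return check is not None and check(len(self.participants))
```

A fuller check would also verify that `Variant` participants are entries and the others are senses; that is left out here because the sketch does not model object identity.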


## DMLex Inline Markup Module

### `Placeholder` markup type
Marks up a substring inside a headword (or inside a headword translation) which is not part of the expression itself but stands for things that can take its place, or constitutes some kind of meta-notation. Examples:
- `beat [sb.] up`
- `continue [your] studies`

Markup of:
- `headword` property of `Entry`
- `text` property of `HeadwordTranslation`
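
The proposal does not prescribe how inline markup is represented. One possibility, sketched here, is stand-off markup: character offsets into the plain string. The sketch assumes that the square brackets in the examples above are notation for the markup and that the stored headword is the bare string (`beat sb. up`); the `Placeholder` class and `strip_placeholders` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Placeholder:
    start: int  # index of the first marked-up character
    end: int    # index one past the last marked-up character


def strip_placeholders(headword: str, spans: List[Placeholder]) -> str:
    """Return the headword without its placeholder substrings, e.g. for sorting."""
    kept = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        kept.append(headword[cursor:span.start])
        cursor = span.end
    kept.append(headword[cursor:])
    # Collapse the whitespace left behind by the removed substrings.
    return " ".join("".join(kept).split())


headword = "beat sb. up"
spans = [Placeholder(5, 8)]  # marks the substring "sb."
```

With the placeholder removed, the headword can be sorted or searched as `beat up`, while the full string remains available for display.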

### `Headword` markup type
Marks up a substring inside an example (or inside an example translation) which corresponds to the headword (or to a translation of the headword).

Markup of:
- `text` property of `Example`
- `text` property of `ExampleTranslation`
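
`Headword` markup may be added by hand, but a first pass can be computed by searching the example text for the headword and its inflected forms. The sketch below is one such heuristic, not a method from the proposal: it assumes the markup is stored as character offsets, and `headword_spans` is a hypothetical helper that matches whole words case-insensitively.

```python
import re
from typing import List, Tuple


def headword_spans(example: str, forms: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of whole-word matches of any form."""
    if not forms:
        return []
    # Try longer forms first so a short form cannot pre-empt a longer one
    # that begins with the same characters.
    ordered = sorted(forms, key=len, reverse=True)
    alternatives = "|".join(re.escape(form) for form in ordered)
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    return [match.span() for match in pattern.finditer(example)]


# Illustrative data only.
example_text = "The cat chased two mice and one mouse."
found = headword_spans(example_text, ["mouse", "mice"])
```

Simple string matching will miss discontinuous or heavily inflected occurrences, so the result is best treated as a suggestion for the lexicographer to confirm.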


## DMLex Controlled Vocabularies Module
This module makes it possible to describe constraints on the values of certain plain-text properties of objects defined in DMLex Core and in DMLex Bilingual Module.

### Extensions to `LexicographicResource` object type

Additional properties:
- `labelLanguage` (IETF language code)
	- The language of the display values of labels.	

Additional children:
- `PartOfSpeechLabel` (zero or more)
- `TranslationPartOfSpeechLabel` (zero or more)
- `UsageLabel` (zero or more)
- `TranslationUsageLabel` (zero or more)
- `InflectedFormLabel` (zero or more)
- `TranslationInflectedFormLabel` (zero or more)

### `PartOfSpeechLabel` and `TranslationPartOfSpeechLabel` object types
- A `PartOfSpeechLabel` represents one of several allowed values for the `value` property of `PartOfSpeech` objects.
- A `TranslationPartOfSpeechLabel` represents one of several allowed values for the `value` property of `TranslationPartOfSpeech` objects.

Properties:
- `value` (non-empty string)
- `displayValue` (optional)

Children:
- `LabelMapping` (zero or more)

### `UsageLabel` and `TranslationUsageLabel` object types
- A `UsageLabel` represents one of several allowed values for the `value` property of `Usage` objects.
- A `TranslationUsageLabel` represents one of several allowed values for the `value` property of `TranslationUsage` objects.

Properties:
- `type` (one of: `normative`, `register`, `temporal`, `geographic`, `sociocultural`, `domain`, `frequency`, `attitude`)
- `value` (non-empty string)
- `displayValue` (optional)

Children:
- `LabelMapping` (zero or more)
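
The `type` property is what lets software tell, for example, a register label from a temporal one. A minimal sketch, assuming an `Enum` for the eight listed types and a hypothetical lookup helper `usage_type_of`:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class UsageType(Enum):
    NORMATIVE = "normative"
    REGISTER = "register"
    TEMPORAL = "temporal"
    GEOGRAPHIC = "geographic"
    SOCIOCULTURAL = "sociocultural"
    DOMAIN = "domain"
    FREQUENCY = "frequency"
    ATTITUDE = "attitude"


@dataclass
class UsageLabel:
    type: UsageType
    value: str
    display_value: Optional[str] = None


def usage_type_of(value: str, labels: List[UsageLabel]) -> Optional[UsageType]:
    """Resolve the type of a Usage value via the controlled vocabulary, if listed."""
    for label in labels:
        if label.value == value:
            return label.type
    return None


# Illustrative vocabulary only.
usage_labels = [
    UsageLabel(UsageType.REGISTER, "inf", "informal"),
    UsageLabel(UsageType.TEMPORAL, "arch", "archaic"),
]
```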

### `InflectedFormLabel` and `TranslationInflectedFormLabel` object types
- An `InflectedFormLabel` represents one of several allowed values for the `label` property of `InflectedForm` objects.
- A `TranslationInflectedFormLabel` represents one of several allowed values for the `label` property of `TranslationInflectedForm` objects.

Properties:
- `value` (non-empty string)
- `displayValue` (optional)

Children:
- `LabelMapping` (zero or more)

### `LabelMapping` object type
Represents the fact that an item in the controlled vocabulary is equivalent to an item provided by an external authority.

Child of:
- `PartOfSpeechLabel`
- `TranslationPartOfSpeechLabel`
- `UsageLabel`
- `TranslationUsageLabel`
- `InflectedFormLabel`
- `TranslationInflectedFormLabel`

Properties:
- `sameAs` (URI)
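
Taken together, the label object types allow a resource to be checked against its own controlled vocabulary. The sketch below does this for part-of-speech values only; the dataclasses, the helper `invalid_values`, and the `example.org` URI are placeholders invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LabelMapping:
    same_as: str  # URI of the equivalent item at an external authority


@dataclass
class PartOfSpeechLabel:
    value: str
    display_value: Optional[str] = None
    mappings: List[LabelMapping] = field(default_factory=list)


def invalid_values(used: List[str], allowed: List[PartOfSpeechLabel]) -> List[str]:
    """Return the used values that are not in the controlled vocabulary."""
    allowed_values = {label.value for label in allowed}
    return [value for value in used if value not in allowed_values]


# Illustrative vocabulary; the URI is a placeholder, not a real authority.
pos_labels = [
    PartOfSpeechLabel("n", "noun", [LabelMapping("https://example.org/pos#noun")]),
    PartOfSpeechLabel("v", "verb"),
]
```

The same check would apply to `Usage` values and to the `label` property of `InflectedForm`, each against its own label type.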


Attachment: modules.pdf
Description: Adobe PDF document


