Subject: Re: [lexidma] follow up on two-level senses
(2) there is very limited agreement on this similarity
Actually, I don't think it is merely similarity that is being encoded. These sense groupings are mostly based on ideas of systematic polysemy (as introduced by authors such as Pustejovsky [1] and Buitelaar [2]) and of complementary and contrastive senses (as described by Weinreich [3]). These are real linguistic phenomena and still motivate modern electronic lexicographic efforts [4].
(3) there are many possible ways in which this similarity can be defined and seen; allowing this means being closer to how language/word senses work
(4) the fact that it was encoded in a hierarchical way that only allows one-dimensional structure merely comes from the limits of a printed dictionary
I am not sure I agree with this, partly for the reasons stated above, but also because users do not want to use an electronic dictionary as some free-form graph structure. This is something I have learnt from WordNet: presenting the data as a flat text structure (e.g., https://en-word.net/) is more effective than presenting it as a graph diagram. As such, I think hierarchical groupings are still very useful in both the presentation and the production of dictionary content.
(5) this alternative solution therefore enables all this, and much more, if needed, without introducing additional complexity.
I think that the labels generally could use a notation similar to the one David mentioned for PoS tagging, with a prefix denoting the type of label, e.g. "sensegroup:1" or "sensegroup:etymology1", but that is to be discussed.
From a technical point of view, there are also disadvantages to this. You are still encoding hierarchical senses, but now in a way that is harder to work with in XPath and many other technologies, which in turn makes it harder for data creators to verify consistency. I would suggest that this is implemented as an optional sense grouping tag, e.g.,
<senseGrp>
  <sense id="..."><defn></defn></sense>
  <sense id="..."><defn></defn></sense>
</senseGrp>
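To make the XPath point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is only an illustration under stated assumptions: the <entry> wrapper, the <label> child element used to carry the "sensegroup:N" strings, and the sample definitions for 'bank' are all hypothetical and not part of any agreed model. With nested <senseGrp> elements the grouping falls out of a single path expression, whereas with labels the consumer has to parse label strings and rebuild the groups.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry using the nested grouping element suggested above.
NESTED = """
<entry>
  <senseGrp>
    <sense id="s1"><defn>financial institution</defn></sense>
    <sense id="s2"><defn>building of such an institution</defn></sense>
  </senseGrp>
  <senseGrp>
    <sense id="s3"><defn>sloping land beside a river</defn></sense>
  </senseGrp>
</entry>
"""

# The same data with flat senses and prefixed labels. The <label> element
# is an assumption made only for this illustration.
FLAT = """
<entry>
  <sense id="s1"><label>sensegroup:1</label><defn>financial institution</defn></sense>
  <sense id="s2"><label>sensegroup:1</label><defn>building of such an institution</defn></sense>
  <sense id="s3"><label>sensegroup:2</label><defn>sloping land beside a river</defn></sense>
</entry>
"""


def groups_from_nesting(root):
    """Read the groups directly from the element structure."""
    return [
        [sense.get("id") for sense in group.findall("sense")]
        for group in root.findall("senseGrp")
    ]


def groups_from_labels(root):
    """Rebuild the groups by parsing the 'sensegroup:' label strings."""
    groups = {}
    for sense in root.findall("sense"):
        for label in sense.findall("label"):
            text = (label.text or "").strip()
            if text.startswith("sensegroup:"):
                key = text.split(":", 1)[1]
                groups.setdefault(key, []).append(sense.get("id"))
    # Dicts keep insertion order, so groups come out in document order.
    return list(groups.values())


nested_groups = groups_from_nesting(ET.fromstring(NESTED))
label_groups = groups_from_labels(ET.fromstring(FLAT))
print(nested_groups)
print(label_groups)
```

Both encodings carry the same information here, but in the labelled version a typo in a label string would silently create a new group, and a schema could not catch it without custom validation logic.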
Also, I would note that this discussion is really only about grouping senses. Grouping entries is more questionable, but it is often motivated by linguistic phenomena such as derivation, etymologically distinct forms of the same word (e.g., 'bank' can first be grouped into subentries based on its Germanic/Italian/French etymologies), or morphologically distinct forms (e.g., the unique dative singular found in the seventh sense here). We should at least consider these requirements on the representation and have a plan to represent them in the model.