Subject: RE: [xri] I18n and $ tags
-----Original Message-----
From: Dave McAlpin [mailto:dave.mcalpin@epokinc.com]
Sent: Friday, July 11, 2003 3:57 PM
To: xri@lists.oasis-open.org
Subject: [xri] I18n and $ tags

I assume internationalization does not apply to the $ tags. For example, there's no internationalized version of $v. Is this correct? Is this ok?

Dave

*****Drummond replies*****

I think it's not only correct, but also a good thing. There should be no need to internationalize the $ space for the following reason:

IMHO, the purpose of the $ space is to provide a mechanism for extending the very limited set of reserved chars in RFC 2396 (which we've already had to bust out of in order to add support for xrefs and sub-segments). The goal is to have sufficient metadata (and extensibility) to describe identifiers in ways that are vital to the act of identification, i.e., language, font, version syntax, query syntax, resolvability, human-readable comment, etc.

For this reason, I propose that in Appendix B we state a formal requirement that the vocabulary in the $ identifier space (note that I don't call it a namespace, for the reasons I'm about to argue) be as terse as possible, not just to enforce compactness, but to reinforce that it is an extension of the reserved-symbol space and not intended to carry linguistic-level semantics.

For example, the $l (language) space should, as Nat proposed, use the two-letter codes for languages specified in ISO standard 639, referenced in RFC 1766. It should NOT use full-length equivalents.

The proposed $f (font) space for font names would violate this rule if it used full-length English font names. (Furthermore, if we did that, it would beg for internationalization.) To avoid both problems, we should try to find a compact font name abbreviation registry that we can reference, similar to ISO 639 for language abbreviations. If we can't find one, and we don't want to create one (at least I don't want to), there is another solution - one that applies nicely to any $ space.

In place of an exact, rigorously specified vocabulary, every $ space can also cross-reference common names in the + space. Here's an example of how that would work for a font name:

    xri:($l/fr).($f/(+Arial)).french-word-in-Arial-font/foo

Rather than using "($f/Arial)", which would mean "Arial" was formally registered in the "$f" space, the segment "($f/(+Arial))" simply means "Arial" is a common name in the context of a font. I'm not a font expert, but I'd be willing to guess that a large percentage of typographic software would recognize that common name for a font.

Furthermore, the xri above would also tell the XRI parser that the common name "Arial" should be interpreted not just in the context of being a font, but specifically as a French name for a font. That should reduce the chance of misinterpretation even further.

Use of the + space for real-world common names for metadata like fonts means there is an easy way to apply the 80/20 rule, while leaving it open for the $f space to reference a more exhaustive and unambiguous font name abbreviation registry later.

Again, I think this rule should be applied across the board to all $ spaces, including language, font, version syntax, query syntax, etc.

=Drummond
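
To make the distinction between a registered $ value and a + space common name concrete, here is a minimal Python sketch. It assumes the cross-reference syntax is exactly what the example XRI above shows (parenthesized "$tag/value" segments separated by dots); that is a reading of this one example, not the XRI specification. The names split_top_level, parse_metadata and looks_like_two_letter_code are hypothetical helpers invented for illustration.

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in %r" % text)
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced '(' in %r" % text)
    parts.append("".join(current))
    return parts


def parse_metadata(xri):
    """Return (metadata, remainder) for an XRI shaped like the example.

    Assumption: a metadata segment looks like "($tag/value)", and a value
    written as "(+name)" is a common name cross-referenced from the + space.
    """
    if not xri.startswith("xri:"):
        raise ValueError("expected an 'xri:' prefix")
    metadata, rest = [], []
    for segment in split_top_level(xri[len("xri:"):], "."):
        if segment.startswith("($") and segment.endswith(")"):
            tag, _, value = segment[1:-1].partition("/")
            if value.startswith("(+") and value.endswith(")"):
                metadata.append((tag, "common-name", value[2:-1]))
            else:
                metadata.append((tag, "registered", value))
        else:
            rest.append(segment)
    return metadata, ".".join(rest)


def looks_like_two_letter_code(value):
    """Shape check only; it does not consult the actual ISO 639 code list."""
    return len(value) == 2 and value.isalpha() and value.islower()


metadata, remainder = parse_metadata(
    "xri:($l/fr).($f/(+Arial)).french-word-in-Arial-font/foo"
)
print(metadata)   # [('$l', 'registered', 'fr'), ('$f', 'common-name', 'Arial')]
print(remainder)  # french-word-in-Arial-font/foo
print(looks_like_two_letter_code("fr"))  # True
```

The point of the sketch is that a parser can tell the two cases apart purely from syntax: "fr" is taken as a terse, registered value in the $l space, while "(+Arial)" is flagged as a common name that a consumer may or may not recognize.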