
xliff message


Subject: Fw: Please find a pdf attached.

FYI. I received this by way of the Unicode Consortium recently. Feel free to agree or disagree with my response.

I do have a question on something he mentioned. I have NOT seen any production usage of pictograph (or emoji) symbols coming through our translation pipeline at all. In the context of text to speech for the purpose of translation/localization, what is the common practice? It is not an XLIFF question per se...

----- Forwarded by Helena S Chapman/San Jose/IBM on 04/30/2014 12:07 PM -----

From:        Helena S Chapman/San Jose/IBM
To:        William_J_G Overington <wjgo_10009@btinternet.com>
Cc:        "uli@unicode.org" <uli@unicode.org>
Date:        04/30/2014 12:06 PM
Subject:        Re: Please find a pdf attached.

William. If you look at XLIFF in detail, you might find that although it comes with lots of bells and whistles, the core module itself is not as overwhelming as you might expect. People always have the option to pick and choose additional extensions, including namespaced custom extensions.

I'd also like to learn what real-life use cases exist today for readouts.dat outside of XLIFF. You mentioned that a transform from readouts.dat to XLIFF can happen. I agree, and that's the beauty of XLIFF: it is a universal format that well-informed localization enthusiasts use for interchange purposes. Some tooling might even use it as an internal processing format today, though that was not the original intent. So, unless there is a very good business reason for something like readouts.dat to exist, my recommendation at this point is to reserve it for organization-proprietary processing purposes only.

Thank you.

Best regards,

Helena Shih Chapman
Globalization Technologies and Architecture
+1-720-396-6323 or T/L 938-6323
Waltham, Massachusetts

From:        William_J_G Overington <wjgo_10009@btinternet.com>
To:        Helena S Chapman/San Jose/IBM@IBMUS
Cc:        "uli@unicode.org" <uli@unicode.org>
Date:        04/30/2014 11:57 AM
Subject:        Re: Please find a pdf attached.

Hi Helena

Thank you for your email.

> I'd like to get a better understanding of what problems you are attempting to solve that is not already solved by open standards such as OASIS XLIFF?

Well, until I read your email I knew nothing at all about XLIFF.

I found the following.




I have not studied it all yet.

From looking at the example in the http://en.wikipedia.org/wiki/XLIFF web page, it is possible that the answer to your question is that the readouts.dat format will not solve any problem that could not also be solved using XLIFF.

However, the readouts.dat format is very lightweight and could perhaps be useful in specific situations, such as, for example, if someone who is able to translate into Latvian from English wants to copy a file of emoji to English read-out labels and localize it and produce a file of emoji to Latvian read-out labels using just WordPad on an ordinary PC.

Would that be useful if, say, a manufacturer of a text-to-speech system included the ability to read in and apply a readouts.dat file, thereby giving an end user the opportunity to customize the system using a readouts.dat file that he or she had prepared using just WordPad and no special software tools?

It appears that in mainstream industrial use, where XLIFF is already in use and all of the knowledge, skills and facilities needed to use it are available, there may well be no use for the readouts.dat format.

Yet perhaps the readouts.dat format might find use as an easy-to-learn, easy-to-use format with which an end user can customize text-to-speech software packages for a particular language, particularly when that language is not one specifically supported by the manufacturer of the software package.

Maybe there could be a software tool that reads in a readouts.dat file and produces an XLIFF file that contains all of the localization information that is in the readouts.dat file.
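
As a rough illustration of what such a tool might look like, here is a minimal Python sketch. The layout of readouts.dat is not described in this thread, so the sketch assumes a purely hypothetical one: one entry per line, consisting of the symbol, a tab, and its read-out label, with lines starting with `#` treated as comments. The function name `readouts_to_xliff`, the use of code points as unit ids, and the choice to carry the symbol in a `<note>` element are likewise assumptions made for illustration; the output follows the general shape of an XLIFF 1.2 file.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 documents.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def readouts_to_xliff(text, source_lang="en", target_lang="lv"):
    """Convert readouts text into an XLIFF 1.2 string.

    ASSUMPTION: each non-comment line is "<symbol><TAB><label>".
    This layout is hypothetical; the real readouts.dat format may differ.
    """
    ET.register_namespace("", XLIFF_NS)  # serialize with a default xmlns
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": "readouts.dat",
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")

    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        symbol, sep, label = line.partition("\t")
        if not sep or not symbol or not label:
            continue  # skip lines that do not match the assumed layout
        # Use the symbol's code points as a stable unit id, e.g. "U+1F600".
        unit_id = "-".join(f"U+{ord(ch):04X}" for ch in symbol)
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": unit_id})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = label
        # Keep the symbol itself alongside the label for the translator.
        ET.SubElement(unit, f"{{{XLIFF_NS}}}note").text = symbol

    return ET.tostring(root, encoding="unicode")
```

A translator's tool could then add a `<target>` element to each `trans-unit`, for example with the Latvian label, and the result could be converted back to the lightweight format if desired.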

When I saw the original HTML release of the text that is now in a pdf document in the Unicode Technical Committee Document Register as document L2/14-093, in particular the text that is now the first complete paragraph on page 4 of the L2/14-093 document, I thought that the sentence.dat format that I had produced for my research in communication through the language barrier could be adapted to produce an easy to use format to localize emoji to text: the result is the readouts.dat format.

I am attaching two publications from my research as that background might be of interest.

Also, a publication about a simplified spin-off variant that could perhaps be useful in transliteration.

I fully accept that the readouts.dat system is not as comprehensive as XLIFF, yet I wonder whether the readouts.dat format might nonetheless be useful in some circumstances: for example, if someone is developing a text-to-speech system and needs a way for an end user to supply a customized localization file giving the text equivalent of each of a number of pictograph characters and symbols.

I would be interested to know your comments if possible please.

Yours sincerely

William Overington

30 April 2014
[attachment "Communication_through_the_language_barrier_in_some_particular_circumstances_by_means_of_encoded_localizable_sentences.pdf" deleted by Helena S Chapman/San Jose/IBM]
[attachment "The_format_of_the_sentence.dat_file_used_for_automated_localization_of_encoded_localizable_sentences.pdf" deleted by Helena S Chapman/San Jose/IBM]
[attachment "The_format_of_the_translit.dat_file_suggested_for_possible_use_for_transliteration.pdf" deleted by Helena S Chapman/San Jose/IBM]
