Subject: RE: HumanML
This might help. It is part of an email from the Phase 0 work when we talked
about this. Please note that these are not official designs, but "doodles"
done for the purpose of discussing the use of stored expressions that could
be based on HumanML gesture types. We developed it from an actual exchange
between two members of the list offline, when one noted that the greeting
sent was not appropriate for the time of day at the receiver's location.
This points up several issues with such communication.
*************************************************
We have to assume, if we discuss human communications, that clarification of
intent is one reason to mark up information (that is, why XML instead of just
HTML). We have discussed that we can break down four or five *useful*
categories that might be amenable to markup given some particular
communication. We have used the category of gestures and said that these may
be verbal or non-verbal, and for the moment, that is convenient. We have said
that other kinds of information are conveyed that might be useful when
interpreting a communication with regard to intent, and that these provide a
multi-modal framework for a communication. If we stay with these simple
categories for now, what can we do?

For example, as a thought experiment, let me propose a basic phrase
dictionary that:

1. Provides common categories of phrase types.

2. Provides sufficient information to know when to use any instance from a
   category. No AI per se, just basic categories.

3. Has attributed categories. The language in which they are used is
   attributable (via namespaces).

4. Can, because of the namespace categorization, be infinitely extended, as
   long as the facts added do not introduce a contradiction, or, where such
   a contradiction is added, the contradiction is attributed as well.

Here is a rudimentary beginning of such a knowledge base, used in an element
that could be added to my genre language given earlier. The action element
derives from the local language (Genre); the gesture types come from HumanML;
xml:lang is the standard XML language attribute.

*******************************************************************

<action humanML:gesture="greeting" />

   xml:lang="ENGLISH"     xml:lang="HINDI"
   Good morning           Suprabhaat
   Good night             Shubhratri
   Hello!                 Namaste
<action humanML:gesture="polite" />

   xml:lang="ENGLISH"     xml:lang="HINDI"
   Excuse me              Shama Kijie
   How are you?           Aap Kaise hain?
   Thank you              (Dhanyawad | Shukriya)

<action humanML:gesture="intimate" />

   xml:lang="ENGLISH"     xml:lang="HINDI"
   I love you             Main Tumse Pyar karta hoon

***********************************************************************

We would take these and create RDF triples for the HumanML attributes as a
class/subclass (say, gesture/greeting), then create instances that have
phrases in verbal subclasses. These go into a knowledge base that any
application can use by query and transformation. In any case, our HumanML
schema has been used to create a database (objects or tables, I don't care).
I'm not comfortable with RDF yet and have to get back to work, so for
argument's sake, assume some HumanMarkup UML class descriptor realizable in
an RDF triple has been stored in a database we can query. On output, it can
be transformed into the sample genre language. Thus, the genre language
(which is then transformed into, say, X3D or SVG) is:

<genre xmlns="http://mylang/genre"
       xmlns:humanML="http://humanMarkup.org/HumanML/">

   <resourcePool>
      <hand id="IndiaGreeting01"
            culture="India(northern)"
            renderAs="http://mylang/X3DProtos" />
   </resourcePool>

   <action humanML:gesture="greeting" xml:lang="Hindi">
      <par>
         <humanML:idiom>Namaste!</humanML:idiom>
         <hand gestureRef="IndiaGreeting01" />
      </par>
   </action>
</genre>

Or, to make this dynamic, we could build the whole thing by a call to the
knowledge base:

<genre:action humanML:gesture="greeting">
   <genre:script><![CDATA[
      getGreeting(getCulture(india, northern), getTime(),
                  "http://mylang/X3DProtos")
   ]]></genre:script>
</genre:action>

Either way, it depends on how you want to use HumanML: in annotation, as a
knowledge base schema, or as direct markup for an interpreter. To enable the
avatar or human to use these, we could add more categorical information,
such as:

o culture -- Hindi is a language; India is a country; India is composed of
  several sub-cultures. Lots of attributes are possible depending on the
  gesture. A greeting is fairly innocuous, but an intimate gesture gets very
  different results in different cultures. It isn't impossible to generalize
  these, and really, that is the best we can do. It points out the need for
  HumanML to be extensible.

o proxemic data -- we may add that as an attribute or let the database infer
  it from the culture, but the right distance from the other person to say
  this may be important.

o rendering -- this is really not a HumanML problem. It is a channel problem.
  That is, a wav file and a VoiceXML text node have different properties.
  Depending on the one you use, the avatar lip information varies in
  precision and you need a different index deformation. How much metadata
  the instance needs depends on the context of use, and HumanML doesn't
  really care about that. It cares that the use in the context is clear.

So there is nothing impossible here; perhaps tedious, which again is why we
need an extensible set of categories.

***************************************************************
Len