relax-ng message

Subject: [relax-ng] Algebraic characterization of RELAX NG patterns

From: Murata Makoto <mura034@attglobal.net>
To: relax-ng@lists.oasis-open.org
Date: Sun, 21 Oct 2001 22:55:26 +0900

Algebraic characterization of RELAX NG patterns

2001 10/21
Murata Makoto
EB2M-MRT@asahi-net.or.jp

1. Introduction

I have argued that infinite-nameclass <attribute> should be allowed only when 
it is a descendant of <oneOrMore>.  With this restriction, we have a nice 
characterization theorem of RELAX NG patterns.

2. Regular languages and syntactic monoids

We first consider regular languages.  It is well known that regular 
expressions and finite automata exactly capture regular languages.  
Meanwhile, another interesting characterization is given by syntactic 
monoids.

In preparation, let $\Sigma$ be a finite set.  The set of strings over 
$\Sigma$ is denoted by $\Sigma^*$.

Theorem 1: 

A subset $L$ of $\Sigma^*$ is regular if and only if 
$L$ is the inverse image of $X$ under $\phi$, where 


   - $N$ is a finite monoid; that is,

        - $N$ is a finite set,

        - $N$ has a binary operation $\circ$,

        - $\circ$ has an identity $0$ (i.e., 
          $0 \circ n =  n \circ 0 = n$ for every $n$ 
          in $N$), and 

        - $\circ$ is associative; that is, 
          $(s_1 \circ s_2) \circ s_3 = s_1 \circ (s_2 \circ s_3)$,
        
    - $\phi$ is a homomorphism from $\Sigma^*$ to $N$; that is, 
      $\phi(u v) = \phi(u) \circ \phi(v)$ for every $u, v$ in $\Sigma^*$, 
      and

    - $X$ is a subset of $N$.


3. Generalization for RELAX NG patterns

For sake of simplicity, we ignore <text>, <list>, <data>, <value>.  We
also assume that any pattern is allowed as the child of <start>.

In prepartion, let $\Delta$ be a finite set of disjoint name classes
for attributes, and let $\Sigma$ be a finite set of disjoint name
classes for elements.

We can nicely generalize Theorem 1 as below.

Theorem 2: 

A subset $L$ of $(2 ^ \Delta) \times \Sigma ^*$ is described by a
RELAX NG pattern over $\Delta$ and $\Sigma$ if and only if $L$ is the
inverse image of $X$ under $\xi \oplus \phi$, where

    - $M$ is a finite set such that 

        - $M$ has a binary operation $+$,

        - $+$ has an identity $0$ (i.e., 
          $0 + m =  m + 0 = m$ for every $m$ 
          in $M$), 

        - $+$ is associative; that is, 
          $(m_1 + m_2) + m_3 = m_1 + (m_2 + m_3)$, 
          and

        - $+$ is commutative; that is, 
          $m_1 + m_2 = m_2 + m_1$, and

        - $+$ is idempotent; that is, $m_1 + m_1 = m_1$
        
    - $\xi$ is a homomorphism from $2^\Delta$ to $M$; that is, 
      $\xi(\Delta_1 \cup \Delta_2) = \xi(\Delta_1) + \xi(\Delta_2)$
      for every subset $\Delta_1, \Delta_2$ of $\Sigma^*$

    - $N$ is a finite monoid; that is,

        - $N$ is a finite set,

        - $N$ has a binary operation $\circ$,

        - $\circ$ has an identity $1$ (i.e., 
          $1 \circ n =  n \circ 1 = n$ for every $n$ 
          in $N$), and 

        - $\circ$ is associative; that is, 
          $(n_1 \circ n_2) \circ n_3 = n_1 \circ (n_2 \circ n_3)$.
        
    - $\phi$ is a homomorphism from $\Sigma^*$ to $N$, that is, 
      $\phi(u v) = \phi(u) \circ \phi(v)$ for every $u, v$ in $\Sigma^*$

    - $\xi \oplus \phi$ is a mapping from $2^\Delta \times \Sigma^*$
      to $M \times N$ defined as $\xi \oplus \phi(\Delta_1, u) =
      (\xi(\Delta_1), \phi(, u))$

    - $X$ is a subset of $M \times N$.


Note: Observe that idempotency does not hold, if we allow
infinite-nameclass <attribute> without an ascending <oneOrMore>, 

----
Murata Makoto EB2M-MRT@asahi-net.or.jp

Follow-Ups:
- [relax-ng] Re: Algebraic characterization of RELAX NG patterns
  - From: Kohsuke KAWAGUCHI <kohsuke.kawaguchi@sun.com>
- Re: [relax-ng] Algebraic characterization of RELAX NG patterns
  - From: James Clark <jjc@jclark.com>