
wsn message



Subject: Analysis of filtering


I believe the general question is to determine what, if anything, we can say about constructs like

<filter>
    <selector> ... </selector>
    <topic> ... </topic>
    <selector> ... </selector>
    <sggFilter> ... </sggFilter>
    <dmhFilter> ... </dmhFilter>
    <topic> ... </topic>
    <precondition> ... <timeLimit time="now"/> ... </precondition>
    <sideOrderOfGrits> ... </sideOrderOfGrits>
</filter>

Currently we say that the order and timing of evaluation are up to the NP.  This leaves the NP free to do whatever works best, but it's only safe from an interop point of view if this freedom can't make an externally visible difference.  That is, if two NPs differing only in their evaluation strategies (say, V1.x and V2.x of the same NP) could produce different results under otherwise identical circumstances, we would have an interop issue.

Currently, everything we've defined to go into <filter> just subsets the universe of possible notifications, and we believe everything commutes with everything else.  Further, we implicitly intersect (i.e., AND) all the children of the <filter> to get the effective filter.
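To make the commutativity claim concrete, here is a minimal Python sketch.  It assumes each existing filter can be modelled as a stateless predicate on a single message; the message fields and the three predicates are hypothetical stand-ins, not anything defined by the spec.

```python
from itertools import permutations

# Hypothetical notifications; the field names are illustrative only.
universe = [
    {"topic": "fooTopic", "body": "foo 1"},
    {"topic": "fooTopic", "body": "bar 2"},
    {"topic": "barTopic", "body": "foo 3"},
    {"topic": "fooTopic", "body": "foo 4"},
]

# Each filter is a stateless predicate on one message (an assumption).
predicates = [
    lambda m: m["topic"] == "fooTopic",  # stands in for <topic>
    lambda m: "foo" in m["body"],        # stands in for <selector>
    lambda m: True,                      # stands in for a <precondition> that holds
]

def apply_in_order(preds, messages):
    """Apply the predicates one after another, in the order given."""
    for pred in preds:
        messages = [m for m in messages if pred(m)]
    return messages

# Try every evaluation order; with pure predicates they all agree.
results = [apply_in_order(order, universe) for order in permutations(predicates)]
all_orders_agree = all(r == results[0] for r in results)
surviving_bodies = [m["body"] for m in results[0]]  # ["foo 1", "foo 4"]
```

Because each predicate looks at one message in isolation, applying them in sequence is just intersecting subsets, and intersection does not care about order.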

However, we also allow arbitrary open content (sggFilter, dmhFilter, sideOrderOfGrits, etc.).  To preserve indifference to order of evaluation and the implicit ANDing, we need to require filtering extensions to commute with the existing filters and with each other and with all future filters, known and unknown.

This seems difficult.

For example, suppose dmhFilter has a "suppress all but every Nth message" option.  This seems like a perfectly reasonable, if arbitrary, filter on its own.  But it won't commute with anything.  E.g., suppose the only children of <filter> are
  • selector, message contains foo
  • dmhFilter, every other message
and the stream of notifications is
  • foo 1; bar 2; foo 3; bar 4; foo 5; bar 6; foo 7; bar 8.
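Then filtering by selector first (which passes foo 1, foo 3, foo 5 and foo 7, of which dmhFilter keeps every other one) gives
  • foo 1; foo 5
while filtering by dmhFilter first (which passes the odd-numbered messages, all of which happen to contain foo) gives
  • foo 1; foo 3; foo 5; foo 7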
In short, it seems like order matters, even in otherwise nicely-behaved completely functional (i.e., side-effect free) setups.
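The example above can be checked with a short Python sketch.  The function names here are hypothetical, and "every other message" is assumed to mean "keep the 1st, 3rd, 5th, ... message of whatever sequence arrives".

```python
def selector(substring, messages):
    """Keep only the messages that contain the given substring."""
    return [m for m in messages if substring in m]

def every_nth(n, messages):
    """Keep the 1st, (n+1)th, (2n+1)th, ... message of the input sequence."""
    return messages[::n]

stream = ["foo 1", "bar 2", "foo 3", "bar 4",
          "foo 5", "bar 6", "foo 7", "bar 8"]

# selector first, then dmhFilter
selector_first = every_nth(2, selector("foo", stream))
# -> ["foo 1", "foo 5"]

# dmhFilter first, then selector
dmh_first = selector("foo", every_nth(2, stream))
# -> ["foo 1", "foo 3", "foo 5", "foo 7"]
```

Both functions are side-effect free, yet the two orders disagree, because every_nth depends on each message's position in its input rather than on the message alone.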

IMHO, the problem stems from trying to avoid nested structure and instead use an implicit evaluation rule, in this case "AND together in any order".  If, instead, precondition, topic and selector acted like functions, we could compose them together under the usual rules.  E.g., topic, selector and precondition each take a sequence of messages to a sequence of messages, and when no inner filter is given, the implicit argument is "all possible notifications from this producer".

Then, as it happens, these are all equivalent:
  • selector("contains Foo", topic("fooTopic", precondition("It's Tuesday")))
  • topic("fooTopic", selector("contains Foo", precondition("It's Tuesday")))
  • precondition("It's Tuesday", selector("contains Foo", topic("fooTopic")))
  • etc.
but if I define dmhFilter, then these two are different
  • dmhFilter(2, selector("contains Foo")) (gives foo 1; foo 5)
  • selector("contains Foo", dmhFilter(2)) (gives foo 1; foo 3; foo 5; foo 7)
There's no ambiguity because we say exactly what we want.  On the wire, the first of these would be represented as

<filter>
    <dmhFilter modulus="2">
       <selector>contains Foo</selector>
    </dmhFilter>
</filter>

or the same thing inside out.  The NP has to walk the tree and figure out what's going on, but it has to do that anyway.  None of the forms we define is hard to evaluate from a language design point of view, and if someone defines something Turing-complete or whatever, it's up to them to get people to support it.
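As a rough illustration of what "walking the tree" might look like, here is a minimal Python sketch using the standard library's xml.etree.ElementTree.  Several things in it are assumptions made only for the example: each element wraps at most one nested filter, the selector language is a toy case-insensitive "contains X", the modulus attribute means "keep every Nth message starting with the first", and the inside-out form is written with the selector's text followed by a nested dmhFilter element.

```python
import xml.etree.ElementTree as ET

def evaluate(node, universe):
    """Evaluate a filter element inside out: the nested child (if any)
    produces the input sequence for its parent."""
    children = list(node)
    if len(children) > 1:
        raise ValueError("this sketch expects at most one nested filter")
    # With no nested filter, the implicit argument is every notification.
    messages = evaluate(children[0], universe) if children else list(universe)

    if node.tag == "filter":
        return messages
    if node.tag == "selector":
        # Toy expression language: "contains X", matched case-insensitively.
        needle = (node.text or "").strip().lower().replace("contains ", "", 1)
        return [m for m in messages if needle in m.lower()]
    if node.tag == "dmhFilter":
        return messages[:: int(node.get("modulus"))]
    raise ValueError(f"unsupported filter: {node.tag}")

stream = ["foo 1", "bar 2", "foo 3", "bar 4",
          "foo 5", "bar 6", "foo 7", "bar 8"]

dmh_outside = ET.fromstring("""
<filter>
    <dmhFilter modulus="2">
       <selector>contains Foo</selector>
    </dmhFilter>
</filter>
""")

selector_outside = ET.fromstring("""
<filter>
    <selector>contains Foo<dmhFilter modulus="2"/></selector>
</filter>
""")

dmh_outside_result = evaluate(dmh_outside, stream)
# -> ["foo 1", "foo 5"]
selector_outside_result = evaluate(selector_outside, stream)
# -> ["foo 1", "foo 3", "foo 5", "foo 7"]
```

The nesting alone fixes the evaluation order, so two NPs that follow the tree cannot disagree on the result, whatever internal strategy they use to get there.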


