
dita-comment message


Subject: Public Comment

Comment from: deborah_pickett@moldflow.com

Name: Deborah Pickett
Organization: Moldflow
Regarding Specification: Filtering in DITA

Recommendation for better usability of DITA filters.

The current filtering specification in DITA is adequate for a subset of
filtering use cases, but it is awkward to use in more sophisticated
situations where there is no fixed, known set of filtering property
values, or where properties naturally fall into a hierarchy.

The specification is not clear about what to do if a ditaval file does not
say whether to include, exclude or flag a particular property value.

I propose the following three additions to the specification.  The three
proposals are orthogonal; none depends on another being accepted.

1. It is an error for a filtering property value not to be mentioned by
at least one rule in the ditaval file.  This error should either produce
a warning message or halt processing, at the processor's discretion.

Rationale: This prevents conditional text from being accidentally included
or excluded, a mistake that is potentially very costly.  The re-user is
forced to make a decision about the property and amend the ditaval file
accordingly.

(Example: If I say <ph product="foo"> but no line in the ditaval file
matches product "foo" then it is an error.)
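A minimal sketch of the check in proposal 1, assuming literal (DITA 1.0 style) value matching. The function name and the sample data are hypothetical, and a real processor would collect the values from the content and the ditaval file itself:

```python
def find_unmatched(values, rule_values):
    """Return the property values that no ditaval rule mentions."""
    return [v for v in values if v not in rule_values]


# Hypothetical data: values used in the content vs. values named in the ditaval file.
content_values = ["foo", "bar"]
ditaval_values = {"bar", "baz"}

unmatched = find_unmatched(content_values, ditaval_values)
if unmatched:
    # A processor could raise an exception here instead, to halt processing.
    print("warning: no ditaval rule for product value(s):", ", ".join(unmatched))
```

Whether to warn or halt is left to the processor, as the proposal states; the sketch only warns.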

2. Values in the ditaval file are regular expressions, not literal strings.
Each regular expression is tested against the space-delimited values
in the DITA element, and if it matches a value the include/exclude/flag
action is performed.  Regular expressions are assumed to be anchored at start
and finish (i.e., ^...$).  The regular expression dialect is Java's.

Rationale: This allows re-users to say "and exclude anything I didn't
mention" for a property.  By judicious use of delimiters and match patterns
it is possible to create multi-level property-value hierarchies.

By allowing arbitrary regular expressions, re-users are able to invent
whatever hierarchy they find suits their needs.  This requires a little more
work in writing the ditaval file but authoring tools might be able to offer

(Example: I create the following property values for platform:
  "os/unix" "os/windows" "os/windows/64bit" "os/windows/32bit"
  "arch/intel" "arch/powerpc" "arch/arm"
I can produce generic Windows-on-Intel documentation by filtering, using
[^/]+ to mean "any non-slash-containing sequence":
  os/[^/]+: exclude
  os/windows: include
  os/windows/[^/]+: flag
  arch/intel: include
  arch/[^/]+: exclude
In this example, 32-bit and 64-bit specifics are flagged.)
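The matching step in proposal 2 can be sketched as follows. This uses Python's `re` module as a stand-in for the Java dialect (the patterns in this example behave the same in both), and `re.fullmatch` to get the implicit ^...$ anchoring. The names `RULES` and `matching_rules` are hypothetical:

```python
import re

# Rules from the example above, as (pattern, action) pairs.
RULES = [
    ("os/[^/]+", "exclude"),
    ("os/windows", "include"),
    ("os/windows/[^/]+", "flag"),
    ("arch/intel", "include"),
    ("arch/[^/]+", "exclude"),
]


def matching_rules(attribute, rules=RULES):
    """Map each space-delimited value to the (pattern, action) rules matching it."""
    result = {}
    for value in attribute.split():
        # fullmatch succeeds only if the whole value matches (anchored at both ends).
        result[value] = [(p, a) for p, a in rules if re.fullmatch(p, value)]
    return result


for value, hits in matching_rules("os/windows/64bit os/windows arch/arm").items():
    print(value, "->", hits)
```

Note that "os/windows" matches both an exclude rule and an include rule. This proposal alone does not say which one wins; that is what proposal 3 addresses.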

3. Each include/exclude/flag line in the ditaval file can optionally specify
a priority for that inclusion or exclusion.  The priority is a floating-point
value.  If multiple lines match a property value, the line with the
numerically highest priority takes precedence.  Two matching lines with the
same priority are an error only if their actions differ (for example, one is
"exclude" and the other is "include"; likewise for combinations with
"flag").  Default priorities for
lines are:
  exclude, no metacharacters in the regular expression: 0.0
  include, no metacharacters in the regular expression: 1.0
  flag, no metacharacters in the regular expression: 2.0
  exclude, metacharacters in the regular expression: -1.0
  include, metacharacters in the regular expression: -0.5
  flag, metacharacters in the regular expression: 1.5
Metacharacters are unescaped [ ] { } . ( ) \ | + ? * ^ $ &
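A sketch of how the default priority could be computed from the table above. One detail is an assumption on my part, not something the proposal states: a backslash followed by a metacharacter (such as `\.`) is treated as an escaped literal, while any other backslash (such as `\d`) counts as a metacharacter. All names here are hypothetical:

```python
METACHARACTERS = set("[]{}.()\\|+?*^$&")

# Default priorities from the table above, keyed by (action, has_metacharacters).
DEFAULT_PRIORITY = {
    ("exclude", False): 0.0,
    ("include", False): 1.0,
    ("flag", False): 2.0,
    ("exclude", True): -1.0,
    ("include", True): -0.5,
    ("flag", True): 1.5,
}


def has_metacharacters(pattern):
    """True if the pattern contains an unescaped metacharacter."""
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # Assumption: backslash + metacharacter is an escaped literal.
            if i + 1 < len(pattern) and pattern[i + 1] in METACHARACTERS:
                i += 2
                continue
            return True  # e.g. \d, or a trailing backslash
        if ch in METACHARACTERS:
            return True
        i += 1
    return False


def default_priority(pattern, action):
    """Look up the default priority for a ditaval line."""
    return DEFAULT_PRIORITY[(action, has_metacharacters(pattern))]
```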

Rationale: This codifies the spec's default "exclude, then include" rule
but allows it to be overridden in situations where it is needed.  By
demoting patterns with metacharacters, it is possible to specify typical
use-cases without needing to explicitly state priorities.

This proposal is inspired by default rule priorities in XSLT.

(Example: To include all audiences except "manager", whose content may
contain sensitive internal information, and to flag anything that is marked for
the "technical" hierarchy (technical/developer, technical/administrator),
I could write:
  .+: include                  [ default priority -0.5 ]
  manager: exclude             [ default priority 0.0 ]
  technical/.*: flag           [ default priority 1.5 ]
  technical/manager: exclude, priority = 2.0
The last line ensures that the "technical manager" audience, whatever that
is, is excluded no matter what.)
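The resolution rule in proposal 3 can be sketched like this, using the four lines of the example with their priorities written out explicitly. Python's `re` again stands in for the Java dialect. The function name `resolve`, and returning `None` when nothing matches, are choices made for this sketch:

```python
import re

# (pattern, action, priority) from the example above; the first three
# priorities are the stated defaults, the last one is explicit.
RULES = [
    (".+", "include", -0.5),
    ("manager", "exclude", 0.0),
    ("technical/.*", "flag", 1.5),
    ("technical/manager", "exclude", 2.0),
]


def resolve(value, rules=RULES):
    """Return the action of the highest-priority matching rule, or None."""
    matches = [(prio, action) for pattern, action, prio in rules
               if re.fullmatch(pattern, value)]
    if not matches:
        return None
    top = max(prio for prio, _ in matches)
    actions = {action for prio, action in matches if prio == top}
    if len(actions) > 1:
        # Same priority but conflicting actions: an error under the proposal.
        raise ValueError(f"conflicting rules for {value!r}: {sorted(actions)}")
    return actions.pop()


for audience in ["novice", "manager", "technical/developer", "technical/manager"]:
    print(audience, "->", resolve(audience))
```

Two matching lines with equal priority and the same action are not treated as an error, in line with the wording of the proposal.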

Impact on documentation: ditaval files that conform to the DITA 1.0
specification will continue to behave as before, with the exception of
property values that use regular expression metacharacters, which will
need to be escaped.
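As an illustration of the escaping step, here is a small sketch using Python's `re.escape` (in Java, `Pattern.quote` serves a similar purpose). The property values "v1.0" and "c++" are made up for the example:

```python
import re

# Hypothetical DITA 1.0 property values that contain metacharacters.
for literal in ["v1.0", "c++"]:
    escaped = re.escape(literal)
    # The escaped pattern matches the original literal value and nothing looser.
    assert re.fullmatch(escaped, literal)
    print(literal, "->", escaped)

# Unescaped, "v1.0" would also match unintended values such as "v1x0".
print(bool(re.fullmatch("v1.0", "v1x0")))             # True
print(bool(re.fullmatch(re.escape("v1.0"), "v1x0")))  # False
```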

Impact on authoring tools: Authoring tools that already use some kind of
pattern matching (e.g., FrameMaker) may speak a different dialect of
pattern matching.

Impact on toolchain: For DITA-OT, confined to
Requires Java 1.4 or later for the regular expression API.
