Subject: Public Comment
Comment from: deborah_pickett@moldflow.com
Name: Deborah Pickett
Title:
Organization: Moldflow
Regarding Specification: Filtering in DITA

Recommendation for better usability of DITA filters.

The current filtering specification in DITA is adequate for a subset of filtering use cases, but it is awkward to use in more sophisticated situations where there is not a fixed, known set of filtering property values, or if properties naturally fall into a hierarchy. The specification is not clear about what to do if a ditaval file does not say whether to include, exclude or flag a particular property value.

I propose the following three additions to the specification. The three proposals are orthogonal; none depends on another being accepted.

-----

1. It is an error for a filtering property value to not be mentioned by at least one rule in the ditaval file. This error should result in either a warning message or halt processing, at the processor's discretion.

Rationale: This prevents conditional text from being accidentally included or excluded, potentially very costly. The re-user is forced to make a decision about the property and amend the ditaval file accordingly.

(Example: If I say <ph product="foo"> but no line in the ditaval file matches product "foo" then it is an error.)

-----

2. Values in the ditaval file are regular expressions, not literal strings. These regular expressions are tested against the space-delimited values in the DITA element, and if they match the include/exclude/flag is performed. Regular expressions are assumed to be anchored at start and finish (i.e., ^...$). The regular expression dialect is Java's.

Rationale: This allows re-users to say "and exclude anything I didn't mention" for a property. By judicious use of delimiters and match patterns it is possible to create multi-level property-value hierarchies. By allowing arbitrary regular expressions, re-users are able to invent whatever hierarchy they find suits their needs.
This requires a little more work in writing the ditaval file but authoring tools might be able to offer help.

(Example: I create the following property values for platform:

  "os/unix"
  "os/windows"
  "os/windows/64bit"
  "os/windows/32bit"
  "arch/intel"
  "arch/powerpc"
  "arch/arm"

I can produce generic Windows-on-Intel documentation by filtering, using [^/]+ to mean "any non-slash-containing sequence":

  os/[^/]+: exclude
  os/windows: include
  os/windows/[^/]+: flag
  arch/intel: include
  arch/[^/]+: exclude

In this example, 32-bit and 64-bit specifics are flagged.)

-----

3. Each include/exclude/flag line in the ditaval file can optionally specify a priority for that inclusion or exclusion. The priority is a floating-point value. If multiple lines match a property value, the line with the numerically highest priority takes precedence. It is an error for two matches to have the same priority only if one is "exclude" and the other is "include" (likewise for combinations with "flag").

Default priorities for lines are:

  exclude, no metacharacters in the regular expression: 0.0
  include, no metacharacters in the regular expression: 1.0
  flag, no metacharacters in the regular expression: 2.0
  exclude, metacharacters in the regular expression: -1.0
  include, metacharacters in the regular expression: -0.5
  flag, metacharacters in the regular expression: 1.5

Metacharacters are unescaped [ ] { } . ( ) \ | + ? * ^ $ &

Rationale: This codifies the spec's default "exclude, then include" rule but allows it to be overridden in situations where it is needed. By demoting patterns with metacharacters, it is possible to specify typical use-cases without needing to explicitly state priorities. This proposal is inspired by default rule priorities in XSLT.
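The default-priority table above could be computed along the following lines. This is only a minimal Java sketch: the class and method names (DitavalDefaults, hasUnescapedMetacharacter, defaultPriority) are hypothetical and are not part of DITA-OT. The treatment of backslashes is an assumption, since the proposal does not spell it out: here a backslash followed by a non-alphanumeric character is taken as an escape that yields a literal, while a backslash followed by a letter or digit (such as \d), or a trailing backslash, counts as a metacharacter.

```java
final class DitavalDefaults {
    // Metacharacters exactly as listed in the proposal (backslash handled separately).
    private static final String META = "[]{}.()|+?*^$&";

    /** Returns true if the pattern contains a metacharacter that is not escaped. */
    static boolean hasUnescapedMetacharacter(String pattern) {
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\') {
                // Assumption: a trailing backslash, or one followed by a letter
                // or digit (e.g. \d), is a metacharacter use.
                if (i + 1 >= pattern.length()
                        || Character.isLetterOrDigit(pattern.charAt(i + 1))) {
                    return true;
                }
                i++; // Assumption: "\." and similar are escaped literals; skip the pair.
                continue;
            }
            if (META.indexOf(c) >= 0) {
                return true;
            }
        }
        return false;
    }

    /** Default priority for an action ("exclude", "include" or "flag") and a pattern. */
    static double defaultPriority(String action, String pattern) {
        boolean meta = hasUnescapedMetacharacter(pattern);
        switch (action) {
            case "exclude": return meta ? -1.0 : 0.0;
            case "include": return meta ? -0.5 : 1.0;
            case "flag":    return meta ?  1.5 : 2.0;
            default:
                throw new IllegalArgumentException("unknown action: " + action);
        }
    }
}
```

Under these assumptions, a literal value such as os/windows keeps the higher default for its action, while a pattern such as os/[^/]+ is demoted.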
(Example: To include all audiences except "manager"s, which may contain sensitive internal information, and to flag anything that is marked for the "technical" hierarchy (technical/developer, technical/administrator), I could write:

  .+: include [ default priority -0.5 ]
  manager: exclude [ default priority 0.0 ]
  technical/.*: flag [ default priority 1.5 ]
  technical/manager: exclude, priority = 2.0

The last line ensures that the "technical manager" audience, whatever that is, is excluded no matter what.)

-----

Impact on documentation: ditaval files that are to specification for DITA 1.0 will continue to behave as before, with the exception of property values that use regular expression metacharacters, which will need to be escaped.

Impact on authoring tools: Authoring tools that already use some kind of pattern matching (e.g., FrameMaker) may speak a different dialect of pattern matching.

Impact on toolchain: For DITA-OT, confined to org/dita/dost/writer/DitaWriter.java. Requires Java 1.4 or later for regular expression API.
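
To give a rough idea of the size of the change, here is a minimal Java sketch of the three proposals working together on the audience example. It is not DITA-OT code: the class DitavalSketch and its methods are hypothetical, priorities are passed in explicitly rather than read from a ditaval file, and the sketch always throws on an unmatched value, where the proposal would also allow a warning. It is written for a current JDK (generics), not Java 1.4.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class DitavalSketch {
    /** One ditaval line: a pattern, an action and a priority (hypothetical model). */
    static final class Rule {
        final Pattern pattern;
        final String action;
        final double priority;

        Rule(String regex, String action, double priority) {
            this.pattern = Pattern.compile(regex);
            this.action = action;
            this.priority = priority;
        }
    }

    private final List<Rule> rules = new ArrayList<Rule>();

    DitavalSketch add(String regex, String action, double priority) {
        rules.add(new Rule(regex, action, priority));
        return this;
    }

    /** Returns the action of the highest-priority rule matching one property value. */
    String resolve(String value) {
        Rule best = null;
        boolean conflict = false;
        for (Rule r : rules) {
            // Matcher.matches() tests the whole value, i.e. the pattern is anchored.
            if (!r.pattern.matcher(value).matches()) {
                continue;
            }
            if (best == null || r.priority > best.priority) {
                best = r;
                conflict = false;
            } else if (r.priority == best.priority && !r.action.equals(best.action)) {
                conflict = true; // Proposal 3: equal priority, different actions.
            }
        }
        if (best == null) {
            // Proposal 1: a value that no rule mentions is an error.
            throw new IllegalStateException("no rule matches value: " + value);
        }
        if (conflict) {
            throw new IllegalStateException("conflicting rules for value: " + value);
        }
        return best.action;
    }

    /** The audience rules from the example, with the priorities stated there. */
    static DitavalSketch audienceExample() {
        return new DitavalSketch()
                .add(".+", "include", -0.5)
                .add("manager", "exclude", 0.0)
                .add("technical/.*", "flag", 1.5)
                .add("technical/manager", "exclude", 2.0);
    }
}
```

With these rules, resolve("technical/developer") yields "flag" and resolve("technical/manager") yields "exclude", because the explicit priority of 2.0 outranks the 1.5 of the technical/.* pattern.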