Re: Comments on the Generalization section

Having reviewed Gershon’s comments on the Generalization section, I have formulated the following treatise on generalization and its implications for what we should and should not do in the 2.0 spec:

Generalization is inherent in the DITA specialization design. All element type information is specified in @class attributes, attribute specialization information is specified in @specialization attributes, and the grouped syntax we define for @props and @base removes the need for literal attributes. Consequently, everything you need to know about a given element instance from a DITA processing perspective (as opposed to a grammar validation perspective) can be determined from the associated @class and @specialization attributes, without regard to element tag names or attribute names.

This means that the generalized and respecialized forms of DITA documents are also inherent. There is no need to explicitly specify a transformation process to create them: such transforms are obvious to anyone who might need to create one, and implementing one is a simple exercise in programming. Generalization and respecialization transforms present no particular technical challenge and no non-obvious edge cases.
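As a rough illustration of that claim (a minimal sketch, not a proposed process for the spec), the following Python uses only the standard library to rename elements purely from their @class values. The sample document, its @class values, and the function names are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET


def class_tokens(elem):
    """Return the (module, type) pairs listed in an element's @class value."""
    # A @class value looks like "- topic/topic concept/concept "; the first
    # token ("-" or "+") is a flag rather than a module/type pair, so skip it.
    tokens = elem.get("class", "").split()[1:]
    return [tuple(tok.split("/", 1)) for tok in tokens if "/" in tok]


def generalize(root, target_module):
    """Rename each element, in place, to its ancestor type in target_module."""
    for elem in root.iter():
        for module, type_name in class_tokens(elem):
            if module == target_module:
                elem.tag = type_name
                break  # elements with no ancestor in that module keep their tag


def respecialize(root):
    """Rename each element, in place, to the last (most specialized) type."""
    for elem in root.iter():
        pairs = class_tokens(elem)
        if pairs:
            elem.tag = pairs[-1][1]


# Hypothetical sample document; the @class values are illustrative only.
doc = ET.fromstring(
    '<codeConcept class="- topic/topic concept/concept codeConcept/codeConcept ">'
    '<title class="- topic/title ">Example</title>'
    "</codeConcept>"
)

generalize(doc, "topic")
generalized_tag = doc.tag
respecialize(doc)
respecialized_tag = doc.tag
```

Note that neither function touches the @class attributes themselves; only the tag names change, which is the whole point.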

Another way to look at this is the sort of quantum nature of DITA documents: for a given DITA document instance, all of its generalized and respecialized versions exist at the same time as potential ways of viewing the document and only come into existence when the document is “observed” by writing it to its XML (or HDITA or MDITA) representation or otherwise treating it as concrete XML.

Whether you choose to literally generalize or respecialize a DITA document into a new document with different markup details (but equivalent structure and semantics because @class and @specialization say everything that needs to be said) is entirely an implementation detail of a specific DITA processing environment.

Also important is that there is only one right way to view the generalized or respecialized form of a given document: the @class values are unchanged, the @specialization values are unchanged, and the element type names and attribute names reflect any of the values allowed by each element’s @class and @specialization values. That is, two generalized or respecialized views of a given DITA document may differ in the tag names and attribute names they use, but their DITA-defined semantics are unchanged (and cannot be changed, because @class and @specialization convey all DITA-defined semantics for element types and attributes).
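One way to make this concrete (a sketch under simplifying assumptions, not a definition of equivalence): compare two views by the sequence of @class values in document order and ignore tag names entirely. The two sample documents and the helper name class_signature are hypothetical, and the comparison deliberately ignores attributes and text content.

```python
import xml.etree.ElementTree as ET


def class_signature(xml_text):
    """Return the @class values of all elements in document order.

    Whitespace inside each value is normalized so that formatting
    differences do not affect the comparison.
    """
    root = ET.fromstring(xml_text)
    return [" ".join(elem.get("class", "").split()) for elem in root.iter()]


# Hypothetical specialized view of a document.
specialized_view = (
    '<codeConcept class="- topic/topic concept/concept codeConcept/codeConcept ">'
    '<title class="- topic/title ">Example</title>'
    "</codeConcept>"
)

# The same document viewed with generalized tag names; @class is untouched.
generalized_view = (
    '<topic class="- topic/topic concept/concept codeConcept/codeConcept ">'
    '<title class="- topic/title ">Example</title>'
    "</topic>"
)

same_semantics = class_signature(specialized_view) == class_signature(generalized_view)
```

Here the two views differ only in the root tag name, so their signatures are identical.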

Given that, there is no reason for the DITA specification to define literal generalization and respecialization processes and thus no reason to have normative rules for them. This is so for two reasons: the DITA specification should not define transformation processes at all, and nothing here requires normative statements, because generalization and respecialization are inherent in the specialization facility design. There is nothing normative you could say that isn’t a direct consequence of how specialization works.

My reading of the current Generalization specification is that it reflects the early days of DITA, when the specification reflected the specific way that DITA documents were processed at IBM and then more generally by DITA Open Toolkit. In that context it made sense to specify a particular implementation approach to doing generalization. But what the current specification describes is just one of many implementation approaches.

Therefore, in the DITA 2.0 specification I think it is only necessary to define generalization and respecialization as inherent properties of DITA documents, direct implications of the specialization facility.

The DITA 2.0 specification should not, itself, define any sort of generalization or respecialization process nor should it define any normative requirements for the results of such processes (such as the current normative rule that you cannot have the same condition name as both an attribute name and a group name in a separate attribute on the same element).

In such a discussion of generalization it is probably useful to point out that there are two main use cases for producing new generalized document instances: migration (one-way down translation) and interchange (round tripping through generalized instances). But the specification of *how* to produce these instances is outside the scope of the DITA spec.

The current information specific to generalization and respecialization transforms might be useful as a committee note to serve as guidance to DITA system implementors but really, any engineer implementing a DITA processing system who understands specialization should immediately see both how generalization is inherent in the specialization facility and how to implement it most appropriately for their users.

My specific recommendation:

Rewrite the Generalization topic to define and describe it as I’ve outlined above: as an inherent property of DITA documents that may be used in many ways by DITA processors. Remove all discussion of generalization and respecialization processing, including current normative statements. This basically means making current section 8.4.1 Overview of Generalization the entire Generalization topic and removing 8.4.2 to 8.4.5 (and optionally moving that content to a new committee note on generalization transformations, if we think one is needed).

I am happy to take the action to draft this rework of the Generalization topic.

If the TC decides to keep the current generalization content and normative statements, this normative statement is not needed:

068 (410) A single element MUST NOT contain both generalized and specialized values for the same attribute.

This is not needed because the grouping syntax for @props already accounts for this case (from 7.4.5 Conditional processing attribute values with groups):

If two groups with the same name are found in a single attribute, they are treated as if all values are specified in the same group. The following values for the @otherprops attribute are equivalent:

Assuming that DITA processors normalize all specializations of @props to their equivalent grouped form (which would be the obvious implementation approach since you have to be able to handle the grouped form in any case), the case of an attribute and a group with the same name is explicitly handled: You’ll end up with two groups with the same name and their values will be combined per the quoted rule. Thus this case cannot be an error (in the sense that it would cause some ambiguity about the meaning of the content) and therefore there is no need to make it a normative rule. (The lack of this rule also makes it clearer that normalizing all specializations of @props to their grouped form as an implementation strategy is the easiest way to ensure you get the correct answer in all cases.)
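The normalization strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a required implementation: the attribute name jobrole, its values, and both function names are hypothetical, and ungrouped tokens in @props are simply ignored by this parser.

```python
import re

# Matches one group of the form name(value value ...).
GROUP_RE = re.compile(r"([^\s()]+)\(([^()]*)\)")


def parse_groups(value):
    """Parse 'name(a b) name(c)' into {name: [a, b, c]}.

    Groups with the same name are merged, as if all of their values had
    been specified in a single group. Duplicate values are dropped, which
    is harmless because the values act as a set.
    """
    groups = {}
    for name, body in GROUP_RE.findall(value):
        merged = groups.setdefault(name, [])
        for token in body.split():
            if token not in merged:
                merged.append(token)
    return groups


def normalize_props(attributes, props_specializations):
    """Fold specialized attributes into the grouped form of @props.

    attributes: dict of attribute name to value for one element.
    props_specializations: names of the attributes specialized from @props.
    """
    parts = [attributes.get("props", "")]
    for name in props_specializations:
        if name in attributes:
            # Rewrite jobrole="admin" as the group jobrole(admin).
            parts.append(f"{name}({attributes[name]})")
    return parse_groups(" ".join(parts))


# The case the rule forbids: the same condition name appears both as a
# specialized attribute and as a group in @props on one element.
element_attributes = {"props": "jobrole(programmer)", "jobrole": "admin"}
normalized = normalize_props(element_attributes, ["jobrole"])
```

After normalization there is a single jobrole group holding both values, so nothing about the element's conditions is ambiguous.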

Having said that, as part of a specification for how you want your generalization processor to work, it’s perfectly reasonable to specify that you want this case to be avoided, but that’s part of the specification of a particular generalization processor, not something that needs to be normatively required even if the DITA spec generally defines generalization processing rules.

Cheers,

_____________________________________________

Eliot Kimber

Sr Staff Content Engineer

O: 512 554 9368

M: 512 554 9368

servicenow.com


From: dita@lists.oasis-open.org <dita@lists.oasis-open.org> on behalf of Gershon Joseph <gershon@precisioncontent.com>
Date: Tuesday, September 6, 2022 at 6:24 AM
To: kris eberleinconsulting.com <kris@eberleinconsulting.com>, dita <dita@lists.oasis-open.org>
Subject: [dita] RE: Comments on the Generalization section


Thanks Kris for pulling this out.

Adding some missing information and correcting a typo in one of my comments. I updated the table below.

Unlock the Knowledge in Your Enterprise™

This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. Please notify us by return email if you have received this email in error. © 2022, Precision Content Authoring Solutions Inc. Toronto, Ontario, Canada

gershon joseph
senior information architect

From: kris eberleinconsulting.com <kris@eberleinconsulting.com>
Sent: Tuesday, September 6, 2022 1:08 PM
To: Gershon Joseph <gershon@precisioncontent.com>; dita <dita@lists.oasis-open.org>
Subject: RE: Comments on the Generalization section

OK, I went through Gershon’s PDF and extracted his comments. Below is a summary:

Topic	Comments	Notes
8.4.1 Overview of generalization	Why is this paragraph not a normative rule while the next one is? Surely both should either be normative rules or neither should be normative rules?
8.4.2 Element generalization	Do we care whether they do this or use some other method to generalize? I suspect this is part of the processing stuff we are trying to remove in 2.0, right? I think we can rewrite this without making it a process, but explaining how the element names would change.
8.4.2 Element generalization	Rework this paragraph to make it clearer. It seems mostly to assume that there is only a single specialization. We should make it more easily understood when more than two levels of specialization are involved.
8.4.3 Processor expectations when generalizing elements	Remove this topic from the spec.
8.4.3, normative rule 066 (410)	Rework this requirement into the generalization topic after the paragraphs or rules that talk about the @class attribute values.
8.4.4 Attribute generalization	This indeed seems to be spec material. Interesting. Need to figure out where to put this in the new generalization topic. Hopefully once we rework the element generalization topic this attribute generalization will fit. Re the example: Why give an example that's wrong? If this example has any value, then introduce it as an error condition that should be avoided.
8.4.5 Generalization with cross-specialization dependencies	In the third paragraph, “However, codeConcept<> could be generalized to concept or topic…” my comment was to change “codeConcept<>” to “<codeConcept>”.

Best,

Kris

Kristen James Eberlein
Chair, OASIS DITA Technical Committee
Owner, Eberlein Consulting LLC
kris@eberleinconsulting.com

Skype: kriseberlein; voice: +1 (919) 622-1501

From: dita@lists.oasis-open.org <dita@lists.oasis-open.org> On Behalf Of Gershon Joseph
Sent: Sunday, September 4, 2022 10:47 AM
To: dita <dita@lists.oasis-open.org>
Subject: [dita] Comments on the Generalization section

Hi all,

I have removed all sections except for the Generalization section from the attached PDF. Please review my comments and edits. Kris will put this on the agenda of an upcoming meeting to discuss what to do with the Generalization content.

Cheers,

Gershon


gershon joseph senior information architect
180 John St. Toronto, ON Canada M5T 1X5 T: (647) 557-5965 gershon@precisioncontent.com
