RE: [emergency] GJXDM Reuse and Specific Technical Issues

Hi Everyone,

Before I dig into specifics, let me state three things in relation to the previous email thread on the “emergencyType” differences.

As a developer myself, I always believe technical issues trump politics, turf or bias. I am always happy to keep discussions at the technical level.
This is a good time for this discussion because we need to resolve best practices on reuse of a set of core XML Schema types across domains. My current thought is that GJXDM may have defined most of those core types, so we need to determine how to separate them from any domain-specific content. This work has ramifications both within this group and externally for other efforts under way. From my reading of the TC requirements document, though I disagree with its definition of “core” (more on that below), this seems to be in line with the section on EM Core and Metadata Standards.
Relating specifically to GJXDM, my experience in discussions with the OJP folks is that they are very willing to work with people to improve the model. That, of course, brings us back to focusing on specific technical deficiencies in GJXDM and not on vague difficulties like the size of the model or preferences in naming rules.

So, now focusing on specifics:

First, some thoughts on “emergencyType”:

There are two separate issues under consideration here: the naming design rule and the difference in modeling approach. The naming design rule is debatable and probably arbitrary as long as your naming is consistent. The modeling approach and the semantic differences between the concepts are, in my opinion, the more important issue.

To me, the first thing to get consensus on is how we model an event. In the Distribution schema you use an event type enumeration; however, in the CAP schema you are using a string. One problem with this setup is that you cannot validate beforehand whether the event type in the distribution header is the same as (or compatible with) the description of the event in the CAP message. By comparison, in GJXDM an event type code to distinguish events could be modeled as a constraint on the event name or as another attribute in the GJXDM Event Type (akin to a Class). Or, if the event types are significantly different, you may want to extend Event Type to add attributes. For extensibility and flexibility, I would prefer an Event as a Class with an optional category property to classify (or categorize) the event within a taxonomy. Because there may be other existing taxonomies that people want to categorize their events under, the “EventCategory” element should have a qualifier attribute, possibly called “source”, to identify an external classification scheme (as is done in ebXML and UDDI registries).
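
To make the Event-as-Class idea above a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not a proposed schema: the class names, the `categories_match` helper, and the `urn:example:...` source identifiers are all hypothetical. The point is that a category qualified by its source can be compared mechanically, which a free-form string cannot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventCategory:
    """A category value qualified by the taxonomy it is drawn from."""
    value: str
    source: str  # hypothetical identifier of an external classification scheme


@dataclass(frozen=True)
class Event:
    """An event modeled as a class with an optional category property."""
    name: str
    category: Optional[EventCategory] = None


def categories_match(header: Event, body: Event) -> bool:
    """Return True only if both events carry the same category from the same source."""
    if header.category is None or body.category is None:
        return False  # nothing to compare, so compatibility cannot be confirmed
    return header.category == body.category


# Hypothetical sample data: a distribution header event and two message events.
header_event = Event("Flood warning", EventCategory("flood", "urn:example:taxonomy-a"))
cap_event = Event("River flooding near the bridge", EventCategory("flood", "urn:example:taxonomy-a"))
other_event = Event("Flood", EventCategory("flood", "urn:example:taxonomy-b"))
```

With this shape, `categories_match(header_event, cap_event)` succeeds, while the same category value taken from a different source does not match, and an event with no category cannot be confirmed either way.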

Some thoughts on GJXDM:

I am not part of DOJ and have no bias toward this group accepting or rejecting the work. In my opinion, it should stand or fall on its own merits. However, having said that, my examination of it so far has been mostly positive.
I have minor issues with some of the naming. For example, “SuperType” for the root of all content types is not in line with object-oriented modeling, where a “SuperType” is any Type “above” (or parent of) the current type. However, naming is just not worth debating. In my opinion, the way to handle naming is to separate the labels from the concept identifier. As in OWL or topic maps, you should allow multiple labels (or names) for a single concept as long as the concept has a single unique identifier.
The key modification to GJXDM I would like to see is for it to be more modular. I am meeting with OJP and the Georgia Tech folks in January to discuss this in more detail. Your thoughts and input in this area would be very beneficial. The most important area of modularity is distinguishing between core entities and domain-specific ones. I define “core” as any data element that crosses more than one domain. Within that set, I call “universal core” those entities that cross all domains. Thus, at a minimum, you would have two “core” schemas (universal core and “domain-intersections” or “Community of Interest (COI) core”) and then domain-specific schemas. Given the above, reuse would involve extending core entities to create domain-specific entities. One business use case for this is that a federated query would be able to span all core entities across domains.
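
The core definitions above can be expressed as a simple classification rule. The following Python sketch is only an illustration under my own assumptions: the function name, the domain names, and the element names are hypothetical and are not taken from GJXDM. It assigns each data element to a tier based on how many domains use it.

```python
from typing import Dict, Set


def classify_elements(usage: Dict[str, Set[str]], all_domains: Set[str]) -> Dict[str, str]:
    """Assign each data element to a tier based on the domains that use it."""
    tiers = {}
    for element, domains in usage.items():
        if domains >= all_domains:
            tiers[element] = "universal core"   # crosses all domains
        elif len(domains) > 1:
            tiers[element] = "COI core"         # crosses more than one domain, but not all
        else:
            tiers[element] = "domain-specific"  # used by a single domain
    return tiers


# Hypothetical sample data, for illustration only.
all_domains = {"justice", "emergency", "health"}
usage = {
    "Person": {"justice", "emergency", "health"},
    "Incident": {"justice", "emergency"},
    "Warrant": {"justice"},
}
tiers = classify_elements(usage, all_domains)
```

Under this rule a federated query would target the elements in the two core tiers, and a domain would extend those elements to build its own domain-specific ones.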

Other miscellaneous thoughts:

1. On the Distribution schema: what is the functional requirement for this? I ask especially because it is an optional element in the Standard Message Format document. If the business case were strong enough, I would not expect it to be optional. In general, I like the concept, but I am concerned that such a potentially complex issue should be designed only after a clear articulation of the functional requirements we are trying to meet. This is especially so because some could argue that distribution should not be determined by the originator but dynamically, based on a publish/subscribe approach (with proper authentication). Also, there seems to be some overlap between the elements in CAP and the elements in the Distribution schema.

Well, that’s enough for now. I look forward to working with you on these issues.

Regards,

Michael C. Daconta

Metadata Program Manager

Department of Homeland Security

tel: (202)692-4340

email: michael.daconta@dhs.gov
