OASIS Mailing List Archives

cam message



Subject: [FWD: Automated checking of OASIS schemas and NDR alignment]


FYI - this discussion was posted to the TC chairs list, on improving the quality of OASIS specifications and their associated schemas.

Thanks, DW

-------- Original Message --------
Subject: [chairs] Automated checking of OASIS schemas and NDR alignment
From: "David RR Webber (XML)" <david@drrw.info>
Date: Sat, August 08, 2009 12:19 pm

Since we are on this topic - the real issue is that XSD schema is a double-edged sword - fabulously complex and hence challenging to desk-check manually.

We could consider alternative strategies to improve quality - I would suggest a voluntary approach here.

For those who care to use it, the OASIS CAM specification provides support for automated verification of XSD schemas.

The open source toolkit is available on SourceForge.net in the camprocessor project.

Essentially what it does is ingest the XSD schema and express it as a CAM template, which serves as an ABSTRACTION LAYER - that then allows automated inspection, particularly by XSLT scripting.
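The ingest-then-inspect idea can be sketched in a few lines of Python. This is only an illustration, not the CAM toolkit: it assumes a flat list of records as a stand-in for the CAM template, and the names `flatten_schema` and `SAMPLE_XSD` are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical miniature schema, used only for illustration.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="PollingPlace">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Latitude" type="xs:decimal"/>
      </xs:sequence>
      <xs:attribute name="Channel" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def flatten_schema(xsd_text):
    """Return one record per named element or attribute declaration."""
    root = ET.fromstring(xsd_text)
    items = []
    for node in root.iter():
        if node.tag in (XS + "element", XS + "attribute") and node.get("name"):
            doc = node.find(XS + "annotation/" + XS + "documentation")
            items.append({
                "kind": node.tag[len(XS):],   # "element" or "attribute"
                "name": node.get("name"),
                "type": node.get("type", ""),
                "documented": doc is not None,
            })
    return items


# Once the schema is a flat list, each check becomes a simple filter.
items = flatten_schema(SAMPLE_XSD)
undocumented = [i["name"] for i in items if not i["documented"]]
print(undocumented)
```

The point of the intermediate list is that rules no longer need to understand XSD structure; they only look at names, types and flags.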

The following OASIS specifications have already benefited from this - EDXL, CIQ, and EML - as have non-OASIS specs including PESC, MISMO and the NIEM.gov work.  The toolkit has been able to detect errors that in some cases have lain dormant for 3 years in published standards.

Here's a brief checklist of what the toolkit will check for you -

1) Non-UTF8 characters in the schema

2) Naming and design rules consistency - currently this is drawn from NIEM and CEFACT NDR best practices - but is fully customizable in XSLT scripting

3) Common issues WRT Schema and interoperability (a major OASIS goal?) that are flagged as warnings

4) Builds an XML dictionary of your schema - that can be loaded into an Excel spreadsheet - invaluable for manually inspecting exactly what is being used where, the information model itself, and what is new from release to release.
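Checks 1) and 4) above are simple enough to sketch in Python. This is an assumption-laden illustration rather than the toolkit's own code: `find_non_utf8` and `dictionary_csv` are hypothetical names, and CSV stands in for whatever dictionary format the toolkit actually emits.

```python
import csv
import io


def find_non_utf8(data):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def dictionary_csv(items):
    """Render kind/name/type records as CSV text that a spreadsheet can open."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["kind", "name", "type"])
    for item in sorted(items, key=lambda i: (i["kind"], i["name"])):
        writer.writerow([item["kind"], item["name"], item["type"]])
    return out.getvalue()


good = '<xs:element name="Name"/>'.encode("utf-8")
bad = b'<xs:element name="Caf\xe9"/>'   # Latin-1 byte, not valid UTF-8
print(find_non_utf8(good), find_non_utf8(bad))

# Hypothetical dictionary rows for illustration.
rows = [
    {"kind": "element", "name": "Latitude", "type": "xs:decimal"},
    {"kind": "attribute", "name": "Channel", "type": "xs:string"},
]
print(dictionary_csv(rows))
```

Sorting the rows keeps the output stable, so two releases of a schema can be compared with an ordinary text diff.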

A sample of the output of the evaluator is below for one schema from OASIS EML - checking 1,500 items and 3,000 rules - clearly a challenge manually.

This illustrates the gap between existing standards (in this OASIS EML case dating back to 2001) and today's expected quality evaluations.  Closing this gap while supporting an existing user community is obviously a challenge - but in the EML case we are at least starting that journey.

I believe it is inevitable that OASIS will move in this direction - automated checking - because the users out there demand ever better quality in OASIS specifications, and also expect industry best practices to be followed when engineering schemas.  That points to OASIS NDR, and to aligning it with existing practice such as NIEM.gov.  And of course eventually we can expect at least government-level users to insist on this before approving use of a standard, because internally they are already using these evaluation tools on their own schemas.

Thanks, DW

CAM Template Evaluation

Version 1.16

CAM Template HEADER information:

Description: EML 150 Geodistricting schema template
Owner: OASIS, Copyright 2009.
Date: 2009-08-04T08:02:22
Version: 6.0
  • Number of Elements: 495
  • Number of Attributes: 1019
  • Number of constraint Rules: 3026
  • Annotations: 48
  • Hints: 0
  • Excluded Elements: 0
  • Excluded Attributes: 0

RULES INTEGRITY:

  • No problems found

NAMING AND DESIGN RULES (NDR) ASSESSMENT:

  • use of anyAttribute content model found on:
    - ManagingAuthority/AuthorityAddress
    - Name
    - Contact/MailingAddress
    - StreetSegment/StreetSegmentMember
    - BoundaryGeoPath/BoundaryMarkerGeo
    - BoundaryMarkerGeo/Latitude
    - BoundaryMarkerGeo/Longitude
    - PhysicalLocation/Address
    - PollingPlace/PostalLocation
    - District/PollingPlaceGeoLocation
    - PollingPlaceGeoLocation/Latitude
    - PollingPlaceGeoLocation/Longitude
  • use of any content model found on:
    - EML
    - EML/ManagingAuthority
    - ManagingAuthority/ResponsibleOfficer
    - ResponsibleOfficer/Contact
    - Seal/OtherSeal
    - Description/Message
    - District/ManagingAuthority
    - DivisionDescription/Message
  • attribute name does not match representation term rules: 21
    - DisplayOrder - Usage - AddressKey - AddressKeyRef - ValidFrom - ValidTo - Abbreviation - Preferred - Mobile - Verified - Problem - Notes - Role - Algorithm - Encoding - Qualifier - Lang - note - Meridian - Direction - Channel
  • element name does not match representation term rules: 21
    - MessageLanguage - RequestedResponseLanguage - Responsibility - Extension - PreferredContact - Binary - Stylesheet - OtherSeal - Accepted - Message - Area - Association - PollAssociation - Latitude - Longitude - PollingStation - Start - End - Notes - FacilityService - FacilityAccess
  • attributes missing annotation description: 29
    - DataQualityType - ValidFrom - ValidTo - Type - NameType - NameCode - NameCodeType - Abbreviation - TypeCode - ElementType - Preferred - Mobile - ItemType - Verified - Problem - Notes - Role - Format - Algorithm - URI - MimeType - Encoding - Qualifier - Lang - divisionType - NumberingMode - pollingEventType - CurrentStatus - DateTime
  • attribute names not beginning with lower case letter (a-z): 59
    - Id - SchemaVersion - IdNumber - DisplayOrder - Type - AddressID - AddressIDType - ID - Usage - DeliveryMode - Status - AddressKey - AddressKeyRef - DateValidFrom - DateValidTo - DataQualityType - ValidFrom - ValidTo - LanguageCode - NameType - NameCode - NameCodeType - Abbreviation - TypeCode - PersonID - PersonIDType - NameKey - NameKeyRef - ElementType - Preferred - Mobile - ItemType - Verified - Problem - Notes - Role - Format - Algorithm - URI - MimeType - Encoding - Qualifier - Lang - IdType - Mode - SequenceNumber - NumberingMode - SegmentType - Meridian - MeridianCodeType - DatumCodeType - ProjectionCodeType - DegreesMeasure - MinutesMeasure - SecondsMeasure - Direction - Channel - CurrentStatus - DateTime
  • elements missing annotation description: 57
    - EML - TransactionId - SequenceNumber - NumberInSequence - SequencedElementName - AdditionalValidation - Location - Type - MessageLanguage - RequestedResponseLanguage - AuthorityIdentifier - AuthorityAddress - ResponsibleOfficer - Responsibility - Contact - MailingAddress - Email - Telephone - Extension - Fax - PreferredContact - Logo - URL - Binary - IssueDate - Display - Stylesheet - Seal - OtherSeal - ElectionDistricts - Accepted - Message - Area - OfficialStatusDetail - OfficialStatus - StatusDate - IdValue - Association - PollAssociation - PollingPlace - PhysicalLocation - PollingStation - Map - PostalLocation - ElectronicLocation - OtherLocation - TimeAvailable - Start - End - BallotName - ResultsReported - Status - ChannelID - Notes - Facilities - FacilityService - FacilityAccess
    SCORE: 5.5 out of 10.0
    HINT: representation terms use indicates the nature of the value carried by a leaf node in the structure. The terms qualify the name of the node and its content.
    Representation terms are domain applicable and can be tailored accordingly. The default list of terms provided for guidance here are for a name that contains: 'Amount', 'Count', 'BinaryObject', 'Graphic', 'Picture', 'Sound', 'Video', 'Code', 'Category', 'Currency', 'EMail', 'DateTime', 'Date', 'Time', 'Indicator', 'Format', 'Length', 'Height', 'Width', 'Level', 'Measure', 'Mode', 'Method', 'Numeric', 'Number', 'Price', 'State', 'Status', 'Rank', 'Flag', 'Frequency', 'Format', 'Size', 'Unit', 'Value', 'Version', 'Rate', 'Required', 'Percent', 'Quantity', 'Qty', 'Description', 'Comment', 'Reason', 'Location', 'Instructions', 'Text', 'Title', 'Type', 'Year', 'Month', 'Day', 'Name', 'URI', 'URL', or 'URN';
    or that ends with 'Days', 'Hours', 'Minutes', 'ID', 'Id', or 'Identifier'
    A domain may use specialized terms that are not in the overall list shown here.
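The two naming rules reported above (representation terms and the lower-case first letter for attributes) can be approximated in Python as follows. This is a sketch only: the term lists are copied from the hint, but the case-sensitive substring matching and the function names are assumptions, and the evaluator itself applies the term rule to leaf nodes only.

```python
import re

# Default terms copied from the evaluator hint; a domain may extend them.
CONTAINS_TERMS = [
    "Amount", "Count", "BinaryObject", "Graphic", "Picture", "Sound", "Video",
    "Code", "Category", "Currency", "EMail", "DateTime", "Date", "Time",
    "Indicator", "Format", "Length", "Height", "Width", "Level", "Measure",
    "Mode", "Method", "Numeric", "Number", "Price", "State", "Status", "Rank",
    "Flag", "Frequency", "Size", "Unit", "Value", "Version", "Rate",
    "Required", "Percent", "Quantity", "Qty", "Description", "Comment",
    "Reason", "Location", "Instructions", "Text", "Title", "Type", "Year",
    "Month", "Day", "Name", "URI", "URL", "URN",
]
ENDS_WITH_TERMS = ("Days", "Hours", "Minutes", "ID", "Id", "Identifier")


def has_representation_term(name):
    """True if the name contains a default term or ends with a listed suffix."""
    # Assumption: plain case-sensitive matching, which agrees with the sample.
    return any(term in name for term in CONTAINS_TERMS) or name.endswith(ENDS_WITH_TERMS)


def starts_lower(name):
    """True if the name begins with a lower case letter a-z."""
    return re.match(r"[a-z]", name) is not None


for name in ["LanguageCode", "PersonID", "Latitude", "divisionType"]:
    print(name, has_representation_term(name), starts_lower(name))
```

Names such as Latitude and DisplayOrder fail the term test under this matching, which agrees with the lists in the report.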

ISSUES AND WARNINGS:

  • Caution - use of setLimit() can cause interoperability issues.
    • xal:SubThoroughfare
  • Caution - string item has length restriction - may be interoperability issue
    • Extension
  • Warning - code, type or flag items with no allowed values restriction
    • AuthorityAddress/@AddressIDType
    • AuthorityAddress/@LanguageCode
    • xal:AddressLine/@Type
    • xal:NameElement/@NameCode
    • xal:NameElement/@NameCodeType
    • xal:Thoroughfare/@Type
    • xal:Thoroughfare/@TypeCode
    • xal:SubThoroughfare/@Type
    • xal:SubThoroughfare/@TypeCode
    • xal:Premises/@TypeCode
    • xal:SubPremises/@TypeCode
    • xal:RuralDelivery/@Type
    • xal:PostOffice/@Type
    • Name/@Type
    • Name/@PersonIDType
    • Name/@DataQualityType
    • Name/@LanguageCode
    • MailingAddress/@AddressIDType
    • MailingAddress/@LanguageCode
    • Logo/@ItemType
    • Stylesheet/@Type
    • ts:Reference/@Type
    • ts:RetrievalMethod/@Type
    • ts:Object/@MimeType
    • ts:TSTXMLInfoReference/@Type
    • OtherSeal/@Type
    • Message/@Type
    • Area/@Type
    • PollingDivisionBoundary/@languageCode
    • StreetSegmentMember/@AddressIDType
    • StreetSegmentMember/@LanguageCode
    • BoundaryMarkerGeo/@MeridianCodeType
    • BoundaryMarkerGeo/@DatumCodeType
    • BoundaryMarkerGeo/@ProjectionCodeType
    • Address/@AddressIDType
    • Address/@LanguageCode
    • Map/@ItemType
    • PostalLocation/@AddressIDType
    • PostalLocation/@LanguageCode
    • Status/@Type
    • PollingPlaceGeoLocation/@MeridianCodeType
    • PollingPlaceGeoLocation/@DatumCodeType
    • PollingPlaceGeoLocation/@ProjectionCodeType
    • /AdditionalValidation/Type
    • /AuthorityAddress/xal:PostCode
      - attribute: @DataQualityType
    • /MailingAddress/xal:PostCode
      - attribute: @DataQualityType
    • /StreetSegmentMember/xal:PostCode
      - attribute: @DataQualityType
    • /Address/xal:PostCode
      - attribute: @DataQualityType
    • /PostalLocation/xal:PostCode
      - attribute: @DataQualityType
  • Caution - use of fixed value attributes can cause interoperability issues.
    • Id
    • SchemaVersion
    • xlink:label
    • DataQualityType
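Two of the warnings above lend themselves to a short Python sketch: code/type/flag attributes with no allowed-values restriction, and fixed-value attributes. This is not how the evaluator is implemented (it works on the CAM template via XSLT). The sample schema and its enumeration values are hypothetical, the `xs:` prefix is assumed for built-in types, and attributes that reference a named simple type are not resolved here.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
SUFFIXES = ("Code", "Type", "Flag")

# Hypothetical schema fragment for illustration only.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:attribute name="LanguageCode" type="xs:string"/>
  <xs:attribute name="SegmentType">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="odd"/>
        <xs:enumeration value="even"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>
  <xs:attribute name="SchemaVersion" type="xs:string" fixed="6.0"/>
</xs:schema>"""


def unrestricted_code_attributes(root):
    """Names of code/type/flag attributes that carry no enumeration."""
    flagged = []
    for attr in root.iter(XS + "attribute"):
        name = attr.get("name", "")
        has_enum = attr.find(".//" + XS + "enumeration") is not None
        # Assumption: built-in types are written with the "xs:" prefix.
        builtin = attr.get("type", "xs:string").startswith("xs:")
        if name.endswith(SUFFIXES) and builtin and not has_enum:
            flagged.append(name)
    return flagged


def fixed_value_attributes(root):
    """Names of attributes declared with a fixed value."""
    return [a.get("name") for a in root.iter(XS + "attribute")
            if a.get("fixed") is not None]


root = ET.fromstring(SAMPLE_XSD)
print(unrestricted_code_attributes(root))
print(fixed_value_attributes(root))
```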

EXTERNALS:

   Namespace URL

