Subject: Re: [was] Schema Started
Ingo,

Thanks, and it is great to have you on board!

I don't claim to know enough about Schema or DTD to be authoritative here, but I do know Schema was easy enough for me to pick up and start writing meaningful stuff after an hour of reading a book. What I also know is that we need to choose one OR the other to move forward in unison. The reason Schema would get my vote is that I think the types will be a mix of strings and other things such as dates, booleans, URIs etc. Plus we have already started the Thesaurus / Classification in it. By designing with these from day one I personally think it's easier, but I am obviously happy to go with the majority.

As so far it's only you, me, Rogan and Andy writing any document (and the rest reading, hopefully), I think the four of us can work it out easily and move forward quickly. Let's do that after tomorrow's meeting / offline.

You kinda started in a different direction from what I was doing, so I won't continue until after tomorrow's meeting if appropriate, but some thoughts.....

I totally agree on the strong structure. It is THE most important thing.

Natural language - that makes perfect sense. Your experience beats my naivety any day!

Characteristics - if the intention is to group like elements (which I think is a great idea as well), I am not sure I understand the difference between, or maybe the need for, separate groupings of security characteristics and basic ones. To me the easiest model would be two groupings with sub-groups if appropriate.

    WASDescription Reference Remedy TestCase

The Descriptive / Reference part should contain all of the elements we have been discussing, such as fix information, references to other databases, the thesaurus, ID etc., and the Test contains the executable signature. The TestCase could import / include the exploit.

If this simple model works then can I propose we work on the Description first.
This will be more manageable to tackle one at a time, it is the significantly easier of the two to do, and it will allow us to focus on one problem at a time. I see no reason why this can't be finished this week.

On that note, I met Andy Jaquith for a fine beer last night and we chatted some about companies wanting to use vuln information for statistical analysis. Things like, of the vulns found, "how long have patches been available?", "what is the latency between advisory and exploit?", "what is the most common category of issues found?" etc. I think this is an interesting thought process to help think about the data in the Description section, as well as the information needed to run a production vuln database, such as provider info, versioning info, author credits, copyright, licensing etc.

Seem OK? If so, let's chat after the meeting tomorrow about DTD and Schema and get the WASDescription bit done ASAP.

----- Original Message -----
From: "Ingo Struck" <ingo@ingostruck.de>
To: <was@lists.oasis-open.org>
Cc: <mark@curphey.com>
Sent: Wednesday, September 10, 2003 7:54 AM
Subject: Re: [was] Schema Started

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Mark, hi all...

First (again) a remark about the document formats used: *please*, please use interoperable formats for your documents and no Word docs. I am getting kind of tired of having to "antiword" documents on a regular basis. Even though I am personally not so very easy with DocBook, it is one of the more interoperable formats recommended by OASIS and there are even some templates around (cf. http://www.oasis-open.org/spectools/), and so I think we should stick to that.

That said, here are some thoughts regarding the implementation of some of the requirements:

i18n - I saw that you introduced a "naturalLanguage" element in the spec. Based on experience with a multilingual thesaurus project I would say that this doesn't make much sense.
For working, real multilingual support, you need to do the following things:
- make each "natural" text element repeatable
- qualify each occurrence with a script (e.g. "latn", "hebr")
- qualify each occurrence with a natural language code, preferably ISO 639-2 (three-letter language code)

The "script" qualifier is optional, supportive information for the renderer and serves to support different transcriptions. Generally each "natural" text element should use Unicode only and should be UTF-8 encoded. The mandatory overall encoding thus should be UTF-8.

> The basic question I have (for this mail anyway) is what do we want the
> overall structure to look like ? Jeff Williams sent an email out a while
> back with a proposal for 5 main sections.
>
> 1 - basic characteristics of the vulnerability
> 2 - security characteristics of the vulnerability
> 3 - characteristics related to finding the vulnerability
> 4 - characteristics related to exploiting the vulnerability
> 5 - characteristics related to remedying the vulnerability

I think that the usage of different "characteristics" makes much sense after all. From the outlines of what needs to be stored there, I derived a generic model for a single characteristic. This allows for uniform description and simplified search criteria upon retrieval.

Based upon that I would propose the attached overall structure of a WAS-core entry: a WASDescription consists of generic information that can be indexed and used to search the database directly, as well as of a set of characteristics describing the problem.

I modelled that into a proposal as a DTD. From that I generated an XML schema using a patched version of dtd2xs 1.60 (cf. http://puvogel.informatik.med.uni-giessen.de/lumrix/). If you compare them, you'll see that the DTD is much better to read.
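
The three multilingual requirements listed above (repeatable text elements, a script qualifier, an ISO 639-2 language code, all in UTF-8) can be sketched roughly as follows. This is only an illustration: the element name `title`, the attribute names `lang` and `script`, and the helper `text_for` are assumptions made up for this sketch and do not come from the attached DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry: element and attribute names are illustrative only.
# The same "natural" text element is repeated once per language, and each
# occurrence carries an ISO 639-2 code plus an optional script qualifier.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<entry>
  <title lang="eng" script="latn">SQL injection in login form</title>
  <title lang="deu" script="latn">SQL-Injection im Anmeldeformular</title>
  <title lang="heb" script="hebr">פגיעות</title>
</entry>
"""

# Parse from UTF-8 bytes, matching the mandatory overall encoding.
entry = ET.fromstring(SAMPLE.encode("utf-8"))


def text_for(entry, tag, lang, script=None):
    """Return the text of the first `tag` child matching the language code
    (and the script, if one is given), or None when there is no match."""
    for element in entry.findall(tag):
        if element.get("lang") != lang:
            continue
        if script is not None and element.get("script") != script:
            continue
        return element.text
    return None


print(text_for(entry, "title", "deu"))          # German title
print(text_for(entry, "title", "heb", "hebr"))  # Hebrew title in Hebrew script
```

Because the script qualifier is optional, the lookup only filters on it when the caller asks for a specific script.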
Since a former mail of mine did not yet reach this list, I would like to repeat a part of that:

=== snip ===
Let's try to design the overall structure of the schema using a DTD and then transform it to an XML schema later on. This would have the following advantages:
- the DTD is not an optimal, but a much more compact description of the overall structure (e.g. the cardinality information consists only of one symbol rather than of a lengthy minOccurs, maxOccurs; lists are better to read etc.), so that the description is better to read and easier to modify. I think we will find that the "advanced" features of XML schema are seldom, if at all, used anyway (only for data types, which will be mainly strings)
- we can focus first on structure (rough outline, *what* is needed, and fine-tuning of cardinality) and then on detailed typing later on
- there is a *working* application online, where anybody could play around with the latest schema using an editor to create sample entries. These sample entries are made publicly available and can be reviewed by the rest of the TC (and the rest of the world), so the latest schema proposal could be checked easily for practicability. This application is freely available (SF CVS), based upon DTDescription, and could be adapted within less than an hour to any new DTD, but it would cost weeks (i.e. one or two man days) to plug an XML schema parser into it.
- turning the "final" DTD into an XML schema is not much pain (it just means bloating it to about 300% with tag-style non-information, which could be performed by a DTD2scheme converter, a very simple script or even a handful of vi commands)

> XML schema can be daunting at first.

To be honest, XML schema remains daunting, and a slick BNF notation or even ASN.1 would be of much more use, but within this scope I accept that we have to create a schema.
;o)
=== snap ===

Mark answered to that:
"I think that this may be counterintuitive, as to design the extensibility etc I think it would be easier to use a schema (like most things I am a total novice, but based on a furious week's reading) may make more sense. To use object types and to define object types for reuse etc would seem like design goals from day one?"

I would rather say that we need a strong structure much more than strong typing; most of the types we need here are/will be "String" anyway. Having a full-fledged working online application where the current structure could be tested for practicability seems to me a very strong argument too.

Kind regards

Ingo

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.0 (GNU/Linux)

iD8DBQE/XxDjhQivkhmqPSQRAsOpAKCPXAfYj2uiCFLXVPvwgf+/2R0sAwCdE7A3
yMDUVFFu4B+32btO+7kGdZw=
=XUsR
-----END PGP SIGNATURE-----
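
The cardinality point in the quoted proposal (one DTD symbol versus a minOccurs / maxOccurs pair) can be illustrated with a minimal sketch of the kind of "very simple script" it mentions. The mapping of the DTD occurrence indicators to XML Schema attributes is standard; the function name `to_xsd_element` is hypothetical, and the cardinalities attached to Reference, Remedy and TestCase below are assumptions for illustration, not part of either proposal.

```python
# DTD occurrence indicators and their XML Schema (minOccurs, maxOccurs) equivalents.
CARDINALITY = {
    "": (1, 1),              # exactly one
    "?": (0, 1),             # optional
    "*": (0, "unbounded"),   # zero or more
    "+": (1, "unbounded"),   # one or more
}


def to_xsd_element(dtd_particle):
    """Turn a DTD content particle such as 'Remedy*' into an xs:element reference."""
    name = dtd_particle.rstrip("?*+")
    indicator = dtd_particle[len(name):]
    min_occurs, max_occurs = CARDINALITY[indicator]
    return (
        f'<xs:element ref="{name}" '
        f'minOccurs="{min_occurs}" maxOccurs="{max_occurs}"/>'
    )


# Assumed cardinalities, chosen only to show all the indicator forms.
for particle in ["Reference+", "Remedy*", "TestCase?"]:
    print(particle, "->", to_xsd_element(particle))
```

A single trailing character in the DTD becomes two attributes in the schema, which is the size difference the proposal describes.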