Subject: RE: [chairs] need your comments on DocMgmt system requirements
A better system than CVS for the OASIS environment (and I'm a big CVS fan)
may be Subversion, which builds upon WebDAV - thus Subversion repositories
are browseable (and linkable) with a standard HTTP client (i.e. a web
browser).

http://subversion.tigris.org/

-Gabe

> -----Original Message-----
> From: Karl F. Best [mailto:karl.best@oasis-open.org]
> Sent: Wednesday, February 18, 2004 7:05 AM
> To: Norman Walsh
> Cc: Chairs OASIS; Jeff Lomas
> Subject: Re: [chairs] need your comments on DocMgmt system requirements
>
> Norman Walsh wrote:
> > / "Karl F. Best" <karl.best@oasis-open.org> was heard to say:
> > | I've put together a draft functional requirements document for this
> > | doc mgmt system and would like to get your feedback. It is very
> > | important that we have the requirements correct and complete before we
> > | start development of the project -- many of you are developers so I'm
> > | sure that you understand the importance of this.
> >
> > High level comments:
> >
> > - I don't think these requirements adequately address the distinction
> >   between a development system (where TCs actively revise documents,
> >   schemas, etc.) and a publication system (where TCs post working
> >   drafts, standards, and other "finished" work products).
> >
> >   Is the proposal to develop one or the other, or both? If it's one or
> >   the other, then I think some of these requirements are completely
> >   inappropriate. If it's both, I think it might be useful to specify
> >   them separately. (And whether you imagine having resources to do
> >   them in sequence, or at the same time?)
>
> I've previously thought of having a two-phase system, the first of which
> would provide a "sandbox" for the TC members to collaborate in
> developing a document. Then once the doc reached a certain stage it
> would then go into a more controlled environment with e.g. versioning
> and edited only by the TC.
> I've gotten the impression that most TCs would only use the second
> phase, but I could be wrong.
>
> Chairs: would you prefer having both of these phases built into the doc
> mgmt system (open collaboration, followed by more rigorous control)? Or
> would you only use the second?
>
> > - There are several places where the requirements seem to be
> >   self-contradictory.
>
> Specifics? This is obviously a draft and needs polishing, so suggestions
> are welcome.
>
> > - I think meeting all of the requirements listed below will be a
> >   significant challenge. A more detailed roadmap, showing staged
> >   progress with realistic time estimates, would be very helpful.
>
> Yeah. That's the next step. But right now I'm just gathering
> requirements. I can't very well write a development schedule until I
> know what it is that we're trying to build.
>
> I'd also like suggestions on which parts of this are most important. I'm
> debating whether we should try a phased development approach (i.e.
> provide base functionality now, then add more functionality over time).
> Looking through the requirements that I have now, though, I'm not sure
> which ones we could put off until later.
>
> Chairs: suggestions please.
>
> > - A number of the features that you describe would seem to be at least
> >   partially addressed by open source efforts like G-Forge (an open
> >   source version of SourceForge). Are you considering a system like
> >   that, or are you expecting to "roll your own" from scratch?
>
> I'm intending for us to build on top of an existing system. That's why I
> said "probably CVS". We'd be silly to build something from scratch when
> the engine already exists. We'll build some sort of customized web
> interface on top of the engine. Once we have the requirements we'll know
> what it is that we need to build. I'd also like suggestions for the
> engine; is CVS the way to go, or do people recommend something else?
>
> >>OASIS DocMgmt Functional Requirements
> >>
> >>(17 February 2004)
> >>
> >>General Description: A repository providing storage/management of
> >>files created by TCs, SCs, and other OASIS groups
> >
> > Technical committees need to be able to store and manage a collection
> > of resources. Principal among these resources are documents, but it's
> > reasonable to consider other, related resources as well, including
> > issue lists, archives, news items, and syndicated content.
>
> The doc mgmt system would store any type of file. Not just specs, but
> also the other doc types you mention.
>
> Would some of these stored objects be links and not files?
>
> >>  o Probably based on CVS
> >
> > The requirements for a "development tree" are likely to be somewhat
> > different than the requirements for a "publishing tree". In
> > particular, I would expect published standards to be more-or-less
> > immutable, to have persistent URIs, etc. In a development tree, those
> > constraints might be quite stifling.
> >
> > CVS supports a development system very well. It's not immediately
> > clear to me if it supports a publication system equally well.
>
> I'm certainly not a CVS expert, though I'm aware that it was built for
> development rather than documents. So it may not be ideal for what we
> want.
>
> Does anyone have suggestions for a better engine, better suited for doc
> development and publishing, upon which to build our system?
>
> >>  o A separate area in the repository for each TC/SC/group; both
> >>    default and definable hierarchy within each TC area
> >
> > Can you elaborate on what you mean by "both default and definable"?
> > What do you have in mind for "default"?
>
> When we create a new TC we would define hierarchy branches for such
> things as e.g. "drafts", "minutes", "contributions" etc. (TBD). Then the
> TC chair could define additional branches as required.
> We'd want to keep the hierarchy as flat as possible to keep the URLs
> short, and we'd want some consistency, but I want to give the TCs some
> control over their space.
>
> >>  o All documents are permanently archived (only Admin has delete
> >>    rights)
> >
> > In CVS terms, you can delete a document, but you can always recover
> > it. In a development tree, it's not uncommon to reorganize some code
> > or a document and want to remove modules from the current "head" of
> > the development tree. This goes back to my comment before that the
> > requirements for publication and development are somewhat different.
>
> Maybe this is where the "sandbox" (above) comes in. I don't see the need
> of permanently archiving early drafts, but once a doc is checked into
> the permanent repository it should be permanent.
>
> >>  o All documents are publicly viewable, downloadable
> >>
> >>  o Repository has a web interface for uploading and tree browsing,
> >>    searching, and retrieval
> >>
> >>     + Support for all major browsers
> >>
> >>     + Listing of single files includes filename, title, description,
> >>       date, creator, and language; listing of packages includes the
> >>       list of single files in the package
> >>
> >>     + Search by filename, title, date, creator, and language; and
> >>       full-text search of description and contents.
> >
> > Does it have other interfaces? Are you describing a front-end for CVS
> > here, or something else? Does it support Web-DAV?
>
> I would expect that most people would want to use a web interface, but I
> suppose that power users may want to deal more directly with the engine.
> But there are also certain safeguards (permissions, restrictions on
> naming, etc.) that may require that we use an interface. I don't know
> yet; this may depend on the engine.
>
> What are the benefits of Web-DAV? (I'm not an expert on this.)
>
> > I think it would make sense to address searching as its own top-level
> > item.
> > In particular, the description above suggests that every item
> > will have a set of metadata that can be searched. Where/when is this
> > metadata created? Can I add my own? Is it expressed in an open format,
> > an XML vocabulary or RDF or a topic map, or is it proprietary? How
> > does this metadata evolve as documents change in CVS?
>
> I see the metadata as comprised of the fields listed above. TBD. I don't
> know yet how this would be expressed because we haven't selected an
> engine yet.
>
> How does this matter? Yes, we should use XML on principle, but I don't
> see it as a requirement.
>
> > As for searching the content, that's clearly going to depend on the
> > type of content. What types will the system support?
>
> Obviously not all content will be searchable. If somebody uploads a blob
> there's not much we'll be able to do with it besides just store it.
>
> We will store whatever types of files the TCs need to store.
>
> >>Persistent URLs
> >>
> >>  o At file creation the document is assigned a URL according to the
> >>    OASIS file naming scheme. The URL will always resolve to the latest
> >>    version of the document, regardless of the document's (versioned)
> >>    filename; a URL will identify a specification throughout its entire
> >>    lifetime from working draft to OASIS Standard. Previous versions of
> >>    the document will be accessible via a variant of the URL containing
> >>    the version number.
> >
> > This is fine for storing standards but it's in conflict with the use
> > of CVS and the reference above to a "definable hierarchy".
>
> Again, I'm not an expert on what you can and can't do with CVS.
> Suggestions welcome.
>
> > I think this should apply to published standards and work products,
> > but I don't think it can practically be applied to a development
> > space.
>
> If we have a "sandbox" phase then we wouldn't expect a persistent URL
> for those items.
> Only once a doc is checked into the permanent repository would we do
> this.
>
> > This suggests that the interface to the published standards space
> > might require more constraints. I hope that these constraints can be
> > imposed without requiring me to interact with the system only through
> > a web interface.
>
> As above, power users like yourself may wish to talk directly to the
> engine, but there will be some constraints for security and consistency.
> If it is practical to enforce those constraints via both a web interface
> as well as a native interface then we will. But if it's not practical
> then we'll have to do everything through a browser.
>
> >>Multiple file types supported
> >>
> >>  o TCs will store both source (e.g. MSWord or HTML) and compiled (e.g.
> >>    PDF) versions of each file; i.e. the repository should not allow a
> >>    PDF to be checked in without a matching .doc or .html file
> >
> > Uhm, what about documents that have a source which is neither a
> > proprietary tool or HTML?
>
> The above is not an exhaustive list. I'm just suggesting that both
> source and compiled versions should be in the repository. Any
> responsible developer should agree with this philosophy.
>
> > Imposing the requirement that the system check for classes of
> > dependencies between files of different types is going to be tricky,
> > especially as the specs evolve. Suppose I rebuild the PDF, can I check
> > it in without checking in a new source document? What if I only
> > corrected a formatting bug? If I check in a new source, what happens
> > to the PDF?
>
> Yeah, we'll have to figure this out. How do you do it when you write
> code?
>
> > I think a lot more detail is required in this part of the
> > requirements.
>
> That's why I'm asking for input.
>
> >>  o HTML files may include graphics which will be stored with the file
> >>    (use relative URLs?)
> >
> > What about other cross-document links?
> > What about XML files that refer
> > to both HTML and PDF presentations? What about document trees that
> > consist of multiple chapters in a hierarchy with a common set of
> > figures?
> >
> > More detail, please.
>
> More input, please.
>
> >>  o use MIME types
> >>
> >>Packages
> >>
> >>  o A specification may be composed of multiple documents. The entire
> >>    package may be uploaded or downloaded in a single operation.
> >>    Individual documents in the package may also be uploaded or
> >>    downloaded.
> >
> > I don't understand what you mean here. Are you suggesting that I might
> > upload a package (as a ZIP file? as a MIME multi-part related stream?)
> > and then several days later upload a new version of one component in
> > that package? Having done so, what "version" does the package have?
> > Can I still download the original? Can I download the revised version?
>
> Probably the package will just be an HTML file with links to all of the
> components. In that case the package is updated by editing the links in
> the package file. Each of the components is maintained by editing it
> individually. Each component, as well as the package file, could have
> its own version number or date, but the entire set would collectively
> have to be versioned. Would this work?
>
> >>  o Support for chapters or parts of a multi-part document (with links
> >>    between parts); a package could have a ToC with links to the
> >>    individual files
> >
> > I think any attempt to describe the size and shape of a package ("it
> > will have a ToC and chapters" or "it will have a starting page and
> > parts") will be problematic. Best just to accept that a multi-part
> > document is a directed graph (a web).
>
> Would my description (above) of a package work for this? The TC can
> decide how it wants to structure the multi-part spec.
>
> >>  o Support for modular DTDs (e.g. DocBook)
> >
> > What does this requirement mean?
> > Do you also mean modular W3C XML
> > Schemas and RELAX NG grammars? Does this requirement differ from the
> > preceding one in a particular way?
>
> Pretty much the same, I think, but I'd be happy to hear other
> requirements not met by the above.
>
> >>  o The entire package is addressable via a single URL, as are the
> >>    individual documents. The package URL will link to an HTML page
> >>    listing the package contents.
> >
> > Is that an HTML page constructed by the author of the package, or
> > automatically from the content of the package? If it's the latter,
> > what constraints, if any, does that impose on the contents of the
> > package?
> >
> >>Security
> >>
> >>  o Check-in/out based on Kavi user authentication; different
> >>    permissions for public, TC members, chair/secretary, etc.
> >>
> >>  o TC members have ??? rights (TBD)
> >>
> >>  o TC Chair and Secretary have create, edit rights for folders and
> >>    checkin/out rights for documents in their respective TC area
> >>
> >>  o Admin has admin rights (create, checkin/out, delete of all folders
> >>    and files)
> >>
> >>  o Public has read rights for all documents
> >
> > How does "admin" differ from chair/secretary?
>
> "Admin" is the OASIS staff administrator of the doc mgmt system.
>
> >>Kavi integration
> >>
> >>  o Kavi user acct/pswd used for authentication in doc mgmt system
> >>
> >>  o Notification to the Kavi group when a document is uploaded (same as
> >>    current Kavi notification)
> >>
> >>  o The current Kavi doc repository is disabled; links within Kavi will
> >>    go to this doc mgmt system instead (i.e. Kavi doc repository is
> >>    hidden, this one drops in to replace it).
> >>
> >>  o Docs currently in the Kavi repository will continue to be
> >>    addressable and viewable by their Kavi URL (allow for migration over
> >>    time)
> >
> > This requirement and the previous requirement seem to be in conflict.
> > Can you explain how "the links within Kavi will go to this doc mgmt
> > system instead" supports the goal that "the Kavi repository will
> > continue to be addressable and viewable by their Kavi URL (allow for
> > migration over time)"?
>
> Right now when you're in Kavi you can click on a link for "doc
> repository" and it will take you to that page in Kavi. I'd like it to go
> to the new doc mgmt system instead. But we should allow current docs in
> the Kavi repository to stay where they are until the TC wants to move
> them, so these docs need to remain addressable by the current URLs.
> We'll have to keep the Kavi search/browse accessible, but the default
> would go to the new doc mgmt system.
>
> >>  o When new Kavi group (TC/SC) is created, a doc mgmt area for that
> >>    group and default folders are automatically created
> >
> > This goes back to the question of defaults before. What hierarchy do
> > you have in mind, and what are your motivations for creating it? I
> > think it'll be easier in the long run to simply create an empty
> > hierarchy and let the TCs populate it.
> >
> > If you have in mind that minutes should go in /minutes and press
> > clippings should go in /press, etc., then I think a detailed
> > description of the default hierarchy is required.
>
> See above. Still TBD, but we need both consistency as well as
> flexibility.
>
> >>File naming (automation of this done in a later phase; just do this
> >>manually at first?)
> >>
> >>  o Naming and versioning of documents follows OASIS file naming scheme
> >>
> >>  o When a new document is created it will be named according to the
> >>    scheme; automated helps to create/assign a name
> >
> > This seems to duplicate the requirements expressed under "Persistent
> > URLs". Is it intended to be different? I believe my comments there
> > apply here as well.
>
> The intent is to provide (eventually, maybe a bit later) a GUI to help
> name new files conformant with the OASIS doc naming scheme.
> I envision pull-downs to select each of the components of the name. But
> this will probably be later; the file creator would have to manually
> name the file for now.
>
> >>Localizable interface, with localization to occur in a later phase
> >>
> >>Later phase: Count/traffic report of downloads (how many people have
> >>downloaded a particular doc?)
> >
> > Other later phase items?
> >
> > - Issue tracking?
>
> Sounds like a separate tool. Yes, we need this. Suggestions?
>
> > - automatic generation of PDF/HTML from source formats?
>
> Yeah, we could add this, but is there a need? Can't people do this
> already?
>
> > - validation?
>
> Ditto. Can't you do this already?
>
> But, yes, I see the utility of having validation on checkin, and
> publishing, as part of a doc mgmt system.
>
> > - interactive forms (e.g., the ability to support an interface that
> >   asks a number of questions and then builds an appropriate schema
> >   customization layer)?
>
> That's the sort of interface I had in mind for the file naming (above).
> But I see this as a separate tool for later.
>
> > - Syndication of announcements
> > - An informal "journal" space (or blog, if you will) for TC members
> >   to outline their thoughts and ideas?
>
> Both of those are separate tools. Not sure how those would be part of a
> doc mgmt system.
>
> Thanks for the feedback. Much appreciated.
>
> -Karl
>
> > Be seeing you,
> >   norm
> >
> > P.S. I'm happy to report that your requirements document can be nicely
> > presented in an open format (plain text, in this case) instead of a
> > proprietary format. I hope that its greater accessibility in this
> > format (and the fact that it's six times smaller) can be used to
> > demonstrate once again the value of open standards.
> >
> > (For even more thoughts on this topic, see
> > http://www.gnu.org/philosophy/no-word-attachments.html)
>
> --
> =================================================================
> Karl F. Best
> Vice President, OASIS
> office  +1 978.667.5115 x206     mobile +1 978.761.1648
> karl.best@oasis-open.org      http://www.oasis-open.org
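
For what it's worth, the source/compiled rule quoted above ("the repository should not allow a PDF to be checked in without a matching .doc or .html file") could be prototyped independently of whichever engine is chosen. The following is only a minimal sketch in Python under stated assumptions: files are matched by identical path stem, the suffix sets are just the examples named in the requirements, and the function name `missing_sources` and its `repository` parameter are hypothetical, not part of CVS, Subversion, or any OASIS tooling.

```python
from pathlib import PurePosixPath

# Assumption: only the example formats named in the requirements draft.
SOURCE_SUFFIXES = {".doc", ".html"}
COMPILED_SUFFIXES = {".pdf"}


def _stem(path):
    """Path without its final suffix, e.g. 'drafts/spec-01.pdf' -> 'drafts/spec-01'."""
    return str(PurePosixPath(path).with_suffix(""))


def _suffix(path):
    return PurePosixPath(path).suffix.lower()


def missing_sources(checkin, repository=()):
    """Return compiled files in `checkin` that have no same-stem source file.

    A source counts if it is part of the same check-in or is already
    stored (listed in the hypothetical `repository` argument).
    """
    known = set(checkin) | set(repository)
    stems_with_source = {_stem(p) for p in known if _suffix(p) in SOURCE_SUFFIXES}
    return sorted(
        p
        for p in checkin
        if _suffix(p) in COMPILED_SUFFIXES and _stem(p) not in stems_with_source
    )


if __name__ == "__main__":
    # A lone PDF is flagged; a PDF accompanied by its .doc is accepted.
    print(missing_sources(["drafts/spec-01.pdf"]))
    print(missing_sources(["drafts/spec-01.pdf", "drafts/spec-01.doc"]))
```

Passing the already-stored files as `repository` is one possible answer to the "what if I only rebuild the PDF" question: a rebuilt PDF would be accepted on its own as long as a matching source is already in the repository. Whether the source must also be newer than the PDF is left open here, as it is in the thread.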