Subject: The value of re-use and interchange
In our discussion we've talked a lot about re-use and interchange. I want to make sure we're being realistic about the relative value of the different kinds of re-use and interchange, because I think some of that value is being, or may be, overstated. But that could just be my jaded view of the world.

As background: I've been working with SGML and XML for almost 20 years now in the context of industrial-scale technical documentation authoring, management, production, and delivery. For all of that time my primary focus has been on satisfying the following requirements:

- Enable the creation, management, and delivery of large systems of interlinked documents.

- Enable interchange of structured content horizontally across organizations and enterprises, and vertically from the past to the future within a single organization.

- Enable optimization of XML applications through controlled local specialization, meeting both local and global business requirements at the lowest overall cost (both implementation cost and opportunity cost).

In that time I've been exposed to, worked with, or been involved with a number of industry groups trying to promote interchange of documents between suppliers and customers, among government entities, and so on. I've worked with companies trying to create bodies of modular, re-usable content in order to reduce authoring costs and time-to-market for product documentation. I've worked with enterprises for whom link management is a key business driver. I've worked with enterprises for which localization of content is a key business driver (which is my current professional focus, in the context of using XSL-FO to render documents in most of the world's modern national languages).

Out of this experience I've learned:

- It has been almost impossible to realize any great benefit from cross-enterprise interchange of content. I think this is for several reasons:

  1. Early SGML systems were frightfully expensive and the infrastructure was not sufficiently mature. This has changed to some degree, but not entirely (for example, the Arbortext tools, which I consider essentially required for any large-scale, productive technical documentation system, are priced to reflect that value).

  2. The SGML applications were not particularly well designed and tended to add unnecessary impedance to the information creation and management tasks. Unfortunately, I haven't seen much improvement here.

  3. The nature of any wide-scope application is that it will be suboptimal for most local requirements and provide few mechanisms for local, controlled specialization. This is simply a fact. The only hope of alleviating it is a controlled specialization mechanism such as the one in DITA.

  4. There are always myriad practical details that make the job harder than it originally seemed. This too is unavoidable.

  5. The rate of technology change has increased so quickly that by the time you define and deploy a system that truly enables interchange, the world has changed underneath you, at best eroding the value of the system, at worst making it obsolete or irrelevant. At the same time, enterprises have, for good or ill, shifted toward a much narrower near-term focus, making it harder to fund and justify long-range projects that cannot be immediately justified by cost savings.

  Certainly interchange has been made to work in some cases, but I think the business value realized has been much lower than was originally promised or hoped. There's still a significant start-up cost in time and money that is hard to get over.

- Doing content re-use is much harder than people usually think, for a number of reasons. One is that it makes authoring harder. Another is that it makes link management and versioning harder because of the dependencies it creates. It also makes quality assurance harder.
  In essence, the human issues of configuration management and communication among authors add significant cost and may (though not always) offset the value of re-use. Whether there is truly a benefit to modular re-use for a given business depends on many variables, including the nature of the things documented, the requirement for correctness (is this a mobile phone with a 2-year product life cycle or a commercial aircraft with a 50-year product life cycle and critical safety requirements?), the sophistication of the authors involved, and so on.

  For example, consider the effort involved in creating a DITA map over a repository of several thousand content objects. Even with sophisticated authoring tools it's a significant conceptual challenge that many technical writers are simply not prepared for or willing to take on. This means you likely have to hire, train, support, and retain highly skilled information developers to create and maintain your maps. Good for the skilled writers, but an additional cost to the organization compared with accepting less direct re-use and employing less skilled (but equally effective) writers. That is, sometimes the less sophisticated, brute-force approach to document creation and management is the better business decision, even though it's less elegant technologically.

- Link management in the context of modular information systems is a challenge that requires significant investment in information management infrastructure. There are, to date, no commercial tools that, in my opinion, satisfy this requirement, especially in the context of long product life cycles with lots of revision. I think I know *how* to solve this problem, and we (ISOGEN) have published our ideas and urged anyone who wants to to implement them, but to date nobody has (we did, but for business reasons have been unable to market that code).

- Content interchange *within* enterprises is generally much more valuable than interchange *across* enterprises.
  That is, I can get a lot of value interchanging content between the product group, the training group, and the sales group, but much less value interchanging between myself and my print-engine supplier, for the simple reasons that the cost of enabling that cross-enterprise interchange is high, the actual volume of data interchanged is low, and the interchanged data will likely need local re-authoring anyway. In practice it's easier to do interchange via transformation than via standardization across enterprise boundaries, except where volumes are high or there's some other atypical requirement that demands standardization. Implementing transforms is cheap compared with the cost of defining, implementing, and enforcing interchange standards.

- The cost of implementing production (rendering) systems is much, much lower than the cost of creating and maintaining the data. That is, the cost of authoring and maintenance is high, and the value of having well-structured data is high, but the cost of implementing transforms to do stuff with that data is low. Now that we have technologies like XSLT, XSL-FO, SAX, and DOM, and no shortage of people who can apply them well, transforms and the like are essentially commodities, no different from any other code you might have written. They also have relatively low long-term maintenance costs--we can reasonably expect that XSLT skills will be widely available 10 or 20 years from now. Therefore, the value of being able to re-use existing code is relatively low, especially compared to the overall cost of a total information support system. So while using the DITA-provided XSLTs is useful for getting something working quickly, you'd be much less likely to depend on them for a production system, because that system probably requires lots of things the DITA-provided code simply wouldn't have, from conformance to your local engineering practices to business-specific functionality.
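To make the interchange-by-transformation point above concrete, here is a minimal sketch of the approach: rather than forcing both sides to author against one standardized doctype, map a supplier's vocabulary onto the local one with a simple rename table. The element names are hypothetical, and in practice this would be an XSLT stylesheet; Python's standard-library ElementTree just keeps the example self-contained.

```python
# Minimal sketch of interchange by transformation, assuming hypothetical
# supplier element names (chap, para, emph) and local ones (chapter, p, i).
import xml.etree.ElementTree as ET

RENAME = {            # supplier element -> local element (assumed names)
    "chap": "chapter",
    "para": "p",
    "emph": "i",
}

def localize(elem):
    """Recursively rename elements per the table, leaving everything else alone."""
    elem.tag = RENAME.get(elem.tag, elem.tag)
    for child in elem:
        localize(child)
    return elem

supplier_doc = ET.fromstring("<chap><para>See <emph>notes</emph>.</para></chap>")
result = ET.tostring(localize(supplier_doc), encoding="unicode")
print(result)  # -> <chapter><p>See <i>notes</i>.</p></chapter>
```

The rename table is the whole interchange agreement: when the supplier's vocabulary changes, you update a mapping rather than renegotiate a shared doctype.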
So while code re-use is always valuable, I find its value relative to other values and costs to be generally non-compelling, simply because it tends to have diminishing value as a given system becomes more sophisticated and more specialized. The place where I find code re-use most valuable is in the implementation of core generic semantics, like link address resolution, transclusion resolution, and so on, all of which are (or can be) completely generic and independent of specific content semantics. For example, I've only ever written XPath resolution in XSLT once, but I've written templates to format chapters dozens of different ways. If the question is "create an enterprise-specific document type or re-use the production tools for standard doctype X," it's not even an issue--the enterprise-specific document type wins every time, because satisfying the enterprise's information capture and representation requirements is almost always the most important thing (and always is if the time scope of the system is anything more than a year or two).

So to summarize, I think it is easy to oversell and overvalue the following:

- Cross-enterprise standards-based interchange of content (that is, interchange in terms of a standardized document type, rather than by transformation).

- Wide-scope re-use of content modules.

- Re-use of existing code, especially for rendition.

It is this experience and analysis that causes me to focus much more on the core infrastructure aspects of something like DITA, i.e., the specialization mechanism and the general shape of the base types, than on the code of the moment or on issues of cross-enterprise interchange.

Cheers,

Eliot
--
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8122
eliot@innodata-isogen.com
www.innodata-isogen.com