Subject: RE: [ubl-ndrsc] Containership issue with current LCSC samples
Point taken: performance is always an issue for someone. Just so everyone understands, though, the specific proposals we're discussing are:

(a) extra container elements for each element with numOccurs "large", and
(b) extra container elements to group elements (that are somehow not already grouped as an ABIE, yet for performance reasons should be shoved down one level in the tree representation).

Neither of these materially affects document size. The issue of session timeouts is driven by document size and processing time. If anything, the extra container elements will make documents slightly _bigger_ -- but not enough to worry about. So we're left with processing time as the only issue. How much processing time does it take to process a 60MB PO? What sort of difference do (a) and (b) make on important scenarios for most customers?

I believe that with the current structure, most users will be able to do all their processing, efficiently, with reasonably simple code. For customers that do high volume -- either lots of documents per tick, or very large documents, or both -- optimizations will be necessary. What kind of user does high volume, large documents? The _large_ company. Does the little guy have any problem here? No.

Now, how does it go: we should strive to make the typical case easy and the difficult case possible. Will it be _possible_ for the big guy to process UBL? Worst case, the big guy can do an O(N) single pass over a document, converting it into whatever structure he needs to optimize his hot access paths. He might transform a UBL doc into another XML doc and then work on that one. Alternatively he might build indexed in-memory structures, or write things off to a modern RDBMS. Whatever. UBL is an interoperability format -- not a database format. No matter how we structure it, _someone's_ processing will be inefficient.
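To make the "O(N) single pass" option concrete, here is a minimal sketch of building an indexed in-memory structure while streaming a document once. It is an illustration under stated assumptions, not a UBL processing recipe: the element names (Order, OrderLine, ItemID, Quantity) and the function index_lines_by_item are hypothetical and are not taken from the UBL schemas.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document; the element names are illustrative only
# and do not come from the UBL schemas.
SAMPLE = b"""<Order>
  <OrderLine><ItemID>A-100</ItemID><Quantity>3</Quantity></OrderLine>
  <OrderLine><ItemID>B-200</ItemID><Quantity>1</Quantity></OrderLine>
  <OrderLine><ItemID>A-100</ItemID><Quantity>5</Quantity></OrderLine>
</Order>"""


def index_lines_by_item(source):
    """Stream the document once and return {item_id: [quantity, ...]}."""
    index = defaultdict(list)
    # iterparse yields each element as its end tag is read, so the whole
    # document is visited exactly once, in document order.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "OrderLine":
            item_id = elem.findtext("ItemID")
            quantity = int(elem.findtext("Quantity"))
            index[item_id].append(quantity)
            # Discard this line's children once indexed to reduce memory use.
            elem.clear()
    return dict(index)


index = index_lines_by_item(io.BytesIO(SAMPLE))
```

After the pass, lookups by item go through the dictionary rather than through the tree, so the hot access path no longer depends on how the source document happens to be nested.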
A central lesson of 40 years of database practice is that it's a bad idea to contort your data model with too many processing concerns up-front, because you'll just get it wrong for half the people anyway. That's why DBMSs have indexing and statistics-based query optimizers -- so that you can come in _after_ the fact and, without changing your _design_ (analogous to our UBL schema), make new applications efficient.

So if the argument isn't about better understandability then I don't buy it. Show me the performance problem, and I'll show you the solution. The NDR around "containers" as it stands makes the typical case easy and the difficult case possible. It seems to me out of balance to spend more resources on this non-issue now, when much more urgent work needs attention.

-----Original Message-----
From: Jon Bosak [mailto:Jon.Bosak@sun.com]
Sent: Tuesday, March 04, 2003 11:24 PM
To: Burcham, Bill
Cc: agregory@aeon-llc.com; ubl-ndrsc@lists.oasis-open.org
Subject: Re: [ubl-ndrsc] Containership issue with current LCSC samples

| Bentley says: premature optimization is the root of all evil. I don't
| think anyone's going to be able to show a real world UBL document
| that's gonna have any real performance problem for a real world user.
| Prove me wrong ;-)

Without venturing an opinion on the design issue here, I must sadly inform you that XML performance in B2B contexts is becoming an issue. RosettaNet users are reporting single file sizes upwards of 60 MB for real-world POs, and some RN users are concerned about hitting a wall with automated timeouts for acknowledgments. This problem can of course be ameliorated by adjusting the timeouts and increasing net bandwidth, but we're starting to catch glimpses of a limitation in the size of documents that can be processed effectively without hardware upgrades -- not an easy thing to get budget for these days.

Not a comment on the debate, just a data point.
Jon