Subject: Re: [egov] Re: Need advice regarding XML performance issues
<Quote>
This document displays a performance comparison between the most common
XML processing techniques according to the benchmark package published
in IBM Developerworks in September 2001.
</Quote>

Hmmm...over 2 years old...

Jouko, would you have an updated version available that reflects
today's environments? Or are you confident that the results still
stand today?

Joe

Jouko Salonen wrote:
>
> Dear Mr. Hughes, Duane
>
> As Duane says, there are different aspects of "performance" in XML
> parsing and processing.
> You might want to look at the attached XML Performance Test report
> and the stress curve picture.
>
> The report and further information can be found at:
>
> http://www.x-fetch.com/Component_WhitePapers_PDF/X-Fetch_Performer22_Benchmark.pdf
>
> Best regards
> Jouko Salonen
> www.reublica.fi
>
> -----Original Message-----
> From: Duane Nickull [mailto:dnickull@adobe.com]
> Sent: 5 January 2004 21:00
> To: John.Borras@e-Envoy.gsi.gov.uk
> Cc: egov@lists.oasis-open.org; mwhughes@sandproof.org
> Subject: Re: [egov] Re: Need advice regarding XML performance issues
>
> Michael:
>
> Performance is a very loaded term. We have had huge debates on this
> going back to 1996 on the XML-dev list. I will try to recall some of
> the points we agreed on.
>
> 1. Performance is affected largely by platform, programming language and
> physical memory (*both heap and stack)
> 2. SAX is an event-based model. I have a PPT slide that explains the
> concept of SAX very clearly at
> http://www.nickull.net/presentations.html (download the one entitled
> Washington - Day Three). SAX works by reading in an XML document as a
> one-dimensional stream of bytes. When enough bytes are read that an
> "event" is recognized, an event notice is dispatched up the stack.
> The event notices are simple text messages that look something like
> this:
>
> StartElement=["foo"];
>
> The above event is the parser's way of telling the parent that
> instantiated it that it has encountered a start element named "foo".
> Once the event has been dispatched, no residual memory of the event is
> kept. This makes SAX a preferable methodology for parsing when there
> are strict memory requirements.
>
> Since XML does not contain any semantics, a parser is simply a reader.
> Nothing is done with the XML except reading it, checking it for errors
> and resolving entities (three mandatory items) and a fourth optional
> item of validating it against a DTD or XML Schema. The latter also
> slows down parsing.
>
> The Java SAX implementation (Xerces) accordingly has four main handlers
> (entity resolver, error handler, validation handler and event handler).
> It is up to the programmer to capture all the events that get passed up
> and do something meaningful with them. *** This is the place where a lot
> of performance can be gained or lost!!! Since just about all programs
> that consume XML documents will eventually do something with them, the
> skill of the programmer writing the handler code greatly affects things
> like memory, speed etc. If you use a language like Java with automatic
> garbage collection, your memory is managed for you; however, you
> can still tune it further. If you work in a language like C or C++
> (ANSI), the skill of the programmer is going to affect your system's
> performance.
>
> 3. If one needs to keep a model of the XML document and run a series
> of programmatic tests against it, you will likely use the DOM. DOM
> (Document Object Model) works by accepting the events from the SAX
> handler (* although use of SAX is not mandatory) and building an
> in-memory representation of the original XML document. Tests and queries
> can then be run against the DOM tree to test for certain conditions,
> etc.
> Performance is greatly affected here by what kinds of tests you
> will run against your XML tree. This is a point of contention for those
> who advocate XML automatically written out from a model, since not all
> object models will result in XML that is efficient to query. IMHO - a
> balance has to be struck between the modeller's requirements and the
> programmers'/system administrators'. Anyways, XML like this:
>
> <root>
>   <tag one/>
>   <tag two/>
>   <tag three/>
> </root>
>
> will be easier on processor speed than this:
>
> <root>
>   <tag one>
>     <tag two>
>       <tag three/>
>     </tag two>
>     <tag two>
>       <tag three/>
> ...
>
> if you are iterating through a deep tree looking for matches.
>
> Summary:
>
> I have studied the performance issues for a lot of years and will attest
> that it is an extremely complex issue and the truths about it change
> almost monthly as parsers are upgraded, new chipsets come out, new APIs
> to the O/S are used, newer versions of garbage collection (Java, C#) are
> invented etc. To be up to date is almost impossible; however, there is a
> simple set of rules that can probably get you 85% of the way there.
>
> If you are eager to delve into this subject more thoroughly, please
> contact me offline and I can provide you with some links etc.
>
> Cheers
>
> Duane Nickull
>
> John.Borras@e-Envoy.gsi.gov.uk wrote:
>
> >
> > TC Members
> >
> > Can anyone provide any pointers or advice to Michael please?
> >
> > If there are any commercial sensitivities about your advice then I'll
> > leave you to negotiate directly with him for providing that advice.
> >
> > John
> >
> >
> > From: "Mike Hughes" <mwhughes@sandproof.org>
> > Sent: 31/12/2003 21:12
> > To: <john.borras@e-envoy.gsi.gov.uk>
> > cc:
> > Subject: Need advice regarding XML performance issues
> >
> >
> > Dear Mr. Borras,
> >
> > I am researching performance issues related to XML and need to speak
> > with an expert.
> > I understand that you are Chairman of the OASIS
> > e-Government Technical Committee.
> >
> > I would appreciate it if you could reply to this email with the names
> > of people who could provide related input, particularly with regard to
> > specific standards and conditions that affect performance very
> > adversely. Also, I need to understand what measures, commercial or
> > standards-related, are being taken to resolve such performance problems.
> >
> > Thank you for any assistance you can provide.
> >
> > Sincerely,
> >
> > Michael W. Hughes
> > Amplicast
> > Erie, CO
> > USA
> >
> >
> > PLEASE NOTE: THE ABOVE MESSAGE WAS RECEIVED FROM THE INTERNET.
> >
> > On entering the GSI, this email was scanned for viruses by the
> > Government Secure Intranet (GSI) virus scanning service supplied
> > exclusively by Cable & Wireless in partnership with MessageLabs.
> >
> > GSI users see
> > http://www.gsi.gov.uk/main/notices/information/gsi-003-2002.pdf for
> > further details. In case of problems, please call your organisational
> > IT helpdesk.
> >
>
> --
> Senior Standards Strategist
> Adobe Systems, Inc.
> http://www.adobe.com
>
> To unsubscribe from this mailing list (and be removed from the roster
> of the OASIS TC), go to
> http://www.oasis-open.org/apps/org/workgroup/egov/members/leave_workgroup.php.
>
> ------------------------------------------------------------------------
> Name: XML_PERFORMANCE_BENCHMARK.pdf
> Type: Acrobat (application/pdf)
> Encoding: base64
> Description: XML_PERFORMANCE_BENCHMARK.pdf
>
> Name: stresstest.jpg
> Type: JPEG Image (image/jpeg)
> Encoding: base64
> Description: stresstest.jpg
> ------------------------------------------------------------------------
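
As a rough illustration of the SAX and DOM points in Duane's message, here is a minimal Python sketch using the standard library's xml.sax and xml.dom.minidom modules (not the Xerces parser discussed in the thread). The sample document, the DepthHandler class, and both function names are assumptions made up for this example. The SAX handler keeps only running counters as events arrive, while the DOM version builds the whole tree in memory before it can be queried.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, not taken from the thread.
SAMPLE = b"<root><item/><item><item/></item></root>"


class DepthHandler(xml.sax.ContentHandler):
    """Receives SAX events and keeps only counters, never the document."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per start tag as the parser reads the stream.
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        self.counts[name] = self.counts.get(name, 0) + 1

    def endElement(self, name):
        # Called once per end tag; the element is then forgotten.
        self.depth -= 1


def scan_with_sax(data):
    """Return (element counts, maximum nesting depth) in one streaming pass."""
    handler = DepthHandler()
    xml.sax.parseString(data, handler)
    return handler.counts, handler.max_depth


def count_with_dom(data, name):
    """Build the full in-memory tree, then query it for a tag name."""
    doc = minidom.parseString(data)
    return len(doc.getElementsByTagName(name))


if __name__ == "__main__":
    print(scan_with_sax(SAMPLE))
    print(count_with_dom(SAMPLE, "item"))
```

The memory held by the SAX handler here depends on the number of distinct tag names, not on document size, which is the trade-off described above. The DOM query is simpler to write but requires the whole tree to be resident first.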