egov message


Subject: Re: [egov] Re: Need advice regarding XML performance issues


<Quote>
This document displays a performance comparison between the most common
XML processing techniques according to the benchmark package published
in IBM Developerworks in September 2001.
</Quote>

Hmmm...over 2 years old...

Jouko, would you have an updated version available that reflects today's
environments? Or are you confident that the results still stand today?

Joe

Jouko Salonen wrote:
> 
> Dear Mr. Hughes, Duane
> 
> As Duane says there are different aspects of "performance" in XML parsing and processing.
> You might want to look at the attached XML Performance Test report and the stress curve picture.
> 
> The report and further information can be found at:
> 
> http://www.x-fetch.com/Component_WhitePapers_PDF/X-Fetch_Performer22_Benchmark.pdf
> 
> Best regards
> Jouko Salonen
> www.reublica.fi
> 
> -----Original Message-----
> From: Duane Nickull [mailto:dnickull@adobe.com]
> Sent: 5 January 2004 21:00
> To: John.Borras@e-Envoy.gsi.gov.uk
> Cc: egov@lists.oasis-open.org; mwhughes@sandproof.org
> Subject: Re: [egov] Re: Need advice regarding XML performance issues
> 
> Michael:
> 
> Performance is a very loaded term.  We have had huge debates on this
> going back to 1996 on the XML-dev list.  I will try to recall some of
> the points we agreed on.
> 
> 1. Performance is affected largely by platform, programming language and
> physical memory (both heap and stack).
> 2. SAX is an event-based model.  I have a PPT slide that explains the
> concept of SAX very clearly at
> http://www.nickull.net/presentations.html (download the one entitled
> Washington - Day Three).  SAX works by reading in an XML document as a
> one-dimensional stream of bytes.  When enough bytes have been read that an
> "event" is recognized, an event notice is dispatched up the stack.  The
> event notices are simple text messages that look something like this:
> 
> StartElement=["foo"];
> 
> The above event is the parser's way of telling the parent that
> instantiated it that it has encountered a start element named "foo".
> Once the event has been dispatched, no residual memory of the event is
> kept.  This makes SAX a preferable methodology for parsing when there
> are strict memory requirements.
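
The event stream described above can be sketched in a few lines. This is only an illustration using Python's standard xml.sax module, not the Xerces parser discussed in this thread; the names EventLogger and log_events are hypothetical. Note that the list of events is kept by the handler purely so the result can be inspected; the parser itself retains nothing once an event has been dispatched.

```python
import io
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records each SAX event as a short text line, in arrival order."""

    def __init__(self):
        super().__init__()
        # Kept only for demonstration; a memory-constrained handler
        # would act on each event and then discard it.
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(f"StartElement={name}")

    def endElement(self, name):
        self.events.append(f"EndElement={name}")


def log_events(xml_bytes):
    """Parse a byte string and return the events the handler received."""
    handler = EventLogger()
    xml.sax.parse(io.BytesIO(xml_bytes), handler)
    return handler.events


if __name__ == "__main__":
    for event in log_events(b"<foo><bar/></foo>"):
        print(event)
```

Run as a script, this prints one line per start and end event, beginning with StartElement=foo.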
> 
> Since XML does not contain any semantics, a parser is simply a reader.
> Nothing is done with the XML except reading it, checking it for errors
> and resolving entities (three mandatory items), plus a fourth, optional
> item: validating it against a DTD or XML Schema.  The latter also
> slows down parsing.
> 
> The Java SAX implementation (Xerces) accordingly has four main handlers
> (entity resolver, error handler, DTD handler and content (event) handler).
> It is up to the programmer to capture all the events that get passed up
> and do something meaningful with them. *** This is the place where a lot
> of performance can be gained or lost!!!  Since just about all programs
> that consume XML documents will eventually do something with them, the
> skill of the programmer writing the handler code greatly affects things
> like memory use and speed.  If you use a language like Java with automatic
> garbage collection, memory is managed for you; however, you can still
> tune it further.  If you work in a language like C or C++ (ANSI), the
> skill of the programmer is going to affect your system's performance.
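
To make the point about handler code concrete, here is a minimal sketch, again with Python's standard xml.sax rather than Xerces. CountingHandler and count_elements are hypothetical names. The handler keeps a single integer, so its memory use stays flat however large the document is; a handler that appended every event to a list would instead grow with the input.

```python
import io
import xml.sax


class CountingHandler(xml.sax.ContentHandler):
    """Counts matching start tags; holds no per-element state."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.count = 0

    def startElement(self, name, attrs):
        if name == self.wanted:
            self.count += 1


def count_elements(xml_bytes, wanted):
    """Return how many elements named `wanted` occur in the document."""
    parser = xml.sax.make_parser()
    handler = CountingHandler(wanted)
    # The programmer supplies the handlers; the parser only dispatches.
    parser.setContentHandler(handler)
    # The stock ErrorHandler raises SAXParseException on malformed input.
    parser.setErrorHandler(xml.sax.ErrorHandler())
    parser.parse(io.BytesIO(xml_bytes))
    return handler.count


if __name__ == "__main__":
    doc = b"<root><item/><item/><other/></root>"
    print(count_elements(doc, "item"))  # prints 2
```

Malformed input such as b"<root><item></root>" makes the call raise xml.sax.SAXParseException, which is the error-checking duty mentioned above.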
> 
> 3. If you need to keep a model of the XML document and run a series
> of programmatic tests against it, you will likely use the DOM.  DOM
> (Document Object Model) works by accepting the events from the SAX
> handler (although use of SAX is not mandatory) and building an
> in-memory representation of the original XML document.  Tests and queries
> can then be run against the DOM tree to test for certain conditions,
> etc.  Performance is greatly affected here by what kinds of tests you
> will run against your XML tree.  This is a point of contention for those
> who advocate XML automatically written out from a model, since not all
> object models will result in XML that is efficient to query.  IMHO, a
> balance has to be struck between the modeller's requirements and those
> of the programmers/system administrators.  Anyway, XML like this:
> 
> <root>
>   <tag-one/>
>   <tag-two/>
>   <tag-three/>
> </root>
> 
> will be easier on the processor than this:
> 
> <root>
>   <tag-one>
>      <tag-two>
>         <tag-three/>
>      </tag-two>
>      <tag-two>
>          <tag-three/>
>      ...
> 
> if you are iterating through a deep tree looking for matches.
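
The structural difference between the two samples can be shown with a small DOM sketch. It uses Python's standard xml.dom.minidom, and the FLAT and NESTED strings are well-formed stand-ins of my own for the samples above (the nested one is cut down to a single branch). It measures only tree depth, not time, so treat it as an illustration of why a search must descend further in the nested form rather than as a benchmark.

```python
from xml.dom import minidom

# Well-formed stand-ins for the two document shapes (assumed names).
FLAT = "<root><tag-one/><tag-two/><tag-three/></root>"
NESTED = (
    "<root><tag-one><tag-two><tag-three/></tag-two></tag-one></root>"
)


def max_depth(node):
    """Number of element levels from `node` down to its deepest descendant."""
    child_depths = [
        max_depth(child)
        for child in node.childNodes
        if child.nodeType == child.ELEMENT_NODE
    ]
    return 1 + max(child_depths, default=0)


def depth_and_matches(xml_text, name):
    """Build the in-memory tree, then query it for elements called `name`."""
    doc = minidom.parseString(xml_text)
    matches = doc.getElementsByTagName(name)
    return max_depth(doc.documentElement), len(matches)


if __name__ == "__main__":
    print(depth_and_matches(FLAT, "tag-three"))    # prints (2, 1)
    print(depth_and_matches(NESTED, "tag-three"))  # prints (4, 1)
```

Both documents contain the same match, but in the nested form it sits four levels down instead of two.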
> 
> Summary:
> 
> I have studied the performance issues for a lot of years and will attest
> that it is an extremely complex issue and the truths about it change
> almost monthly as parsers are upgraded, new chipsets come out, new APIs
> to the O/S are used, newer versions of garbage collection (Java, C#) are
> invented, etc.  To be up to date is almost impossible; however, there is
> a simple set of rules that can probably get you 85% of the way there.
> 
> If you are eager to delve into this subject more thoroughly, please
> contact me offline and I can provide you with some links etc.
> 
> Cheers
> 
> Duane Nickull
> 
> John.Borras@e-Envoy.gsi.gov.uk wrote:
> 
> >
> > TC Members
> >
> > Can anyone provide any pointers or advice to Michael please?
> >
> > If there are any commercial sensitivities about your advice then I'll
> > leave you to negotiate directly with him for providing that advice.
> >
> > John
> >
> >
> > "Mike Hughes" <mwhughes@sandproof.org>
> >
> > 31/12/2003 21:12
> >
> > To
> > <john.borras@e-envoy.gsi.gov.uk>
> > cc
> >
> > Subject
> > Need advice regarding XML performance issues
> >
> > Dear Mr. Borras,
> >
> > I am researching performance issues related to XML and need to speak
> > with an expert.  I understand that you are Chairman of the OASIS
> > e-Government Technical Committee.
> >
> > I would appreciate it if you could reply to this email with the names
> > of people who could provide related input, particularly with regard to
> > specific standards and conditions that affect performance very
> > adversely.  Also, I need to understand what measures, commercial or
> > standards-related, are being taken to resolve such performance problems.
> >
> > Thank you for any assistance you can provide.
> >
> > Sincerely,
> >
> > Michael W. Hughes
> > Amplicast
> > Erie, CO
> > USA
> >
> >
> 
> --
> Senior Standards Strategist
> Adobe Systems, Inc.
> http://www.adobe.com
> 
> To unsubscribe from this mailing list (and be removed from the roster of the OASIS TC), go to http://www.oasis-open.org/apps/org/workgroup/egov/members/leave_workgroup.php.
> 
>   ------------------------------------------------------------------------
>                                        Name: XML_PERFORMANCE_BENCHMARK.pdf
>    XML_PERFORMANCE_BENCHMARK.pdf       Type: Acrobat (application/pdf)
>                                    Encoding: base64
>                                 Description: XML_PERFORMANCE_BENCHMARK.pdf
> 
>                         Name: stresstest.jpg
>    stresstest.jpg       Type: JPEG Image (image/jpeg)
>                     Encoding: base64
>                  Description: stresstest.jpg
> 
>   ------------------------------------------------------------------------

