egov message

Subject: Re: [egov] Re: Need advice regarding XML performance issues


Sounds great - however I think the original poster was looking for some
more generic guidance rather than a specific product. :)

Joe

Jouko Salonen wrote:
> 
> Joe
> The test itself was designed by IBM in 2001. The reported performance test was run a year ago. Since then the XML pull parser (XPP) has been updated and, as far as we know, currently performs better.
> 
> The test environment can be downloaded from:
> http://www-106.ibm.com/developerworks/java/library/x-injava/
> 
> We can provide the X-Fetch Performer plugin for the above benchmarking framework.
> 
> --Jouko
> 
> -----Original Message-----
> From: Chiusano Joseph [mailto:chiusano_joseph@bah.com]
> Sent: 7 January 2004 16:06
> To: Jouko Salonen
> Cc: Duane Nickull; egov@lists.oasis-open.org; mwhughes@sandproof.org
> Subject: Re: [egov] Re: Need advice regarding XML performance issues
> 
> <Quote>
> This document displays a performance comparison between the most common
> XML processing techniques according to the benchmark package published
> in IBM Developerworks in September 2001.
> </Quote>
> 
> Hmmm...over 2 years old...
> 
> Jouko, would you have an updated version available that reflects today's
> environments? Or are you confident that the results still stand
> today?
> 
> Joe
> 
> Jouko Salonen wrote:
> >
> > Dear Mr. Hughes, Duane
> >
> > As Duane says, there are different aspects of "performance" in XML parsing and processing.
> > You might want to look at the attached XML Performance Test report and the stress curve picture.
> >
> > The report and further information can be found at:
> >
> > http://www.x-fetch.com/Component_WhitePapers_PDF/X-Fetch_Performer22_Benchmark.pdf
> >
> > Best regards
> > Jouko Salonen
> > www.reublica.fi
> >
> > -----Original Message-----
> > From: Duane Nickull [mailto:dnickull@adobe.com]
> > Sent: 5 January 2004 21:00
> > To: John.Borras@e-Envoy.gsi.gov.uk
> > Cc: egov@lists.oasis-open.org; mwhughes@sandproof.org
> > Subject: Re: [egov] Re: Need advice regarding XML performance issues
> >
> > Michael:
> >
> > Performance is a very loaded term.  We have had huge debates on this
> > going back to 1996 on  the XML-dev list.  I will try to recall some of
> > the points we agreed on.
> >
> > 1. Performance is affected largely by platform, programming language and
> > physical memory (*both heap and stack)
> > 2. SAX is an event-based model.  I have a PPT slide that explains the
> > concept of SAX very clearly at
> > http://www.nickull.net/presentations.html (download the one entitled
> > Washington - Day Three).  SAX works by reading in an XML document as a
> > one-dimensional stream of bytes.  When enough bytes have been read that
> > an "event" is recognized, an event notice is dispatched up the stack.
> > Conceptually, the event notices are simple messages that look something
> > like this:
> >
> > StartElement=["foo"];
> >
> > The above event is the parser's way of telling the parent that
> > instantiated it that it has encountered a start element named "foo".
> > Once the event has been dispatched, no residual memory of the event is
> > kept.  This makes SAX a preferable methodology for parsing when there
> > are strict memory requirements.
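
The event stream described above can be sketched with Python's standard
xml.sax module. This is only an illustration, not the Xerces API discussed in
the thread: the class name EventLogger and the function log_events are made up
for this example, and the events are collected in a list purely so they can be
printed. A real low-memory handler would act on each event and keep nothing.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records each element event as it arrives."""

    def __init__(self):
        super().__init__()
        self.events = []  # kept only for demonstration purposes

    def startElement(self, name, attrs):
        # Called by the parser as soon as a start tag has been recognized.
        self.events.append(f'StartElement=["{name}"]')

    def endElement(self, name):
        # Called by the parser as soon as an end tag has been recognized.
        self.events.append(f'EndElement=["{name}"]')


def log_events(xml_text):
    """Parse xml_text and return the element events in dispatch order."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in log_events("<foo><bar/></foo>"):
        print(event)
```

The parser itself holds no tree here; whatever state survives an event is
state the handler chose to keep.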
> >
> > Since XML does not contain any semantics, a parser is simply a reader.
> > Nothing is done with the XML except reading it, checking it for errors
> > and resolving entities (three mandatory items), plus a fourth, optional
> > item: validating it against a DTD or XML Schema.  Validation also
> > slows down parsing.
> >
> > The Java SAX implementation (Xerces) accordingly has four main handlers
> > (entity resolver, error handler, validation handler and event handler).
> > It is up to the programmer to capture all the events that get passed up
> > and do something meaningful with them. *** This is the place where a lot
> > of performance can be gained or lost!!!  Since just about all programs
> > that consume XML documents will eventually do something with them, the
> > skill of the programmer writing the handler code greatly affects things
> > like memory use, speed, etc.  If you use a language like Java with
> > automatic garbage collection, memory is managed for you, although you
> > can still tune it further.  If you work in a language like C or C++
> > (ANSI), the skill of the programmer is going to affect your system's
> > performance.
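
As a rough illustration of how much rests on the handler code, here is a
minimal Python sketch (again using the standard xml.sax module rather than
Xerces) of a handler that keeps only a running total and discards everything
else. The element name "amount", the class AmountTotaler and the function
total_amounts are assumptions invented for this example.

```python
import xml.sax


class AmountTotaler(xml.sax.ContentHandler):
    """Hypothetical handler: sums the text of every <amount> element."""

    def __init__(self):
        super().__init__()
        self.total = 0.0
        self._chunks = None  # None means "not currently inside <amount>"

    def startElement(self, name, attrs):
        if name == "amount":
            self._chunks = []

    def characters(self, content):
        # The parser may deliver one text node in several pieces, so the
        # pieces are buffered and joined once at the end tag.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "amount" and self._chunks is not None:
            self.total += float("".join(self._chunks))
            self._chunks = None  # drop the buffer; nothing else is retained


def total_amounts(xml_text):
    """Return the sum of all <amount> values in xml_text."""
    handler = AmountTotaler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.total
```

Memory use in this sketch depends on the largest single <amount> text, not on
the size of the document, which is the kind of choice that sits entirely with
whoever writes the handler.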
> >
> > 3. If you need to keep a model of the XML document and run a series
> > of programmatic tests against it, you will likely use the DOM.  DOM
> > (Document Object Model) works by accepting the events from the SAX
> > handler (* although use of SAX is not mandatory) and building an
> > in-memory representation of the original XML document.  Tests and queries
> > can then be run against the DOM tree to test for certain conditions,
> > etc.  Performance is greatly affected here by what kinds of tests you
> > will run against your XML tree.  This is a point of contention for those
> > who advocate XML automatically written out from a model, since not all
> > object models will result in XML that is efficient to query.  IMHO, a
> > balance has to be struck between the modeller's requirements and the
> > programmer's/system administrator's.  Anyway, XML like this:
> >
> > <root>
> >   <one/>
> >   <two/>
> >   <three/>
> > </root>
> >
> > will be easier on processor speed than this:
> >
> > <root>
> >   <one>
> >      <two>
> >         <three/>
> >      </two>
> >      <two>
> >          <three/>
> >      ...
> >
> > if you are iterating through a deep tree looking for matches.
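
One way to see what a tree walk has to deal with is to measure the shape of
the DOM before querying it. The Python sketch below uses the standard
xml.dom.minidom module; the function name walk_stats and the two sample
documents are made up for illustration, and the sketch only reports element
count and nesting depth. How much those numbers cost in practice depends on
the parser and the query, so it makes no timing claims.

```python
from xml.dom import minidom

# Hypothetical sample documents: one flat, one nested.
FLAT = "<root><one/><two/><three/></root>"
DEEP = "<root><one><two><three/></two><two><three/></two></one></root>"


def walk_stats(xml_text):
    """Return (element_count, max_depth) for the document's element tree."""
    document = minidom.parseString(xml_text)
    count = 0
    max_depth = 0
    # Explicit stack instead of recursion, so very deep trees cannot
    # exhaust the interpreter's recursion limit.
    stack = [(document.documentElement, 1)]
    while stack:
        node, depth = stack.pop()
        count += 1
        max_depth = max(max_depth, depth)
        for child in node.childNodes:
            if child.nodeType == child.ELEMENT_NODE:
                stack.append((child, depth + 1))
    return count, max_depth


if __name__ == "__main__":
    print("flat:", walk_stats(FLAT))
    print("deep:", walk_stats(DEEP))
```

Running the same measurement over real documents gives a quick sense of
whether a model-generated schema is producing unusually deep structures.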
> >
> > Summary:
> >
> > I have studied the performance issues for many years and will attest
> > that it is an extremely complex issue and the truths about it change
> > almost monthly as parsers are upgraded, new chipsets come out, new APIs
> > to the O/S are used, newer versions of garbage collection (Java, C#) are
> > invented, etc.  To be up to date is almost impossible; however, there is a
> > simple set of rules that can probably get you 85% of the way there.
> >
> > If you are eager to delve into this subject more thoroughly, please
> > contact me offline and I can provide you with some links etc.
> >
> > Cheers
> >
> > Duane Nickull
> >
> > John.Borras@e-Envoy.gsi.gov.uk wrote:
> >
> > >
> > > TC Members
> > >
> > > Can anyone provide any pointers or advice to Michael please?
> > >
> > > If there are any commercial sensitivities about your advice then I'll
> > > leave you to negotiate directly with him for providing that advice.
> > >
> > > John
> > >
> > >
> > > "Mike Hughes" <mwhughes@sandproof.org>
> > >
> > > 31/12/2003 21:12
> > >
> > > To
> > > <john.borras@e-envoy.gsi.gov.uk>
> > > cc
> > >
> > > Subject
> > > Need advice regarding XML performance issues
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > Dear Mr. Borras,
> > >
> > > I am researching performance issues related to XML and need to speak
> > > with an expert.  I understand that you are Chairman of the OASIS
> > > e-Government Technical Committee.
> > >
> > > I would appreciate it if you could reply to this email with the names
> > > of people who could provide related input, particularly with regard to
> > > specific standards and conditions that affect performance very
> > > adversely.  Also, I need to understand what measures, commercial or
> > > standards-related, are being taken to resolve such performance problems.
> > >
> > > Thank you for any assistance you can provide.
> > >
> > > Sincerely,
> > >
> > > Michael W. Hughes
> > > Amplicast
> > > Erie, CO
> > > USA
> > >
> > >
> > > PLEASE NOTE: THE ABOVE MESSAGE WAS RECEIVED FROM THE INTERNET.
> > >
> > > On entering the GSI, this email was scanned for viruses by the
> > > Government Secure Intranet (GSI) virus scanning service supplied
> > > exclusively by Cable & Wireless in partnership with MessageLabs.
> > >
> > > GSI users see
> > > http://www.gsi.gov.uk/main/notices/information/gsi-003-2002.pdf for
> > > further details. In case of problems, please call your organisational
> > > IT helpdesk.
> > >
> >
> > --
> > Senior Standards Strategist
> > Adobe Systems, Inc.
> > http://www.adobe.com
> >
> > To unsubscribe from this mailing list (and be removed from the roster of the OASIS TC), go to http://www.oasis-open.org/apps/org/workgroup/egov/members/leave_workgroup.php.
> >
> >   ------------------------------------------------------------------------
> >                                        Name: XML_PERFORMANCE_BENCHMARK.pdf
> >    XML_PERFORMANCE_BENCHMARK.pdf       Type: Acrobat (application/pdf)
> >                                    Encoding: base64
> >                                 Description: XML_PERFORMANCE_BENCHMARK.pdf
> >
> >                         Name: stresstest.jpg
> >    stresstest.jpg       Type: JPEG Image (image/jpeg)
> >                     Encoding: base64
> >                  Description: stresstest.jpg
> >
> >   ------------------------------------------------------------------------

