Applications that require access to the entire document at once in order to take useful action would be better served by one of the tree-based APIs like DOM or JDOM. Finally, because SAX is so efficient, it’s the only real choice for truly huge XML documents.

Java developers already had adequate Unicode support, however, and thus Java parsers were a lot faster out of the gate. Nonetheless, it still probably isn’t possible to write a fully conformant XML parser in a weekend, even in Java. There are several dozen XML parsers available under a variety of licenses that you can use.

This makes SAX very fast and very memory efficient (since it doesn’t have to store the entire document in memory). However, SAX programs can be harder to design and code because you normally need to develop your own data structures to hold the content from the document.

On the other extreme, the DPH was assumed to be Larry Wall, and he was allowed two months for the task. The middle ground was a smart grad student and a couple of weeks.

In fact, it took Larry Wall more than a couple of months just to add the Unicode support to Perl that XML assumed.

SAX, the Simple API for XML, is the gold standard of XML APIs. Given a fully validating parser that supports all its optional features, there is very little you can’t do with it.

It has one or two holes, but they're really off in the weeds of the XML specifications, and you have to look pretty hard to find them. The SAX classes and interfaces model the parser, the stream from which the document is read, and the client application receiving data from the parser. No class models the XML document itself, however. Instead, the parser feeds content to the client application through a callback interface, much like the ones used in Swing and the AWT.
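To make the callback model concrete, here is a minimal sketch of a SAX client. It assumes the JAXP `SAXParserFactory` that ships with the JDK as the way to obtain a parser; the class name `TitleCollector`, the method `collectTitles`, and the `title` element are hypothetical names chosen for illustration. The parser never hands over a document object. It only calls `startElement`, `characters`, and `endElement` as it reads, and the handler keeps whatever it needs in its own data structure, here a plain list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects the text of every <title> element.
public class TitleCollector extends DefaultHandler {

    // Our own data structure; SAX itself stores nothing from the document.
    private final List<String> titles = new ArrayList<>();

    // Non-null only while the parser is inside a <title> element.
    private StringBuilder current;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        if ("title".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver text in several chunks, so append each one.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName) && current != null) {
            titles.add(current.toString());
            current = null;
        }
    }

    // Parses the given XML text and returns the collected titles.
    public static List<String> collectTitles(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TitleCollector handler = new TitleCollector();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A</title></book>"
                   + "<book><title>B</title></book></books>";
        System.out.println(collectTitles(xml));
    }
}
```

Because the handler keeps only the strings it cares about, memory use depends on what you choose to store rather than on the size of the input. The price is that the bookkeeping (here, the `current` buffer that tracks whether the parser is inside a `title` element) is yours to write.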

Of course, as Fred Brooks taught us, “In most projects, the first system built is barely usable. It may be too slow, too big, awkward to use, or all three.”


