Advanced Java and Web Technologies for JNTUK
Blog providing beginner tutorials on different web technologies like HTML, CSS, JavaScript, PHP, MySQL, XML, Java Beans, Servlets, JSP and AJAX

XML Processors

11/03/2015 Categories: XML

This article explains XML processors and two popular APIs for parsing XML documents, SAX and DOM. Example Java programs are also provided.

XML processors are needed for the following reasons:

  • The processor must check the basic syntax of the document for well-formedness.
  • The processor must replace every reference to an entity with the entity's definition.
  • The processor must supply the default values that a DTD or XML Schema declares for attributes omitted from the XML document.
  • The processor must check the validity of the XML document if either a DTD or an XML Schema is included.
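These duties can be observed with the parser that ships with the JDK. The following is a minimal sketch, not a definitive implementation: the class name ProcessorDuties, the sample document, its internal DTD, the org entity and the status attribute are all hypothetical and chosen only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ProcessorDuties {

    // Hypothetical document: the internal DTD declares an entity (org)
    // and a default value for the status attribute.
    static final String XML =
          "<!DOCTYPE staff [\n"
        + "  <!ELEMENT staff (#PCDATA)>\n"
        + "  <!ATTLIST staff status CDATA \"active\">\n"
        + "  <!ENTITY org \"Example Corp\">\n"
        + "]>\n"
        + "<staff>Works at &org;</staff>";

    // Same DTD, but the content breaks the (#PCDATA) content model.
    static final String INVALID_XML = XML.replace("Works at &org;", "<extra/>");

    // Parses the text and returns the root element, or null if the parser
    // reports a well-formedness error (or, when validating, a validity error).
    public static Element parse(String xml, boolean validate) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validate); // switch DTD validation on or off
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat validity errors as failures
                }
            });
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (SAXException e) {
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = parse(XML, false);
        System.out.println(root.getTextContent());               // entity reference replaced
        System.out.println(root.getAttribute("status"));         // default value supplied
        System.out.println(parse("<a><b></a>", false) != null);  // well-formedness check
        System.out.println(parse(INVALID_XML, true) != null);    // validity check
    }
}
```

The element text comes back with the entity already expanded, and the status attribute is present even though the document never wrote it. Validation is off unless the application asks for it, which is why the sketch passes a flag to setValidating.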

Although an XML document exhibits a regular and elegant structure, that structure does not give applications convenient access to the document's data. This need led to the development of two standard APIs for XML processors: SAX (Simple API for XML) and DOM (Document Object Model).

SAX Approach

The SAX standard, released in May 1998, was developed by XML-DEV, an XML user group. It has been accepted as a de facto standard and is widely supported by XML processors.

The SAX approach to processing is known as event processing. The processor scans the document sequentially from beginning to end. Every time it recognizes a syntactic structure such as an opening tag, an attribute, text or a closing tag, the processor signals an event to the application by calling the event handler for that kind of structure. The interfaces that describe the event handlers form the SAX API.

Below is an example Java program that reads an XML document using the SAX API:
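A minimal sketch of such a program follows. It is written under stated assumptions rather than being a definitive listing: the class name SaxReadExample is hypothetical, and the input document is an inline string reconstructed from the element names and values in the output listed after it, where a real program would more likely read a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxReadExample {

    // Assumed input document (inline instead of a file).
    static final String XML =
          "<company>"
        + "<staff id=\"1001\">"
        + "<firstname>yong</firstname><lastname>mook kim</lastname>"
        + "<nickname>mkyong</nickname><salary>100000</salary>"
        + "</staff>"
        + "<staff id=\"2001\">"
        + "<firstname>low</firstname><lastname>yin fong</lastname>"
        + "<nickname>fong fong</nickname><salary>200000</salary>"
        + "</staff>"
        + "</company>";

    // Maps an element name to the label used in the report, or null.
    private static String labelFor(String qName) {
        switch (qName) {
            case "firstname": return "First Name";
            case "lastname":  return "Last Name";
            case "nickname":  return "Nick Name";
            case "salary":    return "Salary";
            default:          return null;
        }
    }

    // Parses the document and returns one line per event of interest.
    public static List<String> parse(String xml) {
        final List<String> lines = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                lines.add("Start Element :" + qName);
                text.setLength(0); // start collecting this element's text
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one run of text.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String label = labelFor(qName);
                if (label != null) {
                    lines.add(label + " : " + text.toString().trim());
                }
                lines.add("End Element :" + qName);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        parse(XML).forEach(System.out::println);
    }
}
```

The handler buffers text in characters() and reports it in endElement() because a SAX parser is free to deliver one piece of text in several characters() calls. Nothing is kept once an element has been reported, which is what lets SAX process documents of any size.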

Output:

Start Element :company
Start Element :staff
Start Element :firstname
First Name : yong
End Element :firstname
Start Element :lastname
Last Name : mook kim
End Element :lastname
Start Element :nickname
Nick Name : mkyong
End Element :nickname
Start Element :salary
Salary : 100000
End Element :salary
End Element :staff
Start Element :staff
Start Element :firstname
First Name : low
End Element :firstname
Start Element :lastname
Last Name : yin fong
End Element :lastname
Start Element :nickname
Nick Name : fong fong
End Element :nickname
Start Element :salary
Salary : 200000
End Element :salary
End Element :staff
End Element :company

DOM Approach

DOM is an alternative to the SAX approach. In an XML processor that supports DOM, the parser builds a DOM tree for the XML document. The nodes of the tree are represented as objects that the application can access and manipulate.

The advantages of DOM over SAX are:

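  • DOM is better when a part of an XML document must be accessed more than once.
  • DOM is better for rearranging (for example, sorting) the elements of an XML document.
  • DOM allows random access to any part of the document, whereas SAX allows only sequential access.
  • Because the parser reads the whole document before the application processes any of it, DOM detects an invalid node late in the document before any application processing takes place, whereas a SAX application may already have processed the earlier part.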
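Rearranging and repeated access can be sketched with the JDK's DOM implementation. This is an illustrative sketch only: the class name DomSortExample, the helper methods and the inline company/staff document are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomSortExample {

    // Assumed input document (inline instead of a file).
    static final String XML =
          "<company>"
        + "<staff id=\"1001\"><nickname>mkyong</nickname><salary>100000</salary></staff>"
        + "<staff id=\"2001\"><nickname>fong fong</nickname><salary>200000</salary></staff>"
        + "</company>";

    // Text of the first descendant of parent with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Sorts the staff elements by salary, highest first, inside the tree
    // and returns the nicknames in their new document order.
    public static List<String> nicknamesBySalaryDescending(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.getDocumentElement();

        // Copy the live NodeList into a plain list before changing the tree.
        NodeList nodes = root.getElementsByTagName("staff");
        List<Element> staff = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            staff.add((Element) nodes.item(i));
        }
        staff.sort(Comparator.comparingInt(
                (Element e) -> Integer.parseInt(text(e, "salary"))).reversed());

        // appendChild moves a node that is already in the tree, so
        // re-appending in sorted order rearranges the document.
        for (Element e : staff) {
            root.appendChild(e);
        }

        // Second pass over the same tree: no re-parsing is needed.
        List<String> nicknames = new ArrayList<>();
        NodeList reordered = root.getElementsByTagName("staff");
        for (int i = 0; i < reordered.getLength(); i++) {
            nicknames.add(text((Element) reordered.item(i), "nickname"));
        }
        return nicknames;
    }

    public static void main(String[] args) {
        System.out.println(nicknamesBySalaryDescending(XML));
    }
}
```

With SAX the same task would require the application to build its own data structure from the events, because the parser keeps nothing once an event has been delivered.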

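The disadvantages of DOM compared with SAX are:

  • The entire DOM tree is held in memory, so large XML documents require a large amount of memory.
  • Documents too large to fit in the available memory cannot be parsed using DOM.
  • DOM is generally slower than SAX, because the whole tree must be built before the application can use any part of it.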

Below is an example Java program that reads an XML document using the DOM API:
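A minimal sketch of such a program follows, under stated assumptions: the class name DomReadExample is hypothetical, and the input document is an inline string reconstructed from the values in the output listed after it, where a real program would more likely read a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomReadExample {

    // Assumed input document (inline instead of a file).
    static final String XML =
          "<company>"
        + "<staff id=\"1001\">"
        + "<firstname>yong</firstname><lastname>mook kim</lastname>"
        + "<nickname>mkyong</nickname><salary>100000</salary>"
        + "</staff>"
        + "<staff id=\"2001\">"
        + "<firstname>low</firstname><lastname>yin fong</lastname>"
        + "<nickname>fong fong</nickname><salary>200000</salary>"
        + "</staff>"
        + "</company>";

    // Text of the first descendant of parent with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Builds the DOM tree, then walks it and returns the report lines.
    public static List<String> describe(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }

        List<String> lines = new ArrayList<>();
        lines.add("Root element :" + doc.getDocumentElement().getNodeName());
        lines.add("----------------------------");

        NodeList staffList = doc.getElementsByTagName("staff");
        for (int i = 0; i < staffList.getLength(); i++) {
            Element staff = (Element) staffList.item(i);
            lines.add(""); // blank line between records
            lines.add("Current Element :" + staff.getNodeName());
            lines.add("Staff id : " + staff.getAttribute("id"));
            lines.add("First Name : " + text(staff, "firstname"));
            lines.add("Last Name : " + text(staff, "lastname"));
            lines.add("Nick Name : " + text(staff, "nickname"));
            lines.add("Salary : " + text(staff, "salary"));
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(XML).forEach(System.out::println);
    }
}
```

Unlike the SAX version, the application code runs only after parsing has finished and can read the attribute and the child elements of each staff node in any order.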

Output:

Root element :company
----------------------------

Current Element :staff
Staff id : 1001
First Name : yong
Last Name : mook kim
Nick Name : mkyong
Salary : 100000

Current Element :staff
Staff id : 2001
First Name : low
Last Name : yin fong
Nick Name : fong fong
Salary : 200000

Suryateja Pericherla

Hello, I am Suryateja Pericherla working as an Asst. Professor in CSE department at Vishnu Institute of Technology. I write articles to share my knowledge and make people knowledgeable regarding certain topics.