
Parsing XML In Java Part 1: SAX

SAX, short for Simple API for XML, is one of two common ways to retrieve information from an XML-formatted document.

Stemming from this thread, I decided to throw together a little application that retrieves the profile picture, handle (nickname), join date, and total posts from a user here at Dream.In.Code.

SAX, unlike DOM (the other popular method for XML parsing), does not store the entire document in memory. It moves through the document linearly; the only way to "go back" to previously read information (assuming you did not store it somewhere when it was read) is to begin the parsing operation again. By the same token, SAX typically takes less memory than a comparable DOM operation.
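
To make the contrast concrete, here is a minimal sketch that reads the same tiny document both ways. The XML string and the class and method names are made up for illustration; they are not real Dream.In.Code output or part of the application below.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxVersusDom {
    // Hypothetical sample document
    static final String XML = "<profile><name>KYA</name><posts>42</posts></profile>";

    // DOM: the whole tree sits in memory, so elements can be looked up in any order
    static String domLookup(String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName(tag).item(0).getTextContent();
    }

    // SAX: elements are reported once, in document order; keep whatever you need
    static List<String> saxElementOrder() throws Exception {
        final List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attrib) {
                seen.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)), handler);
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // DOM lets us ask for "posts" before "name"
        System.out.println(domLookup("posts"));
        System.out.println(domLookup("name"));
        // SAX only ever moves forward
        System.out.println(saxElementOrder());
    }
}
```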

Fun fact: SAX started out as a Java-only API, and the Java version is now the de facto standard.

The App:

There are four "pieces" to this application: the entry point that creates the GUI, the handler for the SAX parser (more on this later), the class blueprint for a "DICHead", and the actual GUI class itself. Rather than throw a huge wall of text at you, here is a simple UML diagram:

Attached Image

How SAX works:

Grabbing information from Dream.In.Code requires a URL (however, XML parsing can be done from a regular File as well). The user provides the member ID (which can be found by visiting anyone's profile). The path is the same for every member except for that specifier: http://www.dreaminco...l.php?showuser=.

Setting up a SAXParser in Java is relatively simple:
  • Create a new SAXParserFactory
  • Obtain a parser instance from the factory and assign it to your local parser variable
  • Create an instance of your Handler extension
  • Call parse(path, yourHandler)
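
The four steps above can be sketched as follows. This is only an illustration: CountingHandler and countElements are hypothetical names, and the sketch parses an in-memory string through the InputStream overload of parse(), whereas the application passes a URL string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SetupSketch {
    // Hypothetical handler: counts elements so the parse has a visible effect
    static class CountingHandler extends DefaultHandler {
        int elements = 0;

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attrib) {
            elements++;
        }
    }

    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance(); // step 1: the factory
        SAXParser parser = factory.newSAXParser();                 // step 2: the parser
        CountingHandler handler = new CountingHandler();           // step 3: your handler
        // step 4: parse, handing every event to the handler
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.elements;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><c>x</c></a>"));
    }
}
```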

In Java, the SAXParser requires a "Handler". The one provided by default is DefaultHandler, whose callbacks do nothing. Thus, if you want any meaningful action to be taken at each "token", you'll need to create your own class and have it extend DefaultHandler.

The sequence for each token/element is as follows:

Attached Image
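
The callback order for a single element can also be observed by logging each event. The sketch below does that for a made-up one-element document; EventOrderSketch and events are hypothetical names used only here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class EventOrderSketch {
    // Returns the handler callbacks in the order the parser fired them
    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attrib) {
                log.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                log.add("chars:" + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        // Expect a start event, one or more chars events, then an end event
        for (String event : events("<name>KYA</name>")) {
            System.out.println(event);
        }
    }
}
```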

Capturing the actual data occurs in the characters() method, whereas assignment typically occurs in the endElement() method. There is no "clean" way to "quit" a parse early. You'll notice that a counter is in place in my PrintHandler; this is because there are multiple "name" tags in the document (the first is the member's own, followed by those of friends and profile comments) and the application only wants the main person's info.
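
One caveat with capturing text in characters(): the SAX API allows a parser to deliver the text of a single element across several calls, so keeping only the latest chunk can lose part of a value. A common alternative, sketched here under that assumption, is to append to a StringBuilder and read it in endElement(). This is not what PrintHandler does; NameHandler and readName are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class TextBufferSketch {
    // Hypothetical handler that collects the full text of a "name" element
    static class NameHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        String name = "";

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attrib) {
            text.setLength(0); // start a fresh buffer for each element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may run more than once per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("name")) {
                name = text.toString().trim();
            }
        }
    }

    static String readName(String xml) throws Exception {
        NameHandler handler = new NameHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.name;
    }

    public static void main(String[] args) throws Exception {
        // Entity references are a typical place where parsers split text into chunks
        System.out.println(readName("<profile><name>KYA &amp; friends</name></profile>"));
    }
}
```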

If the parser has encountered a "name" tag only once, then at the end of each element that we want, the data written to our temporary string is added to the handler's instance of DICHead. Once we are done parsing (the whole document, since we can't stop it), the handler returns its object instance to the MemberPanel, which keeps its own reference to it. PrintHandler maintains one instance of DICHead at any given time, updating its information as dictated by the user.
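
The counter idea can be tried in isolation on a small in-memory document. The sketch below mirrors the approach described above, but the XML layout, the "friend" element, and the class names are invented for illustration and are not the real profile format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class FirstProfileSketch {
    // Hypothetical handler: stores values only while exactly one "name" tag has been seen
    static class FirstProfileHandler extends DefaultHandler {
        final Map<String, String> captured = new LinkedHashMap<>();
        private String tempVal = "";
        private int nameCount = 0;

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attrib) {
            if (qName.equals("name")) {
                nameCount++;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            tempVal = new String(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Once a second "name" has started, everything else is ignored
            if (nameCount == 1 && (qName.equals("name") || qName.equals("posts"))) {
                captured.put(qName, tempVal);
            }
        }
    }

    static Map<String, String> parseFirstProfile(String xml) throws Exception {
        FirstProfileHandler handler = new FirstProfileHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.captured;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<profile><name>KYA</name><posts>42</posts>"
                + "<friend><name>Someone</name><posts>7</posts></friend></profile>";
        // Only the first profile's values should survive
        System.out.println(parseFirstProfile(xml));
    }
}
```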

Some screen shots of the program in action:

Attached Image


/**
 * @author Knowles
 * Sample SAX Parser entry point
 */
public class Reader {
    public static void main(String[] args){
        MemberPanel test = new MemberPanel();
    }
}

import java.net.URL;
import javax.swing.ImageIcon;

/**
 * @author Knowles
 * Encapsulates information retrieved from Dream.In.Code profiles
 */
public class DICHead{
    private String name;
    private String joinDate;
    private String group;
    private String numPosts;
    private ImageIcon pic;

    public DICHead(){
        name = joinDate = group = numPosts = "";
    }

    //mutators (for parser)
    public void setName(String name)            { this.name = name;}
    public void setGroup(String group)          { this.group = group;}
    public void setJoinDate(String joinDate)    { this.joinDate = joinDate;}
    public void setNumPosts(String numPosts)    { this.numPosts = numPosts;}

    //accessors (for GUI)
    public String getName()                     {return name;}
    public String getJoinDate()                 {return joinDate;}
    public String getGroup()                    {return group;}
    public String getTotalPosts()               {return numPosts;}
    public ImageIcon getPicture()               {return pic;}

    //builds the profile picture from its URL
    public void setImage(String img){
        try{
            URL url = new URL(img);
            pic = new ImageIcon(url);
        }
        catch(Exception e){
            //malformed URL: leave the picture as it was
        }
    }

    //for debug purposes
    public void display(){
        System.out.println("Name: " + name);
        System.out.println("Group: " + group);
        System.out.println("Join Date: " + joinDate);
        System.out.println("Total Posts: " + numPosts);
    }
}

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/**
 * @author Knowles
 * Modification to print to the console [debugging purposes]
 * and to create and fill an object with data for the GUI
 */
public class PrintHandler extends DefaultHandler{
    DICHead inputBuffer;
    private String tempVal;
    private int counter;

    public PrintHandler(){
       inputBuffer = new DICHead();
       counter = 0;
    }

    public void startElement(String uri, String localName, String qName,
            Attributes attrib){
        if (qName.equals("name")){
            counter++; //used to only get the "main" profile, see below
        }
    }

    public void characters(char[] ch, int start, int length){
        //keeps the most recent chunk of text; a parser may split long text across calls
        tempVal = new String(ch,start,length);
    }

    public void endElement(String uri, String localName, String qName){
        //Since there's no "good" way to stop a SAX parse
        //use the counter to ensure only the main profile tidbits
        if(counter  == 1){
            if (qName.equals("name")){
                inputBuffer.setName(tempVal);
            }
            else if (qName.equals("group")){
                inputBuffer.setGroup(tempVal);
            }
            else if (qName.equals("posts")){
                inputBuffer.setNumPosts(tempVal);
            }
            else if (qName.equals("joined")){
                inputBuffer.setJoinDate(tempVal);
            }
            else if (qName.equals("photo")){
                inputBuffer.setImage(tempVal);
            }
        }
    }

    DICHead getPerson() {
        //reset for additional parsing
        counter = 0;
        return inputBuffer;
    }
}

import java.awt.Dimension;
import java.awt.GridLayout;
import java.awt.Toolkit;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.IOException;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JOptionPane;
import javax.swing.JPanel;
import javax.swing.JTextField;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

/**
 * @author Knowles
 * GUI for D.I.C. XML parsing
 */
public class MemberPanel extends JFrame{

    //Our SAX goodies
    private final String path = "";
    private SAXParserFactory factory;
    private SAXParser  parser;
    private PrintHandler handler;

    //GUI components
    private JPanel dataHolder;
    private JLabel name, joinDate, group, numPosts, pic;
    private JTextField memberInput;
    private JButton parseMember;
    private Dimension screenCoords;
    private final int APP_WIDTH = 200, APP_HEIGHT = 400;
    private DICHead thePerson;

    public MemberPanel(){
        //Parser setup
        try{
            handler = new PrintHandler();
            factory = SAXParserFactory.newInstance();
            parser = factory.newSAXParser();
        }
        catch (Exception e){
            e.printStackTrace();
        }

        //Window setup: fixed size, centered on the screen
        screenCoords = Toolkit.getDefaultToolkit().getScreenSize();
        setSize(APP_WIDTH, APP_HEIGHT);
        setTitle("DIC XML");
        setLocation(screenCoords.width/2 - APP_WIDTH/2, screenCoords.height/2 - APP_HEIGHT/2);
        getContentPane().setLayout(new GridLayout(2,1));
        dataHolder = new JPanel();
        pic = new JLabel();
        name = new JLabel("Name: ");
        joinDate = new JLabel("Join Date: ");
        group  = new JLabel("Group: ");
        numPosts = new JLabel("Total Posts: ");
        memberInput = new JTextField("Enter user number...");
        parseMember = new JButton("Parse Details");
        parseMember.addActionListener(new ActionListener(){
            public void actionPerformed(ActionEvent e){
                //specific error catching--user information
                try{
                    int userID = Integer.parseInt(memberInput.getText());
                    if (userID <= 0) throw new NumberFormatException();
                    parser.parse(path+userID, handler);
                    updatePerson(handler.getPerson());
                    //for console debugging purposes
                    thePerson.display();
                    fillOutDetails();
                }
                catch(NumberFormatException ex){
                    JOptionPane.showMessageDialog(null, "Please enter a valid number");
                }
                catch (SAXException er){
                    JOptionPane.showMessageDialog(null, "Error Parsing. Please try again.");
                }
                catch (IOException err){
                    JOptionPane.showMessageDialog(null, "IO Issue. Please try again");
                }
            }
        });

        dataHolder.setLayout(new GridLayout(6,1));
        //assumed layout: picture on top, the six remaining components below
        dataHolder.add(name);
        dataHolder.add(joinDate);
        dataHolder.add(group);
        dataHolder.add(numPosts);
        dataHolder.add(memberInput);
        dataHolder.add(parseMember);
        getContentPane().add(pic);
        getContentPane().add(dataHolder);
        setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        setVisible(true);
    }

    public void updatePerson(DICHead person){
        thePerson = person;
    }

    public void fillOutDetails(){
        pic.setIcon(thePerson.getPicture());
        name.setText("Name: " + thePerson.getName());
        joinDate.setText("Join Date: " + thePerson.getJoinDate());
        group.setText("Group: " + thePerson.getGroup());
        numPosts.setText("Total Posts: " + thePerson.getTotalPosts());
    }
}

Adding color and some pizazz is on the to-do list. Design-wise, I could get rid of the DICHead instance in the MemberPanel and access the handler's instance directly; however, in case someone wants to extend this program to hold multiple DICHeads, do something else with the data, etc., keeping the blueprint one in the handler makes the most sense.

Let me know what you think! Happy coding!

5 Comments On This Entry



21 June 2010 - 03:08 PM
Sweet!!! Nice work! That XML API is kind of fun huh?


21 June 2010 - 03:11 PM
Certainly is! I hadn't done any XML parsing before; killed two metaphorical birds with a stone today.


21 June 2010 - 03:29 PM
Nice!! Could add a bit of colour, or should I say 'color' for you, on the GUI :) Good work.


21 June 2010 - 04:20 PM
Nice work! Looks good :)



24 June 2010 - 03:35 AM
Good Show KYA :)

xml can be quite handy :)