
Parsing XML in Java Part 3: StAX

If you have not read parts one and two (covering SAX and DOM, respectively), I encourage you to do so. They will help illustrate why StAX was created in the first place.

StAX, short for Streaming API for XML, is the "middle ground" between the polar opposites of the SAX and DOM paradigms. For a quick recap, SAX is a sequential "push" parser that has low overhead but makes only a single pass over an XML source. DOM, at the other end of the spectrum, builds an in-memory document object from the source that can be accessed at any time, anywhere in the document, at the cost of additional overhead.

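StAX is a "pull" parser. Much like the Java Scanner, you iterate forward through the source, asking for the next piece whenever you want it and stopping as soon as you have what you need. The cursor only moves forward, though; there is no going back to an earlier part of the document. This is especially handy when you know the structure of the XML file ahead of time and only want a "piece": SAX keeps pushing events at your handler for the whole file unless you abort the parse (typically by throwing an exception), and DOM creates the entire tree regardless of your intentions.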
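To make the "only want a piece" idea concrete, here is a minimal sketch that pulls events until it reaches one element and then simply stops asking for more. The class name, the method name, and the sample XML are made up for illustration; they are not part of the application below. It also assumes the wanted element contains only text, since getElementText() throws if it runs into a child element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class FirstElementFinder {

    /** Returns the text of the first element named wanted, or null if there is none. */
    static String findFirst(String xml, String wanted) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(wanted)) {
                    // We have what we came for, so we stop pulling here;
                    // our loop never visits the rest of the document.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical profile XML, invented for this example.
        String xml = "<profile><name>KYA</name><posts>42</posts><joined>2008</joined></profile>";
        System.out.println(findFirst(xml, "posts"));
    }
}
```

The early return is the whole point: nobody pushes the remaining events at us, so leaving the loop is all it takes to be done.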

As of Java 1.6, StAX is part of the standard API; however, if for whatever reason you aren't with the times, you can download it as a separate jar here.

The specific JavaDoc can be found here (or through the "standard API" page).


The App:

Same deal as the previous two entries only with StAX as our method of parsing. A UML diagram:

Attached Image

StAX, unlike its close cousin SAX, does not require a Handler. We simply hold a reader over a stream of the XML file in question.

How StAX Works:

Grabbing information from Dream.In.Code requires a URL address (however, XML parsing can be done from a regular File as well). The user provides the member ID (which can be found by visiting anyone's profile). The path is the same except for that specifier: http://www.dreaminco...l.php?showuser=.

Setting up a StAX Parser in Java:
  • Get an instance of XMLInputFactory
  • Create an XMLStreamReader from source URL/File
  • Read stream


In code this looks like:

factory = XMLInputFactory.newInstance();
//...
url = new URL(path+userID);
reader = factory.createXMLStreamReader(url.openStream());
//read the stream 
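Since parsing from a regular File works too, here is a rough sketch of the same factory reading from a file on disk and from an in-memory String. The class and method names are hypothetical and only there for illustration. One detail worth knowing from the JavaDoc: XMLStreamReader.close() frees the reader's resources but does not close the underlying input, so the sketch closes the file stream itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ReaderSources {
    private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();

    /** Returns the name of the root element of an XML file on disk. */
    static String rootOfFile(Path file) throws IOException, XMLStreamException {
        // try-with-resources closes the stream; reader.close() would not.
        try (InputStream in = Files.newInputStream(file)) {
            return firstElementName(FACTORY.createXMLStreamReader(in));
        }
    }

    /** Returns the name of the root element of XML held in a String. */
    static String rootOfString(String xml) throws XMLStreamException {
        return firstElementName(FACTORY.createXMLStreamReader(new StringReader(xml)));
    }

    // Pulls events until the first start tag, then stops.
    private static String firstElementName(XMLStreamReader reader) throws XMLStreamException {
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

The String overload is also handy for experimenting with StAX without a network connection.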



What's great about StAX is that it is completely up to you how to stream/handle the data, as opposed to writing a Handler or building an entire Document in memory.

Specifically:

"StAX was designed as a median between [SAX and DOM]. [In StAX], the programmatic entry point is a cursor that represents a point within the document. The application moves the cursor forward - 'pulling' the information from the parser as it needs."


If you wanted to read through the whole document, all you would need to do is this:

while(reader.hasNext()){
    //specifics here
}



At the most basic level we have the start of an element, the contents of the element, and the end of the element. StAX has a constant for each "tag type" you may run across. A list of the ones we are going to use below:

XMLStreamConstants.START_ELEMENT
XMLStreamConstants.END_ELEMENT
XMLStreamConstants.CHARACTERS
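These constants are plain ints, which makes debug output hard to read. As a small aid, the sketch below turns the value returned by next() into a name and lists the events a document produces. The class and its methods are hypothetical helpers, not part of StAX or of the application in this post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class EventNames {

    /** Maps the int returned by XMLStreamReader.next() to a readable name. */
    static String eventName(int eventType) {
        switch (eventType) {
            case XMLStreamConstants.START_DOCUMENT: return "START_DOCUMENT";
            case XMLStreamConstants.START_ELEMENT:  return "START_ELEMENT";
            case XMLStreamConstants.CHARACTERS:     return "CHARACTERS";
            case XMLStreamConstants.END_ELEMENT:    return "END_ELEMENT";
            case XMLStreamConstants.END_DOCUMENT:   return "END_DOCUMENT";
            // Comments, processing instructions, and so on.
            default:                                return "OTHER(" + eventType + ")";
        }
    }

    /** Collects the names of all events next() reports for the given XML. */
    static List<String> events(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                names.add(eventName(reader.next()));
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(events("<name>KYA</name>"));
    }
}
```

Running it against a small document is a quick way to see which events you actually need to handle. Note that a parser is allowed to report one run of text as several CHARACTERS events.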



The observant among you will notice that these line up with the startElement, characters, and endElement callbacks of SAX's Handler.

The steps for each element are as follows:
  • Check to see if there are elements remaining
  • Obtain the tag constant
  • Take an action depending on the tag type
  • Continue until end of document or whenever you are done gathering data


Which looks like this (specifics omitted for simplicity):

private void readStream(){
        try{
            int tagType;
            String temp = "", tagName = "";
            //read through it all, we'll break prematurely
            //right after "join date" close tag
            while(reader.hasNext()){
                //next() returns the "type" of constant
                tagType = reader.next();
                //start, end, etc...
                switch(tagType){
                    case XMLStreamConstants.START_ELEMENT:
                        tagName = reader.getLocalName();
                        //do anything with attributes, etc...
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        temp = reader.getText();
                        //do something with element data
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        //decide if we should exit/clean up, etc...
                        break;
                }
            }
            //clean up after ourselves
            reader.close();
        }
        catch(Exception e){
            System.out.println("Failure in the stream reading!");
            e.printStackTrace();
        }
    }
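If you want to try the loop above without the GUI or a network connection, here is a self-contained sketch of the same pattern. It reads from an in-memory String, collects the text of each element into a map, and quits right after a chosen closing tag. The class name, method name, and sample XML are assumptions made up for this example. Unlike the method above, it accumulates text in a StringBuilder, because a parser may deliver one element's text in more than one CHARACTERS event.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ProfileFields {

    /** Collects element text by element name, stopping after lastWanted closes. */
    static Map<String, String> readUntil(String xml, String lastWanted)
            throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        StringBuilder text = new StringBuilder();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            boolean done = false;
            while (!done && reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        text.setLength(0);               // new element, fresh buffer
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        text.append(reader.getText());   // text may arrive in chunks
                        break;
                    case XMLStreamConstants.END_ELEMENT: {
                        String name = reader.getLocalName();
                        String value = text.toString().trim();
                        if (!value.isEmpty()) {
                            fields.put(name, value);     // skip pure containers
                        }
                        text.setLength(0);
                        done = name.equals(lastWanted);  // quit early once we have it
                        break;
                    }
                    default:
                        break;                           // ignore comments, etc.
                }
            }
        } finally {
            reader.close();
        }
        return fields;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical profile XML, invented for this example.
        String xml = "<profile><name>KYA</name><posts>42</posts>"
                + "<joined>2008</joined><extra>ignored</extra></profile>";
        System.out.println(readUntil(xml, "joined"));
    }
}
```

Asking the reader for the element name on END_ELEMENT, rather than remembering the last start tag, keeps the early-exit check correct even when elements are nested.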



Some screenshots of the program in action (visually equivalent to the DOM pictures from Part 2):

Attached Image

Source:

/**
 * @author Knowles
 * Sample StAX Parser entry point
 */

public class Reader {
    public static void main(String[] args){
        MemberPanelSTAX test = new MemberPanelSTAX();
    }
}




import java.awt.Color;
import java.awt.Dimension;
import java.awt.GridLayout;
import java.awt.Toolkit;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.IOException;
import java.net.URL;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JOptionPane;
import javax.swing.JPanel;
import javax.swing.JTextField;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/**
 *
 * @author Knowles
 * StAX (Streaming API for XML) Parser
 */
public class MemberPanelSTAX extends JFrame {

    //StAX
    private final String path = "http://www.dreamincode.net/forums/xml.php?showuser=";
    URL url;
    XMLInputFactory factory;
    XMLStreamReader reader;
    //GUI
    private JPanel dataHolder;
    private JLabel name, joinDate, group, numPosts, pic;
    private JTextField memberInput;
    private JButton parseMember;
    //location
    private Dimension screenCoords;
    private final int APP_WIDTH = 200, APP_HEIGHT = 400;
    //Object
    private DICHead thePerson;

    public MemberPanelSTAX(){
        //Parser setup
        try{
            //save this for later
            factory = XMLInputFactory.newInstance();
            //single instance, no handler
            thePerson = new DICHead();
        }
        catch (Exception e){
            e.printStackTrace();
        }
        //GUI
        screenCoords = Toolkit.getDefaultToolkit().getScreenSize();
        setSize(APP_WIDTH, APP_HEIGHT);
        setTitle("DIC XML");
        setLocation(screenCoords.width/2 - APP_WIDTH/2, screenCoords.height/2 - APP_HEIGHT/2);
        setDefaultCloseOperation(EXIT_ON_CLOSE);
        getContentPane().setLayout(new GridLayout(2,1));
        dataHolder = new JPanel();
        pic = new JLabel();
        name = new JLabel("Name: ");
        joinDate = new JLabel("Join Date: ");
        group  = new JLabel("Group: ");
        numPosts = new JLabel("Total Posts: ");
        memberInput = new JTextField("Enter user number...");
        parseMember = new JButton("Parse Details");
        parseMember.addActionListener(new ActionListener(){
            public void actionPerformed(ActionEvent e){
                //specific error catching--user information
                try{
                    int userID = Integer.parseInt(memberInput.getText());
                    if (userID <= 0) throw new NumberFormatException();
                    url = new URL(path+userID);
                    reader = factory.createXMLStreamReader(url.openStream());
                    readStream();
                    fillOutDetails();
                }
                catch(NumberFormatException ex){
                    JOptionPane.showMessageDialog(null, "Please enter a valid number");
                }
                catch (XMLStreamException er){
                    JOptionPane.showMessageDialog(null, "Error Parsing. Please try again.");
                }
                catch (IOException err){
                    JOptionPane.showMessageDialog(null, "IO Issue. Please try again");
                }
            }
        }
        );

        dataHolder.setLayout(new GridLayout(6,1));
        dataHolder.add(name);
        dataHolder.add(joinDate);
        dataHolder.add(group);
        dataHolder.add(numPosts);
        dataHolder.add(memberInput);
        dataHolder.add(parseMember);

        add(pic);
        add(dataHolder);
        validate();
        setVisible(true);
    }

    public void fillOutDetails(){
        pic.setIcon(thePerson.getPicture());
        name.setText("Name: " + thePerson.getName());
        joinDate.setText("Join Date: " + thePerson.getJoinDate());
        group.setForeground(thePerson.getGroupColor());
        group.setText("Group: " + thePerson.getGroup());
        numPosts.setText("Total Posts: " + thePerson.getTotalPosts());
    }
    
    private void readStream(){
        try{
            int tagType;
            boolean notDone = true;
            String temp = "", tagName = "", color = "";
            //read through it all, we'll break prematurely
            //right after "join date" close tag
            while(reader.hasNext() && notDone){
                //next() returns the "type" of constant
                tagType = reader.next();
                //start, end, etc...
                switch(tagType){
                    case XMLStreamConstants.START_ELEMENT:
                        tagName = reader.getLocalName();
                        //the presence of a span tag is
                        //indicative of a color i.e. anything but "Members"
                        if(tagName.equals("span")){
                            //only one attribute "style"
                            color = reader.getAttributeValue(0);
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        temp = reader.getText();
                        //same deal with SAX, except no Handler required
                        if(tagName.equals("name")){
                            thePerson.setName(temp);
                        }
                        else if(tagName.equals("photo")){
                            thePerson.setImage(temp);
                        }
                        //some members don't have a "color"
                        //thus no "span" tag
                        else if(tagName.equals("group")){
                            thePerson.setGroup(temp);
                            thePerson.setColor(Color.BLACK);
                        }
                        else if(tagName.equals("span")){
                            thePerson.setGroup(temp);
                            //color setting
                            //still a consistency issue
                            //see DOM blog post [part 2]
                            if(temp.equals("Moderators")){
                                thePerson.setColor(Color.BLUE);
                            }
                            else if (temp.equals("Admins")){
                                thePerson.setColor(Color.GREEN.darker());
                            }
                            else{
                                color = color.substring(6, 13); //grab the HTML color code
                                color = color.substring(1); //get rid of the '#'
                                thePerson.setColor(new Color(Integer.parseInt(color, 16)));
                            }
                        }
                        else if(tagName.equals("posts")){
                            thePerson.setNumPosts(temp);
                        }
                        else if(tagName.equals("joined")){
                            thePerson.setJoinDate(temp);
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if(tagName.equals("joined")){
                            //debug
                            //System.out.println("end joined tag hit, quitting early!");
                            notDone = false;
                        }
                        //avoid some URL issues
                        temp = "";
                        tagName = "";
                        break;
                }
            }
            //clean up after ourselves
            reader.close();
        }
        catch(Exception e){
            System.out.println("Failure in the stream reading!");
            e.printStackTrace();
        }
    }
}




import java.awt.Color;
import java.net.URL;
import javax.swing.ImageIcon;

/**
 *
 * @author Knowles
 * Encapsulates information retrieved from Dream.In.Code profiles
 */
public class DICHead{
    private String name;
    private String joinDate;
    private String group;
    private String numPosts;
    private ImageIcon pic;
    private Color groupColor;

    public DICHead(){
        name = joinDate = group = numPosts = "";
        groupColor = Color.BLACK;
    }

    //mutators (for parser)
    public void setName(String name)            { this.name = name;}
    public void setGroup(String group)          { this.group = group;}
    public void setJoinDate(String joinDate)    { this.joinDate = joinDate;}
    public void setNumPosts(String numPosts)    { this.numPosts = numPosts;}
    public void setColor(Color groupColor)      {this.groupColor = groupColor;}

    //accessors
    public String getName()                     {return name;}
    public String getJoinDate()                 {return joinDate;}
    public String getGroup()                    {return group;}
    public String getTotalPosts()               {return numPosts;}
    public ImageIcon getPicture()               {return pic;}
    public Color getGroupColor()                {return groupColor;}

    public void setImage(String img){
        try{
            URL url = new URL(img);
            pic = new ImageIcon(url);
        }
        catch(Exception e){
            e.printStackTrace();
        }
    }

    //for debug purposes
    public void display(){
        System.out.println("Name: " + name);
        System.out.println("Group: " + group);
        System.out.println("Join Date: " + joinDate);
        System.out.println("Total Posts: " + numPosts);
    }
}
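A side note on the color handling in readStream(): the fixed substring offsets only work when the style attribute has exactly the expected layout. Below is a slightly more forgiving sketch that searches for a six-digit hex code anywhere in the string. It assumes the attribute looks something like "color:#RRGGBB", which is my guess based on the offsets used above and is not confirmed by the feed itself. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StyleColor {
    // A '#' followed by exactly six hex digits, e.g. #FF0000.
    private static final Pattern HEX = Pattern.compile("#([0-9a-fA-F]{6})(?![0-9a-fA-F])");

    /** Returns the RGB value found in style, or fallback if there is none. */
    static int parseRgb(String style, int fallback) {
        if (style == null) {
            return fallback;
        }
        Matcher m = HEX.matcher(style);
        return m.find() ? Integer.parseInt(m.group(1), 16) : fallback;
    }

    public static void main(String[] args) {
        // Prints the value as hex so it is easy to compare with the input.
        System.out.println(Integer.toHexString(parseRgb("color:#FF0000", 0)));
    }
}
```

The result can be handed straight to new Color(int), just as the original code does with its parsed value. Looking the attribute up by name with reader.getAttributeValue(null, "style") instead of by index 0 would be a similar small hardening step.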



Happy Coding!

3 Comments On This Entry


dorknexus Icon

10 July 2010 - 03:01 PM
let's make beautiful children KYA.
0

KYA Icon

10 July 2010 - 08:57 PM
I'm guessing that was a "this post was pretty sweet" comment right? ...right? ;)
0

dorknexus Icon

11 July 2010 - 01:46 AM
yes, i approve.
0