
Parsing XML In Java Part 2: DOM

Part 1, covering SAX parsing, can be found here.

DOM, short for Document Object Model, is the second popular XML parsing solution.

The main difference between DOM and SAX is the fact that DOM stores the entire document in memory, allowing one to "revisit" data without the need to parse again. Of course this results in higher memory usage, but is appropriate when the document must be accessed repeatedly or out of sequence.

The App:

If you did not read part one, I encourage you to do so. I'm using the same program, but substituting DOM in for SAX. A UML diagram:

Attached Image

You'll notice that DOM parsing does not require a Handler. In a sense the design is cleaner in this version, since the GUI maintains its own instance of DICHead that is updated upon parse requests, but I'm getting ahead of myself:

How DOM Works:

Grabbing information from Dream.In.Code requires a URL address (however, XML parsing can be done from a regular File as well). The user provides the member ID (which can be found by visiting anyone's profile). The path is the same except for that specifier: http://www.dreaminco...l.php?showuser=.

Setting up a DOM Parser in Java:
  • Get an instance of DocumentBuilderFactory
  • Assign an instance of DocumentBuilder from yourFactoryVariable.newDocumentBuilder()
  • Assign the result of yourDocumentBuilder.parse() to a Document variable


In code that would look like:
//out of order w/o appropriate catches for illustrative purposes
private DocumentBuilderFactory factory;
private DocumentBuilder parser;
private Document dom;
//...
factory = DocumentBuilderFactory.newInstance();
parser = factory.newDocumentBuilder();
//...
dom = parser.parse(path+userID);
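
To try the three setup steps without a network connection, here is a minimal sketch that parses a small in-memory XML string instead of the Dream.In.Code URL. The class name DomSetupSketch, the helper parseXml, and the sample XML are made up for illustration; only the javax.xml.parsers calls come from the steps above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomSetupSketch {
    // Hypothetical helper: runs the three setup steps on an XML string.
    static Document parseXml(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder parser = factory.newDocumentBuilder();
            // parse() also accepts a File, an InputStream, or a URI string
            return parser.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped so callers of this sketch need no checked-exception handling
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample shaped like the start of a profile document
        String sample = "<ipb><profile><id>46711</id><name>KYA</name></profile></ipb>";
        Document dom = parseXml(sample);
        System.out.println("Root element: " + dom.getDocumentElement().getTagName());
    }
}
```

In the real app the three exception types are caught separately so the user gets a specific message; the single catch here is only to keep the sketch short.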



A Document contains (from the specification):

Quote

The DOM presents documents as a hierarchy of Node objects that also implement other, more specialized interfaces. Some types of nodes may have child nodes of various types, and others are leaf nodes that cannot have anything below them in the document structure...

The DOM also specifies a NodeList interface to handle ordered lists of Nodes,


At the risk of repeating myself, this means that the entire "tree" is stored in memory after our call to parse(). Thus, we can jump to the end, swim in the middle, or ignore it completely; it's up to us. We have no limitations, at least from the parsing end.
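
As a rough illustration of that freedom, the sketch below parses once and then reads the tree several times in an arbitrary order. The class DomRevisitSketch, its helpers, and the sample values (including the posts count) are assumptions for the example, not part of the app.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomRevisitSketch {
    // Hypothetical helper: builds the in-memory tree from a string.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Text of the first element with the given tag, or "" if it is absent.
    static String textOf(Document dom, String tagName) {
        NodeList nl = dom.getElementsByTagName(tagName);
        return nl.getLength() > 0 ? nl.item(0).getTextContent() : "";
    }

    // Names of the element children of the first <profile>, in document order.
    static List<String> profileChildNames(Document dom) {
        List<String> names = new ArrayList<>();
        Node profile = dom.getElementsByTagName("profile").item(0);
        if (profile == null) return names;
        NodeList children = profile.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) names.add(child.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up sample data
        Document dom = parse("<ipb><profile><id>46711</id><name>KYA</name>"
                + "<posts>100</posts></profile></ipb>");
        // One parse, many visits, in any order
        System.out.println(textOf(dom, "posts"));    // jump to the end
        System.out.println(textOf(dom, "id"));       // back to the start
        System.out.println(profileChildNames(dom));  // walk the middle
    }
}
```

With SAX, each of those three reads would need either another parse or state saved by the handler during the first pass.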

The DIC XML Structure:

Open up a profile from the link above. Notice the layout. Since all we want is the profile information, our lives are really easy: there's only one <profile> tag (if there were more it wouldn't make sense).

Example:

<ipb>
<profile>
<id>46711</id>
<name>KYA</name>
//....etc.....



Since there's only one node we want to deal with, there isn't any need for fancy tree traversing or any of its ilk.

Steps to get basic profile information:
  • Obtain a handle to the "root"
  • Get a List of nodes that have the information we want [use their tagName]
  • Get a handle to a node in the list (in our case only one, the first)
  • Grab the details (see below)


Which looks like this:

private void updatePerson(){
        Element root = dom.getDocumentElement();
        NodeList names = root.getElementsByTagName("profile");
        if(names.getLength() > 0){ //never null, just empty when nothing matches
            //grab only the first one out of the tree
            //there's only one in our case anyway <profile>
            Element first = (Element)names.item(0); 
            fillOutDetails(first);
        }
    }



We now have a handle to the profile node, our "root". Next we need to extract the information from its child nodes (Name, Date Joined, Photo, Group, and Total Posts). We could access the child nodes in the above method, but that's not really what we want. We want the specific text associated with each tag. Since this would involve multiple calls that perform essentially the same steps (with differing tag specifiers), let's throw it into its own method.

Steps:
  • Get a NodeList associated with the tag we want
  • Get an Element handle to the first instance in our list
  • Assign the text value to a String
  • Return the String


Again, we are getting off easy since there is only one of each node in the tree. As long as the node list is not empty (getElementsByTagName() returns an empty list, never null, when nothing matches), we can safely assume the first element is the tag we want (based on studying the XML structure in the profile).

In code:

private String getTextValue(Element ele, String tagName) {
		String textVal = null;
		NodeList nl = ele.getElementsByTagName(tagName);
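		if(nl.getLength() > 0) { //never null, just empty when the tag is missing
			Element el = (Element)nl.item(0);
			//an empty tag has no child node to read
			if(el.getFirstChild() != null) {
				textVal = el.getFirstChild().getNodeValue();
			}
		}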
		return textVal;
	}



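The curious among you might ask: why are we using getNodeValue() instead of getTextContent()? According to the API, what getTextContent() returns depends on the type of node: for a text node it is the same as getNodeValue(), while for an element it is the concatenated text of everything beneath it. Since we know ahead of time that each tag we read holds a single text node, getTextContent() is a viable option; however, getNodeValue() on the first child hands back exactly that one node's value and nothing else, so I consider it the better choice in case we ever deal with other node types.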
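
The difference is easiest to see on a tag that wraps another tag. The following sketch compares the two calls on <name> and on <group>; the class NodeValueSketch and its helpers are hypothetical and exist only for this comparison.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NodeValueSketch {
    // Hypothetical helper: parses the string and returns the first matching element (or null).
    static Element firstElement(String xml, String tagName) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (Element) dom.getElementsByTagName(tagName).item(0);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Same read as getTextValue() in this post: the value of the first child node.
    static String firstChildValue(Element el) {
        Node first = el.getFirstChild();
        return first == null ? null : first.getNodeValue();
    }

    public static void main(String[] args) {
        String xml = "<profile><name>KYA</name>"
                + "<group><span>Moderators</span></group></profile>";
        Element name = firstElement(xml, "name");
        Element group = firstElement(xml, "group");
        // <name> holds one text node, so both calls agree
        System.out.println(firstChildValue(name) + " / " + name.getTextContent());
        // <group> holds an element; an element's node value is null,
        // while getTextContent() gathers the text of its descendants
        System.out.println(firstChildValue(group) + " / " + group.getTextContent());
    }
}
```

This is also why the group name is read through "span" rather than "group" later on: the first child of <group> is an element, not text.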


Adding a Splash of Color:

The app is a little bland, so let's add a little color, specifically the group color of the profile in question. The XML carries the color data in the attributes of the <span> tag; span is a child node of group. Since we have been dealing with "single" tags it is better to use "span" as the tag name rather than "group"; otherwise we'd have to write a separate method for getting past that second "layer".

Since Java doesn't support default parameters, we call a wrapper method with just the node, and it passes the tag name "span" on to the method that does the actual work. The tag looks like this:

<span style="color:blue; font-style: italic; font-weight: bold">Moderators</span>

//or 

<span style="color:#DF0000; font-style: italic; font-weight: bold">Webmaster</span>



Notice the difference in color coding; I'll address it below.

We need to get the style attribute of the span tag, then take a substring to get the HTML color code, which we parse with the Integer class (HTML color codes are hexadecimal, base 16) into an RGB value to pass to a new Color object.

However, since the Moderators and Admins have a color:colorName instead of an HTML code, we grab the text of the tag (the actual group name is the result) and then hard code two conditionals. There really isn't a good way (that I've been able to find) to change a color name String such as "blue" into a Color.
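
One possible alternative to matching on the group name is to read the color declaration itself out of the style attribute. The sketch below is only that, a sketch: the class StyleColorSketch, the method fromStyle, and the two-entry name table are assumptions made for illustration, and the table covers just the names mentioned in this post. Color.decode() is a standard AWT method that accepts "#RRGGBB" strings.

```java
import java.awt.Color;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class StyleColorSketch {
    // Hypothetical lookup table, deliberately tiny; extend as needed.
    private static final Map<String, Color> NAMED = new HashMap<>();
    static {
        NAMED.put("blue", Color.BLUE);
        NAMED.put("green", new Color(0x008000)); // the CSS color "green" is #008000
    }

    // Finds the "color" declaration in a style attribute and converts its value.
    // Falls back to black when nothing usable is found.
    static Color fromStyle(String style) {
        if (style == null) return Color.BLACK;
        for (String declaration : style.split(";")) {
            String[] parts = declaration.split(":", 2);
            if (parts.length == 2 && parts[0].trim().equalsIgnoreCase("color")) {
                String value = parts[1].trim().toLowerCase(Locale.ROOT);
                if (value.matches("#[0-9a-f]{6}")) {
                    return Color.decode(value); // handles the leading '#'
                }
                Color named = NAMED.get(value);
                if (named != null) return named;
            }
        }
        return Color.BLACK;
    }

    public static void main(String[] args) {
        System.out.println(fromStyle("color:blue; font-style: italic; font-weight: bold"));
        System.out.println(fromStyle("color:#DF0000; font-style: italic; font-weight: bold"));
    }
}
```

Compared with the fixed substring indices, this does not depend on the color declaration being first in the attribute or on the group names staying the same, at the cost of maintaining the name table by hand.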

It's prudent to have a backup plan in case the method fails; here we default to black if something goes amiss.

private Color getGroupColor(Element ele){
        Color test = getColor(ele, "span"); //Java doesn't have default parameters
        if (test == null) return Color.BLACK; //default
        else return test; //otherwise color! 
    }

    private Color getColor(Element ele, String tagName){
        NodeList nl = ele.getElementsByTagName(tagName);
        if (nl.getLength() > 0){ //never null, just empty when the tag is missing
            try{
                Element el = (Element)nl.item(0);
                String data = el.getAttribute("style");
                String groupType = el.getParentNode().getTextContent();
                //inconsistency in XML attribute coding
                //admins and mods have color:blue/green tag
                //everyone else has an HTML Hex color code
                //hardcoded until fixed server side
                if(groupType.equals("Moderators")){
                    return Color.BLUE;
                }
                else if (groupType.equals("Admins")){
                    return Color.GREEN.darker();
                }
                else{
                    String color = data.substring(6, 13); //grab the HTML color code
                    color = color.substring(1); //get rid of the '#'
                    return new Color(Integer.parseInt(color, 16));
                }
            }
            catch(Exception e){
                //catch it here for a specific message rather than in the actionlistener
                JOptionPane.showMessageDialog(null, "Error retrieving color," +
                        " defaulting to black.");
                return null;
            }
        }
        else return null;
    }




Some screen shots of the program (now in Technicolor!):

Attached Image

Source:

/**
 * @author Knowles
 * Sample DOM Parser entry point
 */

public class Reader {
    public static void main(String[] args){
        MemberPanelDOM test = new MemberPanelDOM();
    }
}



import java.awt.Color;
import java.awt.Dimension;
import java.awt.GridLayout;
import java.awt.Toolkit;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.IOException;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JOptionPane;
import javax.swing.JPanel;
import javax.swing.JTextField;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/**
 *
 * @author Knowles
 * Same as MemberPanel except for DOM XML parsing
 */
public class MemberPanelDOM extends JFrame{

    //Our DOM goodies
    private final String path = "http://www.dreamincode.net/forums/xml.php?showuser=";
    private DocumentBuilderFactory factory;
    private DocumentBuilder parser;
    private Document dom;
    //GUI
    private JPanel dataHolder;
    private JLabel name, joinDate, group, numPosts, pic;
    private JTextField memberInput;
    private JButton parseMember;
    //location
    private Dimension screenCoords;
    private final int APP_WIDTH = 200, APP_HEIGHT = 400;
    //Object
    private DICHead thePerson;

    public MemberPanelDOM(){
        //Parser setup
        try{
            factory = DocumentBuilderFactory.newInstance();
            parser = factory.newDocumentBuilder();
            //single instance, no handler 
            thePerson = new DICHead();
        }
        catch (Exception e){
            e.printStackTrace();
        }
        //GUI
        screenCoords = Toolkit.getDefaultToolkit().getScreenSize();
        setSize(APP_WIDTH, APP_HEIGHT);
        setTitle("DIC XML");
        setLocation(screenCoords.width/2 - APP_WIDTH/2, screenCoords.height/2 - APP_HEIGHT/2);
        setDefaultCloseOperation(EXIT_ON_CLOSE);
        getContentPane().setLayout(new GridLayout(2,1));
        dataHolder = new JPanel();
        pic = new JLabel();
        name = new JLabel("Name: ");
        joinDate = new JLabel("Join Date: ");
        group  = new JLabel("Group: ");
        numPosts = new JLabel("Total Posts: ");
        memberInput = new JTextField("Enter user number...");
        parseMember = new JButton("Parse Details");
        parseMember.addActionListener(new ActionListener(){
            public void actionPerformed(ActionEvent e){
                //specific error catching--user information
                try{
                    int userID = Integer.parseInt(memberInput.getText());
                    if (userID <= 0) throw new NumberFormatException();
                    dom = parser.parse(path+userID);
                    updatePerson();
                }
                catch(NumberFormatException ex){
                    JOptionPane.showMessageDialog(null, "Please enter a valid number");
                }
                catch (SAXException er){
                    JOptionPane.showMessageDialog(null, "Error Parsing. Please try again.");
                }
                catch (IOException err){
                    JOptionPane.showMessageDialog(null, "IO Issue. Please try again");
                }
            }
        }
        );

        dataHolder.setLayout(new GridLayout(6,1));
        dataHolder.add(name);
        dataHolder.add(joinDate);
        dataHolder.add(group);
        dataHolder.add(numPosts);
        dataHolder.add(memberInput);
        dataHolder.add(parseMember);

        add(pic);
        add(dataHolder);
        validate();
        setVisible(true);
    }

    private void updatePerson(){
        Element root = dom.getDocumentElement();
        NodeList names = root.getElementsByTagName("profile");
        if(names.getLength() > 0){ //never null, just empty when nothing matches
            //grab only the first one out of the tree
            //there's only one in our case anyway <profile>
            Element first = (Element)names.item(0); 
            fillOutDetails(first);
        }
    }

    private void fillOutDetails(Element node){
        thePerson.setImage(getTextValue(node, "photo"));
        thePerson.setName(getTextValue(node, "name"));
        thePerson.setJoinDate(getTextValue(node, "joined"));
        //since the span tag is inside the group tag, use the "inner most" tag
        //doesn't apply to other data, they are solo tags
        thePerson.setGroup(getTextValue(node, "span"));
        thePerson.setColor(getGroupColor(node));
        thePerson.setNumPosts(getTextValue(node, "posts"));
        pic.setIcon(thePerson.getPicture());
        name.setText("Name: " + thePerson.getName());
        joinDate.setText("Join Date: " + thePerson.getJoinDate());
        group.setForeground(thePerson.getGroupColor());
        group.setText("Group: " + thePerson.getGroup());
        numPosts.setText("Total Posts: " + thePerson.getTotalPosts());
    }
    
    private String getTextValue(Element ele, String tagName) {
		String textVal = null;
		NodeList nl = ele.getElementsByTagName(tagName);
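		if(nl.getLength() > 0) { //never null, just empty when the tag is missing
			Element el = (Element)nl.item(0);
			//an empty tag has no child node to read
			if(el.getFirstChild() != null) {
				textVal = el.getFirstChild().getNodeValue();
			}
		}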
		return textVal;
	}

    private Color getGroupColor(Element ele){
        Color test = getColor(ele, "span"); //Java doesn't have default parameters
        if (test == null) return Color.BLACK; //default
        else return test; //otherwise color! 
    }

    private Color getColor(Element ele, String tagName){
        NodeList nl = ele.getElementsByTagName(tagName);
        if (nl.getLength() > 0){ //never null, just empty when the tag is missing
            try{
                Element el = (Element)nl.item(0);
                String data = el.getAttribute("style");
                String groupType = el.getParentNode().getTextContent();
                //inconsistency in XML attribute coding
                //admins and mods have color:blue/green tag
                //everyone else has an HTML Hex color code
                //hardcoded until fixed server side
                if(groupType.equals("Moderators")){
                    return Color.BLUE;
                }
                else if (groupType.equals("Admins")){
                    return Color.GREEN.darker();
                }
                else{
                    String color = data.substring(6, 13); //grab the HTML color code
                    color = color.substring(1); //get rid of the '#'
                    return new Color(Integer.parseInt(color, 16));
                }
            }
            catch(Exception e){
                //catch it here for a specific message rather than in the actionlistener
                JOptionPane.showMessageDialog(null, "Error retrieving color," +
                        " defaulting to black.");
                return null;
            }
        }
        else return null;
    }
}



import java.awt.Color;
import java.net.URL;
import javax.swing.ImageIcon;

/**
 *
 * @author Knowles
 * Encapsulates information retrieved from Dream.In.Code profiles
 */
public class DICHead{
    private String name;
    private String joinDate;
    private String group;
    private String numPosts;
    private ImageIcon pic;
    private Color groupColor;

    public DICHead(){
        name = joinDate = group = numPosts = "";
        groupColor = Color.BLACK;
    }

    //mutators (for parser)
    public void setName(String name)            { this.name = name;}
    public void setGroup(String group)          { this.group = group;}
    public void setJoinDate(String joinDate)    { this.joinDate = joinDate;}
    public void setNumPosts(String numPosts)    { this.numPosts = numPosts;}
    public void setColor(Color groupColor)      {this.groupColor = groupColor;}

    //accessors
    public String getName()                     {return name;}
    public String getJoinDate()                 {return joinDate;}
    public String getGroup()                    {return group;}
    public String getTotalPosts()               {return numPosts;}
    public ImageIcon getPicture()               {return pic;}
    public Color getGroupColor()                {return groupColor;}

    public void setImage(String img){
        try{
            URL url = new URL(img);
            pic = new ImageIcon(url);
        }
        catch(Exception e){
            e.printStackTrace();
        }
    }

    //for debug purposes
    public void display(){
        System.out.println("Name: " + name);
        System.out.println("Group: " + group);
        System.out.println("Join Date: " + joinDate);
        System.out.println("Total Posts: " + numPosts);
    }
}



Happy Coding!
