Page 1 of 1

LINQ by Example 5: LINQ to XML Querying

#1 andrewsw  Icon User is online

  • Fire giant boob nipple gun!
  • member icon

Reputation: 3448
  • View blog
  • Posts: 11,693
  • Joined: 12-December 12

Posted 04 September 2014 - 03:55 PM

To complete this tutorial sequence this part demonstrates querying an XML document with LINQ. (Note that I am not attempting to teach XML, just the LINQ-querying of an XML document.)

Other parts:

LINQ by Example 1: Enumerable Methods A-L
LINQ by Example 2: Enumerable Methods M-Z
LINQ by Example 3: Methods Using IEqualityComparer
LINQ by Example 4: Query Expression Syntax

Other DIC Tutorials:

LINQ To SQL Primer
LINQ To XML

Look through all the DIC tutorials as there are others that are about, or include, LINQ.

Microsoft Links:

LINQ to XML
Getting Started (LINQ to XML)
Programming Guide (LINQ to XML)

Basic Queries (LINQ to XML)
(this is the basis for my examples)

From LINQ to XML Overview:

MS said:

In addition to these LINQ (query) capabilities, LINQ to XML provides an improved XML programming interface. Using LINQ to XML, you can:

  • Load XML from files or streams.
  • Serialize XML to files or streams.
  • Create XML from scratch by using functional construction.
  • Query XML using XPath-like axes.
  • Manipulate the in-memory XML tree by using methods such as Add, Remove, ReplaceWith, and SetValue.
  • Validate XML trees using XSD.
  • Use a combination of these features to transform XML trees from one shape into another.

This quote tells us that there is a lot more to LINQ to XML than just querying an existing XML document. In fact, LINQ to XML provides an entire XML Framework (an API). This tutorial concentrates on using LINQ expressions to query XML data that is similar to the data from the earlier parts of this tutorial. Read the DIC tutorial LINQ To XML for the bigger picture.

The Data

Staff.xml
<?xml version="1.0"?>
<Staff>
    <Member Id="21" Dept="Marketing">
        <Name>Bob Bones</Name>
        <Grade>6</Grade>
        <Salary>20000</Salary>
    </Member>
    <Member Id="23" Dept="Sales">
        <Name>Mary Muggins</Name>
        <Grade>6</Grade>
        <Salary>22000</Salary>
        <BonusRate>0.03</BonusRate>
    </Member>
    <Member Id="24" Dept="Sales">
        <Name>Liz Elbow</Name>
        <Grade>5</Grade>
        <Salary>22500</Salary>
        <BonusRate>0.02</BonusRate>
    </Member>
    <Member Id="27" Dept="Accounts">
        <Name>Dave DIddly</Name>
        <Grade>4</Grade>
        <Salary>28000</Salary>
    </Member>
    <Member Id="28" Dept="Marketing">
        <Name>Mary Pickles</Name>
        <Grade>4</Grade>
        <Salary>27000</Salary>
        <BonusRate>0.025</BonusRate>
    </Member>
    <Member Id="31" Dept="Marketing">
        <Name>Robert Piccalilli</Name>
        <Grade>6</Grade>
        <Salary>18000</Salary>
    </Member>
</Staff>


The above file is used for the first few examples. However, instead, you can use the following fuller version (with addresses) for all of the examples:

StaffAdd.xml
<?xml version="1.0"?>
<Staff>
    <Member Id="21" Dept="Marketing">
        <Name>Bob Bones</Name>
        <Grade>6</Grade>
        <Salary>20000</Salary>
        <Address Type="Postal">
            <Address1>co M Muggins</Address1>
            <Address2>25A Greene Lane</Address2>
            <City>London</City>
        </Address>
    </Member>
    <Member Id="23" Dept="Sales">
        <Name>Mary Muggins</Name>
        <Grade>6</Grade>
        <Salary>22000</Salary>
        <BonusRate>0.03</BonusRate>
        <Address Type="Home">
            <Address1>25A Greene Lane</Address1>
            <City>London</City>
        </Address>
    </Member>
    <Member Id="24" Dept="Sales">
        <Name>Liz Elbow</Name>
        <Grade>5</Grade>
        <Salary>22500</Salary>
        <BonusRate>0.02</BonusRate>
        <Address Type="Home">
            <Address1>133 Waterloo Road</Address1>
            <City>London</City>
        </Address>
    </Member>
    <Member Id="27" Dept="Accounts">
        <Name>Dave Diddly</Name>
        <Grade>4</Grade>
        <Salary>28000</Salary>
        <Address Type="Home">
            <Address1>12B Baker Street</Address1>
            <City>London</City>
        </Address>
    </Member>
    <Member Id="28" Dept="Marketing">
        <Name>Mary Pickles</Name>
        <Grade>4</Grade>
        <Salary>27000</Salary>
        <BonusRate>0.025</BonusRate>
        <Address Type="Home">
            <Address1>12 Coven Road</Address1>
            <City>Ipswich</City>
        </Address>
    </Member>
    <Member Id="31" Dept="Marketing">
        <Name>Robert Piccalilli</Name>
        <Grade>6</Grade>
        <Salary>18000</Salary>
        <Address Type="Home">
            <Address1>400 Albert Drive</Address1>
            <City>London</City>
        </Address>
    </Member>
</Staff>


So the first line of your new console application should be either
XElement root = XElement.Load(@"C:\Users\Andrew\Documents\Staff.xml");

or
XElement root = XElement.Load(@"C:\Users\Andrew\Documents\StaffAdd.xml");

You also need:
using System.Xml.Linq;

Here's the full list:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;      // for StringBuilder
using System.Xml.Linq;


Find Element(s) With a Specific Attribute-Value

Using the first XML document, find all staff members in the "Marketing" department:
    XElement root = XElement.Load(@"C:\Users\Andrew\Documents\Staff.xml");

    IEnumerable<XElement> marketing =
        from el in root.Elements("Member")
        where (string)el.Attribute("Dept") == "Marketing"
        select el;

    foreach (XElement el in marketing)
        Console.WriteLine(el);

    //<Member Id="21" Dept="Marketing">
    //  <Name>Bob Bones</Name>
    //  <Grade>6</Grade>
    //  <Salary>20000</Salary>
    //</Member>
    //<Member Id="28" Dept="Marketing">
    //  <Name>Mary Pickles</Name>
    //  <Grade>4</Grade>
    //  <Salary>27000</Salary>
    //  <BonusRate>0.025</BonusRate>
    //</Member>
    //<Member Id="31" Dept="Marketing">
    //  <Name>Robert Piccalilli</Name>
    //  <Grade>6</Grade>
    //  <Salary>18000</Salary>
    //</Member>


The result of the LINQ expression is an enumerable. It is a collection of XElements that can be iterated.

Each of these XElements can also be a tree-structure, containing other elements (nodes) and/or attributes. Because of this it is not essential for a single LINQ expression to extract all of the values (attribute-values or the content of nodes) that you require, in one pass.




In XML the terms element and node are used almost interchangeably. 'Grade' is a node within the document structure; technically, a text node, as opposed to an attribute node or element node (which can have children). An element refers to everything between a start and end tag, including these tags.

I wouldn't worry about this too much. (Interestingly, the W3C Spec for XML barely mentions the word 'node', but they have been over-ruled by common, and accepted, usage.)




Find Element(s) Having a Child Element With a Specific Value

Find all the staff members of Grade 6:
    XElement root = XElement.Load(@"C:\Users\Andrew\Documents\Staff.xml");

    IEnumerable<XElement> grades6 =
        from el in root.Elements("Member")
        where (int)el.Element("Grade") == 6
        select el;

    foreach (XElement el in grades6)
        Console.WriteLine((string)el.Attribute("Id"));
    // 21, 23, 31


Note the distinction between the two methods Elements() and Element(). It helps, of course, if you know the XML structure in advance!

Find Descendants With a Specific Element Name

Collect all the staff members' names into a string:
    IEnumerable<string> names =
        from seg in root.Descendants("Name")
        select (string)seg;

    string str = names.Aggregate(new StringBuilder(),
        (sb, i) => sb.Append(i + ", "),
        sp => sp.ToString()
    );

    Console.WriteLine(str);
    // Bob Bones, Mary Muggles, etc..


XContainer.Descendants Method :MSDN

MS said:

Returns a collection of the descendant elements for this document or element, in document order.


The difference between Elements() and Descendants() is that Elements() only looks for child (first-descendant) elements, Descendants() looks deep into the element-tree. Use Elements where possible, and appropriate, as Descendants will be substantially slower for a very large document.

Sorting Elements

The following example, and the ones that follow, use the fuller XML document StaffAdd.xml.

Sort the staff: first by their Address's City, then in descending order of their Salary:
    var sortedElements =
        from el in root.Elements("Member")
        orderby (string)el.Element("Address").Element("City"),
                (decimal)el.Element("Salary") descending
        select new {
            StaffName = (string)el.Element("Name"),
            City = (string)el.Element("Address").Element("City"),
            Salary = (decimal)el.Element("Salary")
        };

    foreach (var el in sortedElements)
        Console.WriteLine("Name: {0} City: {1} Salary: {2:C0}",
            el.StaffName, el.City, el.Salary);

    //Name: Mary Pickles City: Ipswich Salary: 27,000
    //Name: Dave Diddly City: London Salary: 28,000
    //Name: Liz Elbow City: London Salary: 22,500
    //Name: Mary Muggins City: London Salary: 22,000
    //Name: Bob Bones City: London Salary: 20,000
    //Name: Robert Piccalilli City: London Salary: 18,000


Note that this example selects an anonymous type, rather than the elements themselves.

Calculations, and Filtering an Optional Element

Calculate, and return, Salary * BonusRate for each staff member, but only if they have a BonusRate:
    IEnumerable<decimal> bonuses =
        from bonusEl in root.Elements("Member").Elements("BonusRate")
        let bonus = (decimal)bonusEl.Parent.Element("Salary") * (decimal)bonusEl
        where bonus >= 200
        orderby bonus
        select bonus;

    foreach (decimal bonus in bonuses)
        Console.WriteLine(bonus.ToString("C0"));
    // 450, 660, 675


The from clause ensures that we do not receive an Exception ("value cannot be null") if we attempt to multiply by a non-existent BonusRate.

Note that we are enumerating all the BonusRate nodes, so we use the Parent property to step-back to the Member, and locate the member's Salary.

This example also demonstrates returning an enumerable of a certain type, decimal in this case. We can use the enumerable method (operator) Sum() on this query:
    decimal totBonus =
        (from bonusEl in root.Elements("Member").Elements("BonusRate")
        let bonus = (decimal)bonusEl.Parent.Element("Salary") * (decimal)bonusEl
        where bonus >= 200
        orderby bonus
        select bonus).Sum();

    Console.WriteLine("{0:C0}", totBonus);       // 1,785


Complex Filtering

Find all the staff members from the Marketing department whose Home address is in London (there is only one):
    IEnumerable<XElement> marketingHomeLondon =
        from el in root.Elements("Member")
        where (string)el.Attribute("Dept") == "Marketing" &&
            (from add in el.Elements("Address")
             where
                 (string)add.Attribute("Type") == "Home" &&
                 (string)add.Element("City") == "London"
             select add)
            .Any()          // any elements match?
        select el;

    foreach (XElement el in marketingHomeLondon)
        Console.WriteLine((string)el.Attribute("Id"));  // 31


This also demonstrates nesting a query expression, and using the Any() method. It is useful to notice that the nesting of the query expression mirrors the nested structure of the XML document. This is useful, but not essential. In particular, using Descendants() is a way to flatten the structure.

Elements Based On Context

We can select elements based on their following, or preceding, siblings, or child or ancestor elements.

Find all Address1 nodes that are immediately followed by a City node:
    // (Address1 immediately followed by City)
    IEnumerable<XElement> items =
        from el in root.Descendants("Address1")
        let z = el.ElementsAfterSelf().FirstOrDefault()
        where z != null && z.Name.LocalName == "City"
        select el.Parent;

    foreach (XElement el in items)
        Console.WriteLine("id = {0}", (string)el.Parent.Attribute("Id"));
    // everyone except Bob Bones


Bob Bones is the only staff member whose Address1 and City are separated, by an Address2.

ElementsAfterSelf() is one of many XContainer Methods.

Is This A Good Question/Topic? 0
  • +

Page 1 of 1