9 Replies - 1425 Views - Last Post: 20 February 2011 - 06:09 AM

#1 robert_tonnessen@hotmail.com

A little help getting information from XML document in to an object

Posted 18 February 2011 - 03:37 PM

Hi all, I am trying to build a list of news stories from an XML document. Here is what I have.

I have a class named rssnews (in rssnews.cs) whose constructor takes five strings:

title
description
pubDate
Author
link

I have an XML document that has lots of news stories, like this:
<item>
   <title> title here </title>
   <description> description here </description>
   <pubDate> date here </pubDate>
   <Author> john smith </Author>
   <link> http://sdasdd.com </link>
</item>
<item>
   <title> title here </title>
   <description> description here </description>
   <pubDate> date here </pubDate>
   <Author> john smith </Author>
   <link> http://sdasdd.com </link>
</item>


I need to get each of these into an rssnews object, which in turn goes into a list of rssnews objects, but I just can't seem to navigate this very well. I can see them all in the debugger but I just can't get at them! It must be me, though; they are clearly there.

So here is my code:

List<rssnews> newsList = new List<rssnews>();

public List<rssnews> GetRssNews(string username, string password)
{
    WebClient client = new WebClient();
    CookieContainer cookie = new CookieContainer();
    client.Credentials = new NetworkCredential(username, password, "chester");

    // XmlNamespaceManager and SelectNodes belong to XmlDocument, so the feed is loaded into one
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.LoadXml(client.DownloadString(new Uri("https://portal.chester.ac.uk/_layouts/listfeed.aspx?List=%7B90300E78%2D1E7F%2D468D%2D923E%2D23F2E0C686E0%7D&Source=https%3A%2F%2Fportal%2Echester%2Eac%2Euk%2FLists%2FNews%2FAllItems%2Easpx", UriKind.Absolute)));

    XmlNamespaceManager nsMgr = new XmlNamespaceManager(xmlDoc.NameTable);

    XmlNodeList selectedNodes = xmlDoc.SelectNodes("/rss/channel/item", nsMgr);
    foreach (XmlNode selectedNode in selectedNodes)
    {
        // And here is where I am stuck. I started to try this, but can't figure out
        // what the code should be to get at each of the above five elements within
        // each node to supply to my constructor!
        newsList.Add(new rssnews(){ title = selectedNode});
    }

    return newsList;
}



Any help would be amazing!

This post has been edited by insertAlias: 18 February 2011 - 03:43 PM



Replies To: A little help getting information from XML document in to an object

#2 JackOfAllTrades

Posted 18 February 2011 - 03:57 PM

Read this tutorial; I think you'll find it very helpful.

#3 robert_tonnessen@hotmail.com

Posted 18 February 2011 - 04:10 PM

I took a look at it but don't see how it helps me. I tried this:
 XDocument doc = XDocument.Parse(client.DownloadString(new Uri("https://portal.chester.ac.uk/_layouts/listfeed.aspx?List=%7B90300E78%2D1E7F%2D468D%2D923E%2D23F2E0C686E0%7D&Source=https%3A%2F%2Fportal%2Echester%2Eac%2Euk%2FLists%2FNews%2FAllItems%2Easpx",UriKind.Absolute)));
            
XElement root = doc.Root;
            
            List<XElement> item = root.Descendants("item").ToList();
            var rList = (from items in item
                         select new
                         {
                             title = items.Attribute("title").Value,
                             description = items.Attribute("description").Value,
                             link = items.Attribute("link").Value,
                             author = items.Attribute("author").Value,
                             pubDate = items.Attribute("pubDate").Value,
                             // Modified = DateTime.Parse(row.Attribute("ows_Modified").Value)
                         }).ToList();

and then was thinking of

rssNews.Add(new rssnews(){title = title, description = description, link = link, author = author, pubDate = pubDate});

But I am just clutching at straws! Oh, by the way, the above gives a null reference exception when it hits the var rList line of code.


This post has been edited by robert_tonnessen@hotmail.com: 18 February 2011 - 04:11 PM


#4 CodingSup3rnatur@l-360

Posted 18 February 2011 - 04:58 PM

Hi,

First things first, I would suggest you add a root element to the xml document if the document doesn't have one already. You could call the element 'Books' maybe.

<Books>

//the xml you posted goes here

</Books>


Now, you're getting the null reference exception because you are trying to access attributes, but your XML document has no attributes. Attribute("title") therefore returns null, and reading .Value on that null throws. An example of an attribute is 'genre' in the following line:

<item genre="Sci-Fi"></item> //'genre' is an attribute of 'item'


What you want to access is elements.
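
To make the difference concrete, here is a minimal sketch. The class name AttributeVsElement and the inline sample XML are made up for illustration; only Attribute(), Element() and Value are real LINQ to XML members.

```csharp
using System.Xml.Linq;

public static class AttributeVsElement
{
    // Hypothetical demo: one attribute (genre) and one child element (title).
    public static string Describe()
    {
        XElement item = XElement.Parse(
            "<item genre=\"Sci-Fi\"><title>title here</title></item>");

        // There is no attribute named "title", so this is null.
        // Calling .Value on it would throw a NullReferenceException.
        XAttribute missing = item.Attribute("title");

        // "title" is a child element, so Element() finds it.
        string title = item.Element("title").Value;

        // "genre" really is an attribute, so Attribute() finds it.
        string genre = item.Attribute("genre").Value;

        return (missing == null) + "|" + title + "|" + genre;
    }
}
```

Describe() combines the three results into one string so they can be checked in one go.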

Also, you are filling anonymous types with the different values. However, you stated that you wanted to fill rssnews objects. Why not just use the constructor you mentioned to build the rssnews objects?

You are very nearly there (providing 'doc' contains the valid .xml document of course); you could do something like this:


XDocument doc = XDocument.Parse(client.DownloadString(new Uri("https://portal.chester.ac.uk/_layouts/listfeed.aspx?List=%7B90300E78%2D1E7F%2D468D%2D923E%2D23F2E0C686E0%7D&Source=https%3A%2F%2Fportal%2Echester%2Eac%2Euk%2FLists%2FNews%2FAllItems%2Easpx",UriKind.Absolute)));

//list of rssnews objects
var rList = (from items in doc.Descendants("item")
             select new rssnews(
                                items.Element("title").Value,
                                items.Element("description").Value, 
                                items.Element("pubDate").Value, 
                                items.Element("Author").Value,
                                items.Element("link").Value))
                                .ToList<rssnews>();



The nice thing is that the ToList() method executes the query there and then, so you don't have to loop to execute the query or to fill the list with rssnews objects. That is all you need to generate the list of objects.
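
A small sketch of what "executes there and then" means in practice. DeferredDemo and the sample XML are hypothetical; the point is only that a LINQ query is re-evaluated each time it is enumerated, while the list returned by ToList() is a snapshot.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DeferredDemo
{
    // Returns { items in the snapshot, items seen by the lazy query }.
    public static int[] Run()
    {
        XElement channel = XElement.Parse(
            "<channel><item><title>one</title></item></channel>");

        // Nothing is read yet: this only describes the query.
        IEnumerable<string> lazy =
            channel.Descendants("item").Select(i => (string)i.Element("title"));

        // ToList() runs the query now and copies the results.
        List<string> snapshot = lazy.ToList();

        // Add a second item after the snapshot was taken.
        channel.Add(new XElement("item", new XElement("title", "two")));

        // The snapshot still has one entry; the lazy query sees both items.
        return new[] { snapshot.Count, lazy.Count() };
    }
}
```

So if the list is built once with ToList(), later changes to the document do not affect it.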

:)

This post has been edited by CodingSup3rnatur@l-360: 18 February 2011 - 05:13 PM


#5 robert_tonnessen@hotmail.com

Posted 19 February 2011 - 02:05 AM

Hi, and thanks for the reply. Your explanation was very good, thanks very much. But I never checked the thread until this morning and ended up with the following last night. It works, but it is not tidy. Could you tell me how to optimise it? I am thinking about using rssnews as the object like you said. As you can see, I had to drill down a little into the description element. I am sure that's not the right way, but it works, and I would like to do it a bit neater!

WebClient client = new WebClient();
CookieContainer cookie = new CookieContainer();
client.Credentials = new NetworkCredential(username, password, "chester");
XDocument doc = XDocument.Parse(client.DownloadString(new Uri("https://portal.chester.ac.uk/_layouts/listfeed.aspx?List=%7B90300E78%2D1E7F%2D468D%2D923E%2D23F2E0C686E0%7D&Source=https%3A%2F%2Fportal%2Echester%2Eac%2Euk%2FLists%2FNews%2FAllItems%2Easpx",UriKind.Absolute)));
XElement root = doc.Root;

List<XElement> item = root.Descendants("item").ToList();

for (int i = 0; i <= item.Count - 1; i++)
{
    List<XElement> title = item[i].Descendants("title").ToList();
    List<XElement> author = item[i].Descendants("author").ToList();
    List<XElement> link = item[i].Descendants("link").ToList();
    List<XElement> pubDate = item[i].Descendants("pubDate").ToList();
    List<XElement> description = item[i].Descendants("description").ToList();

    // The description holds HTML as text, so wrap it in a root element and parse it again
    XDocument docz = XDocument.Parse("<root>" + description[0].Value.ToString() + "</root>");
    XElement root2 = docz.Root;
    List<XElement> descriptiontext = root2.Descendants("p").ToList();

    RssNews.Add(new rssnews() { title = title[0].Value.ToString(), author = author[0].Value.ToString(), pubDate = pubDate[0].Value.ToString(), Link = link[0].Value.ToString(), description = descriptiontext[0].Value.ToString() });
}

return RssNews;



#6 CodingSup3rnatur@l-360

Posted 19 February 2011 - 03:00 AM

What exactly are you doing with that description element? Other than that part, can you not just use the code I provided in my previous post, as it seems pretty neat I think ;). If not, tell me what it doesn't do that you need it to do, and we'll take it from there :)

This post has been edited by CodingSup3rnatur@l-360: 19 February 2011 - 03:02 AM


#7 robert_tonnessen@hotmail.com

Posted 19 February 2011 - 08:55 AM

Hi CodingSup3rnatur@l-360, yes, your code is nice and neat :) What I should have made clearer is that the second lot of code I posted was written last night before I got your response. It's not that your code was not neat; I just didn't have it when I wrote mine.

I am going to use yours, but the problem I have is that the description element contains more elements, and the actual info I need is in a paragraph element, hence that messy bit of code in my second attempt. So, how would I use your code but drill down into the <p> element within the <description> element?

#8 CodingSup3rnatur@l-360

Posted 19 February 2011 - 09:04 AM

Could you post an example description element with all its child elements please, and identify exactly what you want to retrieve :) And do you want to combine all the elements within each description tag into one single description string that can be put in an rssnews object?

Thanks.

#9 robert_tonnessen@hotmail.com

Posted 19 February 2011 - 11:05 AM

Hi, here is a sample of what you get within the item element. Hope this helps. From the description tag I just want a string; the string is in the <p> element within the <description> tag.


 <item>
      <title>Reminder! Faculty BELL Lunchtime Research Seminars: 1st and 16th March 2011</title>
      <link>https://portal.chester.ac.uk/Lists/News/DispForm.aspx?ID=647</link>
      <description><![CDATA[<div><b>Body:</b> <div class="ExternalClassCC4D9A7A46FC467B92E27C16B550F794"><p>​Places are still available on the forthcoming seminar titled:<br /><br />&quot;Value Based Management in SMEs&quot; by Professor Bernd Britzelmaier, Professor of Controlling, Finance and Accounting, Pforzheim University, Germany<br /><br />Date: Tuesday 1st March 2011<br />Time: 12.30-2.00pm<br />Venue: CWE 218 Westminster Building. <br /><br />If you would like to book a place please contact me at sandra.carr@chester.ac.uk or 01244 511830 (ext 1830).<br /><br />For catering purposes please book the above session by Thursday 24th February 2011.<br /><br /><br />Also coming in March 2011<br /><br />&quot;Being clear about practitioner enquiry: the logic of situated knowledge generation&quot; presentation by <br />Dr. Jon Talbot, Senior Lecturer - Professional Development, Faculty of Business Enterprise &amp; Lifelong Learning.<br /><br />Date: Wednesday 16th March 2011<br />Time: 1.00-2.30pm<br />Venue: CWE 219/2 Westminster Building.<br /><br />If you would like to attend please email me at sandra.carr@chester.ac.uk or telephone 01244 511830 (ext 1830) to book a place. <br /><br />Details of more Lunchtime Seminars will follow on SharePoint.</p></div></div>
<div><b>Expires:</b> 17/03/2011</div>
]]></description>
      <author>Simon Fish</author>
      <pubDate>Fri, 18 Feb 2011 16:14:44 GMT</pubDate>
      <guid isPermaLink="true">https://portal.chester.ac.uk/Lists/News/DispForm.aspx?ID=647</guid>
    </item>




#10 CodingSup3rnatur@l-360

Posted 20 February 2011 - 06:09 AM

Well, here is a possible solution I suppose:

XDocument doc = XDocument.Parse(client.DownloadString(new Uri("https://portal.chester.ac.uk/_layouts/listfeed.aspx?List=%7B90300E78%2D1E7F%2D468D%2D923E%2D23F2E0C686E0%7D&Source=https%3A%2F%2Fportal%2Echester%2Eac%2Euk%2FLists%2FNews%2FAllItems%2Easpx",UriKind.Absolute)));

var rList = (from items in doc.Descendants("item")
             let parsedDescription = 
             Regex.Replace(Regex.Match(items.Element("description").Value, @"<p>(.+?)</p>").Groups[1].Value.Replace("<br />", Environment.NewLine), 
             @"&[^\s]*;", String.Empty)
             select new rssnews(items.Element("title").Value, parsedDescription, items.Element("pubDate").Value,
             items.Element("author").Value, items.Element("link").Value)).ToList();



That should work for the .xml you provided I think. However, bear in mind it is quite vulnerable and inflexible (but where would be the fun in it if I gave you a perfect solution ;)). Malformed html or xml could cause problems for it, for example. It is targeting the specific format you provided. It removes html entities and replaces <br /> tags with line breaks. If a pair of correctly formed 'p' tags is not found, 'description' will be an empty string. Bear in mind that you should use Regex matching very sparingly and cautiously when parsing xml/html...
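
For comparison, here is one possible variation on the same idea, offered only as a sketch. It still relies on Regex, so the same caution applies, but it decodes HTML entities with WebUtility.HtmlDecode instead of deleting them, tolerates attributes on the 'p' tag, and strips any leftover inline tags. DescriptionText and FirstParagraph are hypothetical names, not part of the code above.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class DescriptionText
{
    // Hypothetical helper: returns the text of the first <p>...</p> pair,
    // or an empty string when the input is empty or has no such pair.
    public static string FirstParagraph(string descriptionHtml)
    {
        if (string.IsNullOrEmpty(descriptionHtml))
        {
            return string.Empty;
        }

        // Singleline lets '.' cross line breaks inside the paragraph.
        Match match = Regex.Match(
            descriptionHtml,
            @"<p(?:\s[^>]*)?>(.*?)</p>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        if (!match.Success)
        {
            return string.Empty;
        }

        // Turn <br>, <br/> and <br /> into line breaks.
        string text = Regex.Replace(
            match.Groups[1].Value, @"<br\s*/?>", Environment.NewLine,
            RegexOptions.IgnoreCase);

        // Drop any other tags that remain inside the paragraph.
        text = Regex.Replace(text, @"<[^>]+>", string.Empty);

        // Decode entities such as &quot; and &amp; rather than removing them.
        return WebUtility.HtmlDecode(text).Trim();
    }
}
```

In the query it would take the place of the nested Regex calls, for example: let parsedDescription = DescriptionText.FirstParagraph(items.Element("description").Value).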

Hopefully that will give you a starting point at least.

To gain real flexibility and security, I think you would be better off breaking each stage of the parsing into separate methods so that the relevant validation, purification, etc. can be performed. Put this validation into a separate class. You can load raw XML into that class, parse it, purify it, and then pass the clean, valid data to a class that holds the business logic of the application, thus keeping the messy parsing algorithms and validation well away from the core logic.
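
One way that separation might look, as a rough sketch only. NewsItem and RssFeedParser are hypothetical names (NewsItem stands in for your rssnews class), and the validation rule (skip items without a title) is just an assumed example. The (string) cast on an XElement returns null for a missing element instead of throwing.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for the rssnews class: plain data, no parsing.
public sealed class NewsItem
{
    public string Title { get; private set; }
    public string Description { get; private set; }
    public string PubDate { get; private set; }
    public string Author { get; private set; }
    public string Link { get; private set; }

    public NewsItem(string title, string description, string pubDate,
                    string author, string link)
    {
        Title = title;
        Description = description;
        PubDate = pubDate;
        Author = author;
        Link = link;
    }
}

// Hypothetical parser class: all XML handling and validation lives here.
public static class RssFeedParser
{
    public static List<NewsItem> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        return doc.Descendants("item")
                  .Select(item => new NewsItem(
                      Clean((string)item.Element("title")),
                      Clean((string)item.Element("description")),
                      Clean((string)item.Element("pubDate")),
                      Clean((string)item.Element("author")),
                      Clean((string)item.Element("link"))))
                  // Assumed rule for illustration: an item needs a title.
                  .Where(news => news.Title.Length > 0)
                  .ToList();
    }

    // Missing elements become empty strings; surrounding whitespace is trimmed.
    private static string Clean(string value)
    {
        return (value ?? string.Empty).Trim();
    }
}
```

The rest of the application would then only ever see NewsItem objects and never touch the XML.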

My query lays the foundations for you though, without doing the whole thing for you :). You can loop through the descendants of the 'item' tags, and go through each one by one, just like I have done in my query, but allowing for malformed tags, decoupling the parsing algorithms from this specific layout of xml a bit more etc :)

This post has been edited by CodingSup3rnatur@l-360: 20 February 2011 - 07:47 AM
