
Scrape a webpage / PHP script

You know how you can right-click a website to view its source? You can do the same thing in C#: download the page as a string, then save or manipulate it however you'd like. And it's fun! The code below illustrates.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Net;
using System.IO;

namespace WindowsFormsApplication7
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            string url = @"http://goog.com";

            string pageContents = BrowseToPage(url);

            MessageBox.Show(pageContents);
        }

        public static string BrowseToPage(string address)
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(address);

            // A GET request has no body, so no Content-Type header is set.
            request.Method = "GET";
            request.Proxy = null;  // If you aren't using a proxy, this expedites the initial request
            request.KeepAlive = false;

            // Fire off the request and get the response. The using blocks
            // close the connection and the stream even if reading fails.
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream dataStream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(dataStream))
            {
                // Read everything the server returned into one string.
                return reader.ReadToEnd();
            }
        }
    }
}
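
Once the page is in a string, you can work on it like any other text. Here is a minimal sketch of one such manipulation, assuming you want the page's title; the PageTools class and ExtractTitle helper are hypothetical names, and a regular expression is only a rough tool for HTML, so treat this as an illustration rather than a robust parser.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PageTools
{
    // Hypothetical helper: returns the text inside the first <title> element,
    // or an empty string if the page has none.
    public static string ExtractTitle(string pageContents)
    {
        Match match = Regex.Match(
            pageContents,
            @"<title[^>]*>(.*?)</title>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        return match.Success ? match.Groups[1].Value.Trim() : string.Empty;
    }
}
```

You could call it as PageTools.ExtractTitle(pageContents) right after BrowseToPage returns.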



The implication is that you can pass data to and from a web server, using a PHP script as the translator; you just need to know a little PHP. This opens the door to transferring text files, images, MySQL data, and dynamically generated data. For example: string pageContents = BrowseToPage(@"http://example.com/myphpscript.php?cmd=GetHitCount");
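
When the values you send to the script come from user input, they should be escaped before they go into the query string. The sketch below shows one way to build such a URL; the PhpClient class and BuildUrl helper are hypothetical names, and the script URL and cmd parameter are simply the ones from the example above.

```csharp
using System;
using System.Collections.Generic;

public static class PhpClient
{
    // Hypothetical helper: appends escaped key=value pairs to a script URL.
    public static string BuildUrl(string scriptUrl, IDictionary<string, string> parameters)
    {
        var pairs = new List<string>();
        foreach (KeyValuePair<string, string> p in parameters)
        {
            // EscapeDataString percent-encodes spaces, '&', '=' and similar characters.
            pairs.Add(Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value));
        }

        return pairs.Count == 0
            ? scriptUrl
            : scriptUrl + "?" + string.Join("&", pairs.ToArray());
    }
}
```

Calling PhpClient.BuildUrl("http://example.com/myphpscript.php", new Dictionary<string, string> { { "cmd", "GetHitCount" } }) yields http://example.com/myphpscript.php?cmd=GetHitCount, which can then be passed to BrowseToPage.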
