2 Replies - 185 Views - Last Post: 15 December 2013 - 12:55 PM

#1 Jason1

Can't get mechanize to scrape multiple items. Getting undefined me

Posted 14 December 2013 - 10:25 PM

My question is: how do I scrape the list of items in a drop-down menu (a `<select>` element)?

For context, here is the part of the page source that I am trying to scrape:

    <!-- mp_trans_schedule_disable_start -->
    <select name="confirm1$ddlLeavingFromMap" onchange="javascript:setTimeout('__doPostBack(\'confirm1$ddlLeavingFromMap\',\'\')', 0)" id="confirm1_ddlLeavingFromMap" class="input">
    		<option selected="selected" value="-1">Select</option>
    		<option value="429">Beamsville, ON</option>
    		<option value="438">Belleville, ON</option>
    		<option value="277">Brockville, ON</option>
    		<option value="273">Buffalo Airport, NY</option>
    		<option value="95">Buffalo, NY</option>
    		<option value="436">Burlington, ON</option>
    		<option value="424">Cambridge, ON</option>
    		<option value="440">Cobourg, ON</option>
    		<option value="278">Cornwall, ON</option>
    		<option value="434">Fort Erie, ON</option>
    		<option value="428">Grimsby, ON</option>
    		<option value="426">Hamilton GO Centre, ON</option>
    		<option value="425">Hamilton McMaster University, ON</option>
    		<option value="276">Kingston, ON</option>
    		<option value="279">Kirkland, PQ</option>
    		<option value="423">Kitchener, ON</option>
    		<option value="435">Mississauga, ON</option>
    		<option value="280">Montreal, PQ</option>
    		<option value="437">Napanee, ON</option>
    		<option value="124">Niagara Falls, ON</option>
    		<option value="449">Niagara Fallsview Casino, ON</option>
    		<option value="431">Oakville, ON</option>
    		<option value="433">Port Colborne, ON</option>
    		<option value="274">Scarborough, ON</option>
    		<option value="427">St Catharines, ON</option>
    		<option value="448">St. Catharines Brock University, ON</option>
    		<option value="315">TC Kingston</option>
    		<option value="310">Toronto Airport, ON</option>
    		<option value="145">Toronto, ON</option>
    		<option value="439">Trenton, ON</option>
    		<option value="422">Waterloo, ON</option>
    		<option value="432">Welland, ON</option>
    		<option value="275">Whitby, ON</option>
    
    	</select>
    						<!-- mp_trans_schedule_disable_end -->



I first tried selecting the `select` element and the `option` tags directly, with `puts agent.page.parser.css("select").text` and `puts agent.page.parser.css("option").text`, but both turned up `nil`.

I then tried:

    puts agent.page.parser.css("confirm1$ddlLeavingFromMap").text

and

    form.field_with(:name => 'confirm1$ddlLeavingFromMap').options[1].click

Both of these also turned up nil.

I also tried this:

    require 'htmlentities'
    require "mechanize"
    a = Mechanize.new { |agent|
        agent.user_agent_alias = 'Mac Safari'
    }
    @resultHash = {}
    
    a.get("http://ca.megabus.com/BusStops.aspx") do |page|
        parsedPage = page.parser
        @resultHash[:some_data_name] = parsedPage.at_xpath("//h3[@class='right_col']").text.split(/\s+/).join(" ")
    end


For this one, when I check whether it is valid by running `rake -T -A`, I get `undefined method text for nil:NilClass`. I do not know why.

I would be appreciative of any feedback I can get. Thank you in advance!


Replies To: Can't get mechanize to scrape multiple items. Getting undefined me

#2 Jason1

Re: Can't get mechanize to scrape multiple items. Getting undefined me

Posted 14 December 2013 - 10:33 PM

By the way, I am a newbie and do not know how to edit my post, but I wanted to add that I am using Rails 4 with the Mechanize gem.

#3 modi123_1

Re: Can't get mechanize to scrape multiple items. Getting undefined me

Posted 15 December 2013 - 12:55 PM

Per the terms of their site - scraping is a no-no.

Quote

J. You are solely responsible for any and all of your acts and omissions that occur when using the website, and you agree not to engage in unacceptable use of the website, which includes, without limitation, use of the website to:
...
(v) engage in systematic retrieval of data or other content from this website to create or compile, directly or indirectly, a collection, compilation, database or directory without written permission from megabus.com by use of scrapers or other tools; or


http://ca.megabus.com/terms.aspx

I will ask that you do not persist in asking for help scraping this site, and I will be closing the topic. If you have questions on the 'why', feel free to shoot me a PM.
