2 Replies - 603 Views - Last Post: 30 May 2017 - 07:33 AM

#1 jstanley6

  • New D.I.C Head

Reputation: 1
  • Posts: 39
  • Joined: 22-January 17

Help with a web crawler

Posted 26 May 2017 - 10:53 PM

Hello, I am trying to get my code to print, for each link, only a clickable header, the response code for it (either 200 or 300), and whether it is an internal or external link. For example, a line might say "Production Support Code 200 Internal". It's not working out well at all. Here is what I have so far:

website_array = []
click_text_array = []

link.each do |links|
  website_array << links.scan(/href="(.+?)"/)
end
link.each do |links|
  click_text_array << links.scan(/>(.+?)</)
end

puts click_text_array

puts link.count


Prawn::Document.generate("testing.pdf") do |pdf|

  pdf.font "Courier", :size => 24
  pdf.text "Website: #{website}"
  pdf.move_down 10
  pdf.font "Courier", :size => 10
  pdf.move_down 20
  pdf.text "#{website_array}"
  # pdf.te
end



The PDF file then prints out something like [["https://get.adobe.com/shockwave/"]], [["https://www.adobe.com/downloads.html"]], etc.

I don't want it to look like a double array at all, and I need each link to be clickable. I'm trying to figure out how to get all the links separated, but I'm really not sure where to begin. I have a link class, a link checker class, and a webcrawler.rb file, and I am requiring 'net/http' and 'prawn', the only two libraries I am allowed to use. If you can help me understand this project a little more, that would be great. If you need more info, I can provide that as well. Thanks!


Replies To: Help with a web crawler

#2 andrewsw

  • the case is sol-ved

Reputation: 6374
  • Posts: 25,754
  • Joined: 12-December 12

Re: Help with a web crawler

Posted 27 May 2017 - 12:03 AM

Question moved to Ruby forum. Do not post questions in the Snippets section.

#3 NeoTifa

  • NeoTifa Codebreaker, the Scourge of Devtester

Reputation: 4081
  • Posts: 18,152
  • Joined: 24-September 08

Re: Help with a web crawler

Posted 30 May 2017 - 07:33 AM

pdf.text "Website: #{website}"
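I do not see website defined anywhere in what you posted. Unless it is set somewhere else, Ruby will raise a NameError on this line (an undefined local is an error, not a silent nil).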
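
Here is a minimal sketch of defining website before the Prawn block and reusing it for the Internal/External label the first post asks about. The starting URL, the link_kind helper, and the rule "same host means internal" are all assumptions for illustration; none of them come from the posted code. It uses URI, which net/http already loads.

```ruby
require 'uri'

# Hypothetical starting page; the original post never shows how `website` is set.
website = "https://www.adobe.com/"

# Hypothetical helper: returns "Internal" when the link's host matches the
# site's host, otherwise "External". Relative links (no host) count as internal.
# Hosts are compared exactly, so a different subdomain counts as external.
def link_kind(link, website)
  site_host = URI.parse(website).host
  link_host = URI.parse(link).host
  (link_host.nil? || link_host == site_host) ? "Internal" : "External"
end

puts link_kind("https://www.adobe.com/downloads.html", website)
puts link_kind("https://get.adobe.com/shockwave/", website)
puts link_kind("/products.html", website)
```

With website defined like this before Prawn::Document.generate, the "Website: #{website}" line has something to print.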

pdf.text "#{website_array}"
You're not iterating through the array at all, so it prints the to_s of the whole array, which is where the nested brackets come from.
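
A rough sketch of the iteration, assuming link is an array of anchor-tag strings (the sample tags below are made up to match the URLs in the first post). scan with a capture group returns an array of arrays, so pushing its result into another array gives the double nesting; String#[] with a group index returns just the captured string.

```ruby
# Sample anchor tags standing in for `link` from the original post
# (assumed to be an array of HTML strings; the link texts are invented).
link = [
  '<a href="https://get.adobe.com/shockwave/">Shockwave</a>',
  '<a href="https://www.adobe.com/downloads.html">Downloads</a>'
]

# Pull one URL and one click text out of each tag. tag[regex, 1] returns the
# first capture group as a plain string, or nil when nothing matches.
rows = link.map do |tag|
  { url: tag[/href="(.+?)"/, 1], text: tag[/>(.+?)</, 1] }
end

# Build one line of markup per link instead of interpolating the whole array.
lines = rows.map do |row|
  %(<link href="#{row[:url]}">#{row[:text]}</link>)
end

lines.each { |line| puts line }
```

Inside the Prawn block you would then loop with lines.each and call pdf.text line, :inline_format => true for each one. Prawn's inline formatting understands the link tag with an href attribute, which should give you clickable text rather than the printed array.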
