3 Replies - 245 Views - Last Post: 18 November 2011 - 06:14 AM

#1 37Liion

Can Somebody Help Me Parse This:

Posted 17 November 2011 - 05:20 PM

<tr class="item">
      <td class="id">2257</td>
      <td class="icon sprites items-1-9-2257-0"></td>
      <td class="name"><a href="/items/2257">Green Music Disc</a></td>
    </tr>

This is one of about 200-300 entries in a table on a website.
I'm trying to get the ID, image, and name of every block on this list:
http://minecraft-ids...medgecombe.com/
But I've never done much CSS or web development in general.
It seems that 1-9 is the version number and 2257 is the item ID; as for the 0, I have no idea.
Getting the ID and name isn't much trouble, but the images are another story. :S
I need the images of the blocks for the Minecraft blueprinting program I am writing for my CSE final project.
I wasn't sure where else to put this, so feel free to move it.


Replies To: Can Somebody Help Me Parse This:

#2 Martyr2

Re: Can Somebody Help Me Parse This:

Posted 17 November 2011 - 05:44 PM

What language are you using? You could pull out the data for each element with a regular expression that contains match groups, create a DOM element and pull out its children through navigation and innerText, or select the elements with something like a jQuery selector if you are using JavaScript. These are just three ways to parse this.

It all depends on what technologies you are using and what you are familiar with. :)
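
For illustration, here is a minimal Python sketch of the DOM-style approach, using only the standard library's `html.parser`. It assumes every row looks like the sample posted above; the names `ItemRowParser` and `parse_items` are made up for this example, and the real page may need extra handling.

```python
from html.parser import HTMLParser

# The sample row from the first post.
SAMPLE = """<tr class="item">
      <td class="id">2257</td>
      <td class="icon sprites items-1-9-2257-0"></td>
      <td class="name"><a href="/items/2257">Green Music Disc</a></td>
    </tr>"""


class ItemRowParser(HTMLParser):
    """Collects id, sprite class, and name from <tr class="item"> rows."""

    def __init__(self):
        super().__init__()
        self.items = []    # finished rows
        self._row = None   # row currently being read, if any
        self._cell = None  # "id" or "name" while inside that <td>

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "tr" and "item" in classes:
            self._row = {"id": "", "icon": "", "name": ""}
        elif tag == "td" and self._row is not None:
            if "id" in classes:
                self._cell = "id"
            elif "name" in classes:
                self._cell = "name"
            elif "icon" in classes:
                # Keep the class that starts with "items-" (assumed to
                # identify the sprite for this row).
                self._row["icon"] = next(
                    (c for c in classes if c.startswith("items-")), "")

    def handle_data(self, data):
        # Text inside the id cell, or inside the <a> in the name cell.
        if self._row is not None and self._cell:
            self._row[self._cell] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._row["id"] = self._row["id"].strip()
            self._row["name"] = self._row["name"].strip()
            self.items.append(self._row)
            self._row = None


def parse_items(html):
    """Return a list of dicts, one per item row found in the HTML."""
    parser = ItemRowParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

Calling `parse_items(SAMPLE)` gives one dict per row, so the same function should work on the whole table once the page source has been downloaded.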

#3 37Liion

Re: Can Somebody Help Me Parse This:

Posted 17 November 2011 - 06:09 PM

I was thinking Python, because that's what I've done most of my "ripping" with. The program I need the images for is written in Java. I've come to the conclusion that all the "images" are actually parts of one very tall (9384 px) image. So I think I will either need to find somewhere else to get my images from or come up with a way to split the image into a bunch of 36x34 images. I'm assuming each block is 34 px tall because 34 is the only number close to 32 that divides evenly into 9384 (9384 / 34 = 276).
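
A rough Python sketch of the splitting idea, assuming the sheet is a single column of tiles, each 36 px wide and 34 px tall as guessed above. `tile_boxes` and `split_sheet` are hypothetical names for this example. The cropping step relies on the third-party Pillow library, and whether tile order matches the order of the table rows is an untested assumption.

```python
import os


def tile_boxes(sheet_height, tile_w=36, tile_h=34):
    """Return (left, upper, right, lower) crop boxes for a one-column sheet.

    Assumes tiles are stacked vertically with no gaps or padding.
    """
    count = sheet_height // tile_h
    return [(0, i * tile_h, tile_w, (i + 1) * tile_h) for i in range(count)]


def split_sheet(sheet_path, out_dir, tile_w=36, tile_h=34):
    """Crop every tile out of the sheet and save it as its own PNG."""
    # Pillow is imported here so tile_boxes works without it installed.
    from PIL import Image

    os.makedirs(out_dir, exist_ok=True)
    with Image.open(sheet_path) as sheet:
        boxes = tile_boxes(sheet.height, tile_w, tile_h)
        for index, box in enumerate(boxes):
            tile = sheet.crop(box)
            tile.save(os.path.join(out_dir, f"tile_{index:03d}.png"))
```

With a 9384 px sheet, `tile_boxes(9384)` yields 276 boxes, which is close to the 200-300 entries in the table.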

#4 Programmist

Re: Can Somebody Help Me Parse This:

Posted 18 November 2011 - 06:14 AM

Notepad++ (or SciTE) has built-in regular expression search/replace. I often reformat and clean up data this way rather than writing one-off scripts to do it.
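
The same search/replace idea can be written as a regular expression with capture groups. This is only a sketch in Python: the pattern is an assumption based on the single sample row in the first post, `ROW_PATTERN` and `rows_to_csv` are made-up names, and a regex like this tends to break as soon as the markup varies.

```python
import re

# The sample row from the first post.
SAMPLE = """<tr class="item">
      <td class="id">2257</td>
      <td class="icon sprites items-1-9-2257-0"></td>
      <td class="name"><a href="/items/2257">Green Music Disc</a></td>
    </tr>"""

# Group 1: item ID, group 2: sprite class, group 3: item name.
ROW_PATTERN = re.compile(
    r'<tr class="item">\s*'
    r'<td class="id">(\d+)</td>\s*'
    r'<td class="icon sprites ([^"]+)"></td>\s*'
    r'<td class="name"><a href="[^"]*">([^<]*)</a></td>\s*'
    r'</tr>'
)


def rows_to_csv(html):
    """Replace each matching row with 'id,sprite,name'."""
    return ROW_PATTERN.sub(r'\1,\2,\3', html)
```

Note that a name containing a comma would need quoting before the result could be treated as real CSV.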
