I'm trying to use VB.NET (2010) to get the absolute URL of each image that appears on a specific webpage. So far, I've figured out how to get the URLs from all of the HTML <img> tags, like so...
For Each SeparateImage As HtmlElement In WebBrowser1.Document.Images
    ListBox1.Items.Add(SeparateImage.GetAttribute("src"))
Next
That works perfectly. But what I can't figure out is how to extract image URLs that appear within CSS styles. Like this...
Does anyone know of a simple way to do this? I would need to extract the image URLs not only from inline CSS code, but from external stylesheets as well.
I reckon that one way to do it would be to grab the source code of the entire HTML page and its related CSS stylesheets, and then parse out all of the image URLs using a bunch of string splits and/or regex. But working out the correct absolute URL of each image could get pretty complicated, because of all the different kinds of "relative" URL paths I may come across. For example...
So... it would be really nice if something like this existed...
For Each CSS_Style As HtmlElement In WebBrowser1.Document.Styles
    ListBox1.Items.Add(CSS_Style.GetAttribute("background-image"))
Next
(I know the above code doesn't work... it's just an example). So... does anyone know how I might be able to accomplish something like that? Or have any other ideas that don't involve mind-numbing amounts of regex and logic?
Thanks in advance!
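
For the regex route mentioned above, the relative-path part may be less painful than it sounds, since System.Uri can resolve a relative path against a base URL. Here is a minimal sketch, not a complete solution: it assumes the CSS text has already been downloaded as a string, and the module name, function name, and example.com URLs are made up for illustration. It picks up every url(...) reference, so fonts or @import targets would show up too unless they are filtered out.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module CssImageUrls
    ' Matches url(...) with optional single or double quotes around the path.
    ' The pattern is a simplification and will not handle every CSS edge case.
    Private ReadOnly UrlPattern As New Regex("url\(\s*(['""]?)(?<path>[^'"")]+)\1\s*\)", RegexOptions.IgnoreCase)

    ' Hypothetical helper: returns the absolute form of every url(...) found in cssText.
    ' baseUri is the address the CSS came from (the page for inline styles,
    ' the stylesheet itself for external CSS).
    Public Function ExtractImageUrls(ByVal cssText As String, ByVal baseUri As Uri) As List(Of String)
        Dim results As New List(Of String)
        For Each m As Match In UrlPattern.Matches(cssText)
            Dim raw As String = m.Groups("path").Value.Trim()
            ' Skip embedded data: URIs, which have no separate file to fetch.
            If raw.StartsWith("data:", StringComparison.OrdinalIgnoreCase) Then Continue For
            Dim resolved As Uri = Nothing
            ' Uri.TryCreate handles "../", root-relative and already-absolute paths.
            If Uri.TryCreate(baseUri, raw, resolved) Then
                results.Add(resolved.AbsoluteUri)
            End If
        Next
        Return results
    End Function

    Sub Main()
        ' Example values only; the stylesheet URL is an assumption.
        Dim css As String = "div { background-image: url('../images/bg.png'); } h1 { background: url(/img/logo.gif) no-repeat; }"
        Dim sheetUrl As New Uri("http://example.com/css/site.css")
        For Each u As String In ExtractImageUrls(css, sheetUrl)
            Console.WriteLine(u)
        Next
    End Sub
End Module
```

One thing to watch: relative paths in an external stylesheet resolve against the stylesheet's own URL, not the page's, so each stylesheet needs to be passed in with its own base. For inline styles, WebBrowser1.Url should be the right base.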
