9 Replies - 1216 Views - Last Post: 06 May 2012 - 11:59 AM

#1 Auhn

Jump 3 lines in list

Posted 06 May 2012 - 01:44 AM

Hello!

I'm pretty new to VB.NET and I'm still learning, so please help me understand this endeavor of mine.

I have written code that downloads a webpage's HTML source into a .txt file. Once it is downloaded, the code I posted here adds every line as a string to a list. It is then supposed to find an anchor point as a position reference and jump down the list to the real keyword.

The anchor point is a constant that I will always know, but I need to jump 3 lines below it to the real keyword I'm looking for, which is always unknown to me since it's a variable.

My anchor point is the keyword Hair Color; both the words and the line they're on are unique in the entire source code.
My target keyword is the variable Brown (in this case), which I can't target directly because neither it nor the line it's on is unique, and it will change from query to query.
The HTML table source is always structured like the one below, although pages may vary in layout depending on which keywords apply to the webpage. On every page, the keyword Hair Color is on the third line of its table row and its value is always 3 lines below that. So even though the answer I'm looking for is always 3 lines below my anchor point, the anchor point itself will not always be on the same line of the source code.

HTML code in the .txt file:
	<tr>
		<td class="paramname">
			<b>Hair Color</b>
		</td>
		<td class="paramvalue">
			Brown&nbsp;
		</td>
	</tr>



I'm having a great deal of trouble trying to figure out just how I'm to "jump down" 3 lines of strings in my list to my target keyword.

My code thus far:
Public Sub TestSub()
        ' Creates sr as StreamReader for the .txt file with the HTML code
        Dim sr As System.IO.StreamReader = New System.IO.StreamReader("C:\TEST\downloaded.txt")

        ' Creates the list
        Dim lines As New List(Of String)

        ' Adds the lines of HTML code to the list
        While Not sr.EndOfStream
            lines.Add(sr.ReadLine)
        End While

        ' Loops through the list in search of my anchor point Hair Color
        Dim FoundIt As String = ""
        Dim line As String = ""
        For Each line In lines
            If line.Contains("Hair Color") Then

                ' Now I've found the anchor point and if I output the following:
                FoundIt = line
                MsgBox(FoundIt)
                ' Then it will correctly print out the entire Hair Color html line of code.
                ' This is the point where I need to jump down 3 lines to Brown&nbsp; and target
                ' whatever is on that line.

            End If
        Next
End Sub


Since I'm pretty new to this I may not be using the code correctly, and despite my attempts I don't know how to do that jump.

Can someone please help me understand how to "skip" or "jump" down lines in my list in a predefined way, like:
  • "OK, I found my anchor point, now I'm supposed to jump down 3 lines and pass whatever I find there to a string!"
  • "OK, I found my anchor point, now I'm supposed to go to the second instance of the keyword Brown that I find from here and pass whatever I find there to a string!"


Any help is greatly appreciated!

Replies To: Jump 3 lines in list

#2 DimitriV

Posted 06 May 2012 - 01:46 AM

Ok, you can do something similar here:
For i As Integer = 0 To lines.Count - 1 Step 3 'add 3 each time the loop runs
'do stuff here
Next


Hope this helps with your problem, Auhn!

#3 DimitriV

Posted 06 May 2012 - 01:54 AM

But we'd need to also check if the chunk is a name or a value. I think we can do it like this:
If lines(i - 1).Contains("paramvalue") Then
'set value
ElseIf lines(i - 1).Contains("paramname") Then
'set name
End If



#4 Auhn

Posted 06 May 2012 - 04:47 AM

Thank you for your reply DimitriV!

If I understand you correctly, I should be able to use it like this:
Imports System.IO
Public Module TestModule

    Public Sub TestSub()

        ' Creates sr as StreamReader for the .txt file with the HTML code
        Dim sr As System.IO.StreamReader = New System.IO.StreamReader("C:\TEST\downloaded.txt")

        ' Creates the list
        Dim lines As New List(Of String)

        ' Adds the lines of HTML code to the list
        While Not sr.EndOfStream
            lines.Add(sr.ReadLine)
        End While

        ' Loops through the list in search of my anchor point Hair Color
        Dim FoundIt As String = ""
        Dim line As String = ""

        For Each line In lines
            If line.Contains("Hair Color") Then

                For i As Integer = 0 To lines.Count - 1 Step 3
                    If line(i - 1).Contains("paramvalue") Then
                        ' Now my target keyword should be selected and I should be able to assign it to a string
                        FoundIt = line
                        MsgBox("Congratulations, you found the keyword: " & FoundIt)
                    ElseIf line(i - 1).Contains("paramname") Then
                        ' This is my anchor point and I should have no use for it
                    End If
                Next

            End If
        Next
    End Sub
End Module


Either I didn't understand the code correctly or I may be doing something wrong because when I run this code I receive an error on line 25:

Quote

If line(i - 1).Contains("paramvalue") Then

The error message is:

Quote

ArgumentOutOfRangeException was unhandled
Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index


I can only ever run that code if I change the offset on the variable i from negative:
For i As Integer = 0 To lines.Count - 1 Step 3
                    If line(i - 1).Contains("paramvalue") Then

                    ElseIf line(i - 1).Contains("paramname") Then
                        
                    End If
                Next


to positive:
For i As Integer = 0 To lines.Count - 1 Step 3
                    If line(i + 1).Contains("paramvalue") Then

                    ElseIf line(i + 1).Contains("paramname") Then
                        
                    End If
                Next


But then the problem is that the selection goes the wrong way, not to mention that I find myself in an endless loop because of it. The output for the variable FoundIt in the message box is always the same: it's the HTML line of code for my anchor point Hair Color, so it's like it doesn't progress beyond that.

Am I doing something wrong?

#5 CharlieMay

Posted 06 May 2012 - 05:29 AM

Have you tried not starting your loop at 0? 0 - 1 = -1, so it will be out of range on the first iteration.

Since you don't need the first line, start at 2, which is the first line you want to find; then the i - 1 will check against the line above it.
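
As a rough sketch of that point (the module name, function name and sample lines here are made up for illustration and are not from the original code), any start index of 1 or more keeps lines(i - 1) inside the list:

```vbnet
Imports System
Imports System.Collections.Generic

Module BoundsSketch
    ' Returns the indexes whose previous line contains the marker.
    ' Starting at 1 (not 0) keeps lines(i - 1) inside the list.
    Function IndexesAfterMarker(lines As List(Of String), marker As String) As List(Of Integer)
        Dim result As New List(Of Integer)
        For i As Integer = 1 To lines.Count - 1
            If lines(i - 1).Contains(marker) Then
                result.Add(i)
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the downloaded HTML.
        Dim lines As New List(Of String) From {
            "<td class=""paramvalue"">",
            "Brown&nbsp;",
            "</td>"
        }
        For Each index As Integer In IndexesAfterMarker(lines, "paramvalue")
            Console.WriteLine(lines(index))
        Next
    End Sub
End Module
```

With a start value of 0 the very first check would read lines(-1), which is what raises the ArgumentOutOfRangeException.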

This post has been edited by CharlieMay: 06 May 2012 - 05:29 AM


#6 Auhn

Posted 06 May 2012 - 06:21 AM

Thank you for your reply CharlieMay!

Quote

Have you tried not starting your loop at 0? 0 - 1 = -1, so it will be out of range on the first iteration.

Since you don't need the first line, start at 2, which is the first line you want to find; then the i - 1 will check against the line above it.


I tried it just now as you suggested and indeed the problem with it being negative and out of range is now solved.

Regardless of what starting position I set for the variable i, the output doesn't appear to be "moving along" down the lines in the list.

When I put a breakpoint on line 22 of my code and set a Watch (Visual Studio 2010) on the variable i and on lines.Count, I expected to see the lines being counted along with their current status. What I see is that lines.Count starts at 888, which looks incorrect to me since that is the last line of the webpage source code (the line is simply </html>). When I step through the loop I see the following happen:
  • i is incremented every time the loop restarts.
  • lines.Count is not incremented and never changes from 888, the end of the webpage source code.
  • Both the paramvalue and paramname conditions in the If statement are always found to be False, so the loop continues and subsequently becomes an eternal loop.

I suspect I'm not doing three things properly:
  • Starting the loop from my anchor point Hair Color (webpage source code line 461, or the string variable line).
  • Setting up the loop properly and stepping down the lines in the list, since the current line always seems to be 888 (webpage source code). My target keyword is on webpage source code line 464.
  • Passing the target keyword from the list to my string variable FoundIt.

Can you/anyone please offer more insight or suggestions I may try? ^^

Here is my updated source code with changes (note that I had a typo in my earlier source code. I have 2 separate declarations: line is a string variable and lines is the list of strings)

Imports System.IO
Public Module TestModule

    Public Sub TestSub()

        ' Creates sr as StreamReader for the .txt file with the HTML code
        Dim sr As System.IO.StreamReader = New System.IO.StreamReader("C:\TEST\downloaded.txt")

        ' Creates the list
        Dim lines As New List(Of String)

        ' Adds the lines of HTML code to the list
        While Not sr.EndOfStream
            lines.Add(sr.ReadLine)
        End While

        ' Loops through the list in search of my anchor point Hair Color
        Dim FoundIt As String = ""
        Dim line As String = ""

        For Each line In lines
            If line.Contains("Hair Color") Then

                For i As Integer = 1 To lines.Count - 1 Step 1

                    ' This statement is, right now, always false.
                    If lines(i - 1).Contains("paramvalue") Then
                        ' Now my target keyword should be selected and I should be able to assign it to a string
                        ' How do I pass the "current line" in the list to a string variable?
                        ' if I try FoundIt = lines I naturally receive a conversion from list error.
                        FoundIt = lines
                        MsgBox("Congratulations, you found the keyword: " & FoundIt)

                    ' This statement is, right now, always false.
                    ElseIf lines(i - 1).Contains("paramname") Then 
                        ' This is my anchor point and I should have no use for it
                    End If

                Next

            End If
        Next
    End Sub
End Module



#7 CharlieMay

Posted 06 May 2012 - 07:20 AM

I guess I'm not understanding this and I don't know if it's all the code for what I think you're doing or that I just don't understand it.

Given that the text below has been placed line by line in a collection called Lines
<tr>
	<td class="paramname">
		<b>Hair Color</b>
	</td>
	<td class="paramvalue">
		Brown&nbsp;
	</td>
</tr>

and you state this is always the format, with the value 3 lines below the name, I would think a simple:
  For i As Integer = 0 To lines.Count - 8 Step 8
    MessageBox.Show(lines(i + 2) & " : " & lines(i + 5))
  Next

would iterate each block of 8 lines and give you the value of the paramname and paramvalue for each block.

I also think there is some html parser that can be used but I've not worked enough with it and would have to do some research myself.

#8 Auhn

Posted 06 May 2012 - 08:33 AM

Thank you for another quick reply!

CharlieMay, on 06 May 2012 - 07:20 AM, said:

Given that the text below has been placed line by line in a collection called Lines
<tr>
	<td class="paramname">
		<b>Hair Color</b>
	</td>
	<td class="paramvalue">
		Brown&nbsp;
	</td>
</tr>

The entire 888-line webpage source code (the code in your example above is just one small part of it) is placed, each line as a string, into a list called lines. So my code reads all 888 lines from a .txt file and adds them to the list as 888 strings, one for each line. My code then scans through this list, using the variable declared as line to compare the current line in the list to my keyword Hair Color. Once I've found that specific string in my list, I want to use its position as the starting point for my next query: look 3 lines below it, take whatever is in that string, pass it into my variable called FoundIt and use it in my software.

The block of code is usually, but not always, 8 lines as in my example. I could use the suggestion you made to search most of the webpage source code, where the keyword I want is in an 8-line code block. But I'm looking for more versatility and, if possible, would like to go through each line manually, starting from my own targeted anchor point, and look for the target keyword, since the block of code could be 8 lines, 10 lines or 20 lines.

Worst case scenario, I will have to code different searches for different keywords. 8-line code blocks could use one search subroutine while a 10-line code block could use another.

The code you suggested, unlike my code, actually does read through the strings in the list:
For i As Integer = 0 To lines.Count - 8 Step 8
  MessageBox.Show(lines(i + 2) & " : " & lines(i + 5))
Next


But the search starts from the beginning, at line 0 of 888, and then steps through all the lines in the code. This is kind of what I'm looking for, except I want:
For i As Integer = 0 To lines.Count - 8 Step 8


To start on the exact same line as the one that contains Hair Color from my previous query. Then I want to do one of the following (option 1 is my current train of thought; option 2 is optional but would be a dream come true for writing my queries):
  • Step down 3 lines, pass whatever string is there to my variable FoundIt, and exit the query.
  • Or, as DimitriV suggested, step down one line at a time and check whether the previous string in the list contains another keyword I specify. If it does, pass the current string to my variable FoundIt and exit the query.
    For i As Integer = 1 To lines.Count - 1 Step 1
    If lines(i - 1).Contains("paramvalue") Then
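
A minimal sketch of the second option, under some assumptions: the helper name FindValueAfterAnchor, the module name and the in-memory sample lines are hypothetical, and List(Of String).FindIndex is used here to locate the anchor before walking down one line at a time.

```vbnet
Imports System
Imports System.Collections.Generic

Module AnchorScanSketch
    ' Finds the anchor line, then walks down from it and returns the first
    ' line whose previous line contains the marker. Returns Nothing if the
    ' anchor or the marker is never found.
    Function FindValueAfterAnchor(lines As List(Of String), anchor As String, marker As String) As String
        Dim anchorIndex As Integer = lines.FindIndex(Function(s) s.Contains(anchor))
        If anchorIndex < 0 Then Return Nothing

        For i As Integer = anchorIndex + 1 To lines.Count - 1
            If lines(i - 1).Contains(marker) Then
                Return lines(i)
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the downloaded HTML.
        Dim lines As New List(Of String) From {
            "<tr>",
            "<td class=""paramname"">",
            "<b>Hair Color</b>",
            "</td>",
            "<td class=""paramvalue"">",
            "Brown&nbsp;",
            "</td>",
            "</tr>"
        }
        Console.WriteLine(FindValueAfterAnchor(lines, "Hair Color", "paramvalue"))
    End Sub
End Module
```

Because the search only begins at the anchor, it should not matter whether the table row is 8, 10 or 20 lines long, as long as the value sits on the line directly after the paramvalue cell opens.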
    
    


I might be trying to overexplain things so here are images that will better explain what I mean.

Query 1:

Posted Image

Query 2:

Posted Image

Does this help to give a better idea of my problems and my headaches?

#9 CharlieMay

Posted 06 May 2012 - 09:55 AM

Let me just ask this.
Would you rather look for "Hair Color" and then find the value, or would you instead like to store each paramname with its associated value?

This would keep you from having to perform a lot of "Ifs" for the various names.

Either way, for now, you can just iterate the list and add however many lines down to the index.

For i As Integer = 0 To lines.Count - 1
    If lines(i).Contains("Hair Color") Then
        MessageBox.Show(lines(i + 3))
    End If
Next


Now, with the question above, this would change. I wouldn't worry about which paramname it found. I would iterate through the lines collection and store each paramname and its paramvalue in a dictionary so that you have a key/value pair (i.e., finding "hair color" in your dictionary would return the value blonde&nbsp;, which could of course be cleaned up so it doesn't show the &nbsp;).
So instead the code would be something like
For i As Integer = 0 To lines.Count - 1
    If lines(i).Contains("paramname") Then
        ' Create a routine to pass lines(i + 1) to so you can strip it down to just the name without <b></b>.
        ' Also pass it lines(i + 4) and strip out the &nbsp; so it only contains the value.
        ' In that routine, add the results to a Dictionary(Of String, String):
        ' myDictionaryVariable.Add(cleanedName, cleanedValue)
    End If
Next

Now you can use the keyvaluepair to find various parameters.
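
Something along these lines might work as a starting point. It is only a sketch: CleanLine, BuildParams, the module name and the sample lines are hypothetical names and data, and it assumes the name is always 1 line and the value 4 lines below the line containing "paramname".

```vbnet
Imports System
Imports System.Collections.Generic

Module ParamDictionarySketch
    ' Strips <b></b> tags, &nbsp; and surrounding whitespace from a raw HTML line.
    Function CleanLine(raw As String) As String
        Return raw.Replace("<b>", "").Replace("</b>", "").Replace("&nbsp;", "").Trim()
    End Function

    ' Builds a name -> value lookup, assuming the name is 1 line and the
    ' value 4 lines below each line that contains "paramname".
    Function BuildParams(lines As List(Of String)) As Dictionary(Of String, String)
        Dim result As New Dictionary(Of String, String)
        ' Stopping at Count - 5 keeps lines(i + 4) inside the list.
        For i As Integer = 0 To lines.Count - 5
            If lines(i).Contains("paramname") Then
                ' Indexer assignment overwrites a repeated name instead of throwing like Add.
                result(CleanLine(lines(i + 1))) = CleanLine(lines(i + 4))
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the downloaded HTML.
        Dim lines As New List(Of String) From {
            "<tr>",
            "<td class=""paramname"">",
            "<b>Hair Color</b>",
            "</td>",
            "<td class=""paramvalue"">",
            "Brown&nbsp;",
            "</td>",
            "</tr>"
        }
        Dim lookup As Dictionary(Of String, String) = BuildParams(lines)
        Console.WriteLine(lookup("Hair Color"))
    End Sub
End Module
```

Looking up a name that is not on the page through lookup(...) throws a KeyNotFoundException, so TryGetValue may be the safer call when a parameter is optional.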

This post has been edited by CharlieMay: 06 May 2012 - 09:58 AM


#10 Auhn

Posted 06 May 2012 - 11:59 AM

CharlieMay, on 06 May 2012 - 09:55 AM, said:

Let me just ask this.
Would you rather look for "Hair Color" and then find the value, or would you instead like to store each paramname with its associated value?

In that case I would first want to look for "Hair Color" and then find the value.

I just tried your code in my query
For i As Integer = 0 To lines.Count - 1
    If lines(i).Contains("Hair Color") Then
        MessageBox.Show(lines(i + 3))
    End If
Next


AND IT WORKS! It does exactly what I needed it to do.
Now that I can target the value, I can easily clean spaces, &nbsp; etc. out of the strings using Replace():
' Cleaning an AnchorKeyword variable as an example:
' Raw string from query: "        <b>Hair Color</b>"
AnchorKeyword = Replace(AnchorKeyword, "        <b>", "")
AnchorKeyword = Replace(AnchorKeyword, "</b>", "")
' Cleaned AnchorKeyword string: "Hair Color"

' Cleaning the FoundIt target variable:
' Raw string from query: "        Brown&nbsp;"
FoundIt = Replace(FoundIt, " ", "")
FoundIt = Replace(FoundIt, "&nbsp;", "")
'Cleaned FoundIt string: "Brown"
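
One caveat with replacing every space: a value made of two words would lose the space between them. A possible alternative, sketched here with a hypothetical CleanValue helper and a made-up "Dark Brown" value, is to remove the &nbsp; entity and then call Trim(), which only strips whitespace (spaces and tabs) from the two ends of the string:

```vbnet
Imports System

Module CleanupSketch
    ' Removes the &nbsp; entity, then trims leading and trailing whitespace.
    ' Spaces inside the value are kept.
    Function CleanValue(raw As String) As String
        Return raw.Replace("&nbsp;", "").Trim()
    End Function

    Sub Main()
        ' Hypothetical raw line: two leading tabs, a two-word value and &nbsp;.
        Console.WriteLine(CleanValue(vbTab & vbTab & "Dark Brown&nbsp;"))
    End Sub
End Module
```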


Now I can successfully retrieve the variable I need, as well as the anchor keyword should I want to use it.

I'm not quite sure how to work with dictionaries, and since you helped me get the other way to work I think I'll let the dictionaries wait until another day when I need them.

I can say that my problem is now solved!

Thank you all for your help and special thanks to you CharlieMay!

I will post the final source code momentarily.

This is the final source code I will use for the query of an already downloaded HTML source code:

Imports System.IO
Public Module SearchQuery

    Public Sub SearchQuerySub()

        ' Creates sr as StreamReader for the .txt file with the HTML code
        Dim sr As System.IO.StreamReader = New System.IO.StreamReader("C:\TEST\DownloadedSourceCode.txt")

        ' Creates the list
        Dim lstLines As New List(Of String)

        ' Adds the lines of HTML code to the list
        While Not sr.EndOfStream
            lstLines.Add(sr.ReadLine)
        End While

        ' Closes the file now that all lines are in the list
        sr.Close()

        ' Creates the placeholder variable that will hold the query result.
        ' Loops through the list by index in search of the anchor point Hair Color.
        ' Can be replaced with any variable from a text box to increase dynamics.
        Dim strQueryResult As String = ""

        ' After the anchor point is located, it will pass
        ' the string 3 lines down from the anchor point
        ' to the variable strQueryResult
        For i As Integer = 0 To lstLines.Count - 1
            If lstLines(i).Contains("Hair Color") AndAlso i + 3 < lstLines.Count Then
                strQueryResult = lstLines(i + 3)
                Exit For
            End If
        Next

        ' Clean the strQueryResult from known junk
        strQueryResult = Replace(strQueryResult, "	", "")
        strQueryResult = Replace(strQueryResult, "&nbsp;", "")

        ' Now you're done and can do what ever you want with the variable in the software
        MsgBox(strQueryResult)

    End Sub
End Module

