14 Replies - 1737 Views - Last Post: 27 February 2012 - 10:45 PM

#1 jdavi134  Icon User is offline

  • D.I.C Head

Reputation: 42
  • View blog
  • Posts: 225
  • Joined: 26-October 11

only reading in first line of input

Posted 27 February 2012 - 09:08 PM

Hello. I am working on a project for my cs250 class and I cannot figure out where I am going wrong. The program is meant to read in a set number of URLs and count the hits for each one. After counting all of the hits, it prints them out to a file. But when I run the program, it seems that it only processes the first line. I must use an array of structs to do this. The struct includes both the URL and the number of hits for that URL. The assignment also says that I may need to write my own addInOrder and binarySearch functions. I am pretty sure that my binarySearch function works, but I have concerns about my addInOrder function.


Here is my code:

counthits.cpp(main driver program)
#include <cstdlib>
#include <iostream>
#include <string>
#include <fstream>


#include "histogram.h"


using namespace std;


// Read log entries from the provided input stream, extracting the request
// portion from each. If the request is a GET, extract the locator (throwing
// away any part after a '?') and update a count of how many times that locator
// has appeared.
//
// Write the resulting locators and counts (in alphabetic order by locator)
//  in CSV format to the output stream.
//
// - Assume that the input contains a maximum of MaxPages distinct locators.
//   If more distinct locators than this are actually
//   encountered, write nothing to the output stream but print an error
//   message on the standard error stream.
void histogram(const int MaxPages, istream& input, ostream& output);



int main (int argc, char** argv)
{
  if (argc != 4)
    {
      cerr << "Usage: " << argv[0] << " MaxPages inFileName outFileName" << endl;
      return -1;
    }

  // Get MaxPages from the command line
  int MaxPages = atoi(argv[1]);

  // Take input and output file names from the command line
  ifstream in (argv[2]);
  ofstream out (argv[3]);
  histogram (MaxPages, in, out);
  in.close();
  out.close();

  return 0;
}



histogram.cpp (The part that needs fixing is at the bottom of this function)
#include "histogram.h"
//#include "arrayUtils.h"
#include <string>


using namespace std;




struct CountedLocations {
    string Location;
    int numHits;
};








// Extract the quoted request from a line containing a log entry
string extractTheRequest(string logEntry)
{
  string::size_type requestStart = logEntry.find('"');
  string::size_type requestEnd = logEntry.rfind('"');
  if (requestStart != requestEnd)
    {
      return logEntry.substr(requestStart+1, requestEnd-requestStart-1);
    }
  else
    return "";
}


template <typename T>
void addElement (T* array, int& size, int index, T value)
{
  // Make room for the insertion
  int toBeMoved = size - 1;
  while (toBeMoved >= index) {
    array[toBeMoved+1] = array[toBeMoved];
    --toBeMoved;
  }
  // Insert the new value
  array[index] = value;
  ++size;
}




int binarySearch(const CountedLocations list[], int listLength, string searchItem)
{
    int first = 0;
    int last = listLength - 1;
    int mid;
    string midFind;


    bool found = false;


    while (first <= last && !found)
    {
        mid = (first + last) / 2;
        midFind = list[mid].Location;
        if (midFind == searchItem)
            found = true;
        else
            if (searchItem < midFind)
                last = mid - 1;
            else
                first = mid + 1;
    }


    if (found)
        return mid;
    else
        return -1;
}








// Check to see if this is a GET request
bool isAGet (string request)
{
  return request.size() > 3 && request.substr(0,4) == "GET ";
}




// Extract the locator part of a GET request
string extractLocator (string request)
{
  // strip off the GET
  string::size_type locatorStart = 3;


  // Skip any blanks
  while (request[locatorStart] == ' ')
    ++ locatorStart;


  // The locator ends at the first blank or ? after that.
  string::size_type locatorEnd = locatorStart;
  while (request[locatorEnd] != ' ' && request[locatorEnd] != '?')
    ++ locatorEnd;


  string locator = request.substr(locatorStart, locatorEnd-locatorStart);
  return locator;
}


//template <typename T>
int addInOrder (CountedLocations array[], int& size, string value)
{
  // Make room for the insertion
  int toBeMoved = size - 1;
  while (toBeMoved >= 0 && value < array[toBeMoved].Location) {
    array[toBeMoved+1] = array[toBeMoved];
    --toBeMoved;
  }
  // Insert the new value
  array[toBeMoved+1].Location = value;
  ++size;
  return toBeMoved+1;
}






// Read log entries from the provided input stream, extracting the request
// portion from each. If the request is a GET, extract the locator (throwing
// away any part after a '?') and update a count of how many times that locator
// has appeared.
//
// Write the resulting locators and counts (in alphabetic order by locator)
//  in CSV format to the output stream.
//
// - Assume that the input contains a maximum of MaxPages distinct locators.
//   If more distinct locators than this are actually
//   encountered, write nothing to the output stream but print an error
//   message on the standard error stream.
void histogram(const int MaxPages, istream& input, ostream& output)
{
    // Step 1 - set up the data
    // Data should be stored in two parallel arrays, locators and counts.
    // locators[i] will be the i_th locator string and counts[i] will be the
    // count of how many times that particular locator has been requested.

    CountedLocations C [MaxPages];
    int nLocators = 0;


    // Step 2 - read and count the requested locators
    string logEntry;
    getline (input, logEntry);
    while (input)
    {
        
        string request = extractTheRequest(logEntry);
        if (isAGet(request))
        {
            string locator = extractLocator(request);
            int position = binarySearch (C, nLocators, locator);
            if (position >= 0)
            {
                // We found this locator already in the array.
                // Increment its count
                C[position].numHits+=1;
            }
            else
            {
                if (nLocators < MaxPages)
                {
                    // This is a new locator. Add it.
                    position = addInOrder (C, nLocators, locator);
                    // And add a count of one at the corresponding position
                    // in the counts array
                    //int nCounts = nLocators - 1;
                    C[position].numHits = 1;
                    //addElement (CountedLocations, nCounts, position, 1);
                }
                else
                {
                    // Not enough room in the arrays
                    ++nLocators;
                }
            }
        }
        getline (input, logEntry);
    }


    // Step 3 - write the output report
    string Location;
    int Hits;
    if (nLocators <= MaxPages)
    {
        for (int i = 0; i < nLocators; ++i)
        {
            Location = C[i].Location;
            Hits = C[i].numHits;
        }
        output << "\"" << Location << "\"," << Hits << endl;
    }
    else
    {
        cerr << "Input file contains more than " << MaxPages << " locators." << endl;
    }
  // Step 4 - cleanup
}



This is my input:
70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /cocoon/~cs330web/forum/show/_axle/private.gif HTTP/1.1" 302 -
70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /cocoon/~cs330web/forum/show/_axle/feed-icon16x16.png HTTP/1.1" 302 -
70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /axle/Forum/_axle/private.gif HTTP/1.1" 304 -
70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /axle/Forum/fckeditor/editor/fckeditor.html?InstanceName=text&Toolbar=ForumToolbar HTTP/1.1" 304 -
70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /axle/Forum/fckeditor/fckconfig.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Mar/2008:23:31:28 -0500] "GET / HTTP/1.0" 200 2499
127.0.0.1 - - [30/Mar/2008:23:31:29 -0500] "GET / HTTP/1.0" 200 2499

Replies To: only reading in first line of input

#2 bodom658  Icon User is offline

  • Villiage Idiom
  • member icon

Reputation: 113
  • View blog
  • Posts: 1,123
  • Joined: 22-February 08

Re: only reading in first line of input

Posted 27 February 2012 - 09:20 PM

Can you elaborate on 'only does the first line'? What kind of debugging have you attempted?

#3 jdavi134  Icon User is offline

Posted 27 February 2012 - 09:24 PM

By 'only does the first line' I mean that the program seems to process only the first line of the input. After it reads in, counts the hits for, and writes out the first line of the input, it just stops. Is there something wrong with the while loop in my histogram function?

The program is run from the command line, and our instructor has not taught us debugging yet.

#4 bodom658  Icon User is offline

Posted 27 February 2012 - 09:37 PM

Debugging in this case can be very simple, as simple as a print statement in your while loop to indicate each pass and maybe print the parsed request string.

You seem to be utilizing getline correctly, though traditionally it would be recommended to utilize the member function of the istream object itself, i.e. input.getline(logEntry);

Perhaps you can try some of the above and check your input files (paying attention to line endings perhaps). Remember to not try to do everything at once though.
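
On the line-endings point, here is a minimal sketch, assuming the log file was saved with Windows (CRLF) line endings and is read somewhere that does not translate them. `stripTrailingCR` is a hypothetical helper name, not something from the posted code.

```cpp
#include <string>

// Hypothetical helper: remove a trailing '\r' left behind when a CRLF file
// is read with std::getline, which only strips the '\n'.
std::string stripTrailingCR(std::string line)
{
    if (!line.empty() && line[line.size() - 1] == '\r')
        line.erase(line.size() - 1);
    return line;
}
```

Calling it on each `logEntry` right after `getline` would rule out stray carriage returns as a cause.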

EDIT: See the post below mine for a good example. This is sometimes known as "RMS style debugging", though in the long run I'd recommend learning how to use GDB.

This post has been edited by bodom658: 27 February 2012 - 09:45 PM


#5 jimblumberg  Icon User is offline

  • member icon


Reputation: 4098
  • View blog
  • Posts: 12,682
  • Joined: 25-December 09

Re: only reading in first line of input

Posted 27 February 2012 - 09:44 PM

Have you tried printing out your data at different parts of your program using cout? For instance, in your histogram function:
    string logEntry;
    getline (input, logEntry);
    while (input)
    {
        std::cout << logEntry << std::endl; // Add this to see if you read the file properly.

        string request = extractTheRequest(logEntry);
        std::cout << request << std::endl;  // Add this to see if you extracted the request properly.




Using cout is a simple debugging tool that can help locate where the problem occurs.

bodom658, on 27 February 2012 - 10:37 PM, said:

You seem to be utilizing getline correctly, though traditionally it would be recommended to utilize the member function of the istream object itself, i.e. input.getline(logEntry);


There are two versions of getline(): one that works with std::string and one that works with a C-string. You seem to be referring to the C-string version, although you have left off the size of the C-string. The OP is using the correct version for std::string.

Jim

This post has been edited by jimblumberg: 27 February 2012 - 09:45 PM


#6 jdavi134  Icon User is offline

Posted 27 February 2012 - 09:45 PM

I added these and have determined that it is my while loop that is not working. It still only prints out the first line of the input.

I also get the build error below now. Actually, it appeared before I added your suggestions.

c:\program files (x86)\codeblocks\mingw\bin\..\lib\gcc\mingw32\4.4.1\..\..\..\libmingw32.a(main.o):main.c|| undefined reference to `WinMain@16'|
||=== Build finished: 1 errors, 0 warnings ===|

This post has been edited by jdavi134: 27 February 2012 - 09:46 PM


#7 jimblumberg  Icon User is offline

Posted 27 February 2012 - 09:52 PM

For your error message, you need to ensure that you are creating a console application. I believe that message indicates the project is being built as a Win32 GUI program, which looks for WinMain instead of main.

When I added those cout statements above I did see that the loop was reading all the lines in the file. Are you sure your files are opening correctly? You should always ensure that the files open correctly and, if they don't, inform the user.

Also this line:
CountedLocations C [MaxPages];

Causes an error in my compiler. Standard C++ does not allow variable-length arrays: the array size must be a compile-time constant, and MaxPages is only known at run time. Some compilers, GCC among them, accept this as an extension, which is probably why it compiles for you.
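
One portable alternative is a std::vector sized at run time. This is a minimal sketch and assumes the assignment permits std::vector in place of a raw array. `makeTable` is a hypothetical helper name.

```cpp
#include <string>
#include <vector>

struct CountedLocations {
    std::string Location;
    int numHits;
};

// Hypothetical helper: allocate room for MaxPages entries at run time.
// std::vector value-initializes each element, so numHits starts at 0.
std::vector<CountedLocations> makeTable(int MaxPages)
{
    return std::vector<CountedLocations>(MaxPages > 0 ? MaxPages : 0);
}
```

With C++11, `table.data()` gives a pointer that can be passed to functions declared to take `CountedLocations list[]`, such as binarySearch and addInOrder.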


Jim

This post has been edited by jimblumberg: 27 February 2012 - 09:56 PM


#8 bodom658  Icon User is offline

Posted 27 February 2012 - 09:53 PM

That error seems to suggest that there is a problem with your project setup (Did you do something odd to the file with your main function in it?)

And Jim, good to know about istream.getline, it's been a while since I've done file manipulation in C++ (normally, I'm working with straight C, and half the time I'm opting for things like fwrite and fread over the normal conventions)

#9 jdavi134  Icon User is offline

Posted 27 February 2012 - 10:07 PM

I have fixed that error, and it now compiles without errors again. But I still can't find why it does not read the rest of the input. I don't see anything wrong with it at all.

#10 jimblumberg  Icon User is offline

Posted 27 February 2012 - 10:09 PM

Have you checked to ensure that the input file is opening correctly?

Jim

#11 jdavi134  Icon User is offline

Posted 27 February 2012 - 10:12 PM

Is there a way to do this?

#12 jimblumberg  Icon User is offline

Posted 27 February 2012 - 10:16 PM

Yes, you check the status of your stream after you try to open it.

For example when you open your files:
std::ifstream fin("YOURFILENAME");
if(!fin)
{ // File did not open.
   // tell the operator, and then you should probably quit your program.
}



Always check that the files opened correctly before you proceed.
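
A slightly fuller sketch of the same check, wrapped in a function so it can be reused for any file name. `canOpenForReading` and the wording of the message are assumptions, not part of the posted program.

```cpp
#include <fstream>
#include <iostream>

// Hypothetical helper: returns true only if the input file could be opened
// for reading; otherwise reports the problem on standard error.
bool canOpenForReading(const char* fileName)
{
    std::ifstream in(fileName);
    if (!in)
    {
        std::cerr << "Could not open input file: " << fileName << std::endl;
        return false;
    }
    return true;
}
```

In main, something like `if (!canOpenForReading(argv[2])) return 1;` before calling histogram would catch a mistyped file name early. Testing the `in` stream directly right after constructing it works just as well.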

Jim

#13 jimblumberg  Icon User is offline

Posted 27 February 2012 - 10:21 PM

Also note with the input you specified I get the following output:

Quote

70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /cocoon/~cs330web/forum/show/_axle/private.gif HTTP/1.1" 302 -

REQUEST GET /cocoon/~cs330web/forum/show/_axle/private.gif HTTP/1.1

70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /cocoon/~cs330web/forum/show/_axle/feed-icon16x16.png HTTP/1.1" 302 -

REQUEST GET /cocoon/~cs330web/forum/show/_axle/feed-icon16x16.png HTTP/1.1

70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /axle/Forum/_axle/private.gif HTTP/1.1" 304 -

REQUEST GET /axle/Forum/_axle/private.gif HTTP/1.1

70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /axle/Forum/fckeditor/editor/fckeditor.html?InstanceName=text&Toolbar=ForumToolbar HTTP/1.1" 304 -

REQUEST GET /axle/Forum/fckeditor/editor/fckeditor.html?InstanceName=text&Toolbar=ForumToolbar HTTP/1.1

70.161.31.80 - - [30/Mar/2008:23:31:21 -0500] "GET /axle/Forum/fckeditor/fckconfig.js HTTP/1.1" 304 -

REQUEST GET /axle/Forum/fckeditor/fckconfig.js HTTP/1.1

127.0.0.1 - - [30/Mar/2008:23:31:28 -0500] "GET / HTTP/1.0" 200 2499

REQUEST GET / HTTP/1.0

127.0.0.1 - - [30/Mar/2008:23:31:29 -0500] "GET / HTTP/1.0" 200 2499

REQUEST GET / HTTP/1.0


Jim

#14 bodom658  Icon User is offline

Posted 27 February 2012 - 10:24 PM

Take a look at where you are writing your output, in step 3. Look at where your output statement is and where you are iterating through what you want to write out.

It's a simple, silly answer, but it happens to all of us.

#15 jdavi134  Icon User is offline

Posted 27 February 2012 - 10:45 PM

Wow. You're right, that was pretty ridiculous. Lol. I didn't put the output inside of the for loop.

Thank you very much! You just made my day!