File input using sentinels to control input of variable-length records

#1 r.stiltskin

Posted 27 March 2012 - 10:22 AM

This is a short tutorial on the use of sentinels to control file input. A sentinel is a distinct "special" value placed at the end of a file and/or the end of each line or other segment of a file to mark the end of the data, or the end of a particular section of the data. Sentinels provide a straightforward, easy-to-understand means of controlling input loops, particularly for beginners who haven't yet learned the use of more advanced library functions for parsing lines of input.

The following program demonstrates the use of sentinels. It should be read in conjunction with the explanatory notes that follow the code.

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main() {
    const int sentinel = -1;

    // variables needed for the "application"
    string label;
    int  count = 0;
    double  sum = 0, avg;

    // variable used to find the sentinel
    double temp = 0;                  // Note 8

    // open the input file
    ifstream in("sentinel.txt");
    // verify that the input file was opened successfully
    if( !in ) {
        cout << "Unable to open input file. Exiting." << endl;
        return 0;
    }

    // read and process the data
    while ( in >> label ) {           // Note 1
        while( temp != sentinel ) {   // Note 2
            in >> temp;               // Note 3
            if( temp != sentinel ) {  // Note 4
                sum += temp;          // Note 5
                count++;              // count the entries so we can average them
            }
            else {                    // Note 6
                avg = sum/count;
                cout << label << ": " << avg << endl;
            }
        }
        count = 0;                    // Note 7
        sum = 0;
        temp = 0;
    }

    // close the input file
    in.close();
    return 0;
}

Note 1:
This style of controlling input is simple to use and often preferable to explicitly testing for 'eof'. Notice that the extraction operation (reading the data from the file) is done right inside the while statement, and in fact it is the while statement's control expression. When the extraction operation is successful (i.e. some "valid" data is retrieved from the file and stored in label) the expression "(in >> label)" is evaluated as "true". But when the extraction operation fails (there is no more data in the file, or the next data is of a type that can't be converted and stored in this variable label) then the expression "(in >> label)" is evaluated as "false" and ends the while loop.

It is also important to understand that the >> operator generally ignores "leading whitespace". This means that it skips over any blank spaces, 'tab', and 'newline' characters until it finds a non-whitespace character. It then reads one or more non-whitespace characters and tries to convert them into the datatype of your variable (the variable you wrote following the >>). Extraction stops when the next character is not "compatible" with that datatype, or when it reaches another whitespace character.

Be careful when using this style of input: if the while control expression is "true" and you are inside the body of the loop, the first data entry has already been read and stored. Don't try to read that same piece of data again inside the loop. If you do, you will actually overwrite it with a second data entry, in effect discarding the first one.

Note 2:
If we have successfully read the first entry on a line, we are ready to read the rest of that line. This will be done in an "inner" while loop. The inner while loop is controlled by "(temp != sentinel)", so once we find the sentinel, that loop ends. Notice that temp was initialized to 0, so we will go into this loop at least once.

Note 3:
Each data entry will now be read into a variable named temp.

Note 4:
We examine each entry to see if it is the sentinel. If the new entry is not the sentinel, we can process it as "real" data. If it is the sentinel, (the "else" condition) that's the signal that we're finished with that line.

Note 5:
Here, we have found an entry that is "real" data, so we use it in whatever operation is necessary for our application.

Note 6:
Here, we are inside the "else" block: we have found the sentinel so we know there is no more data on that line. We can do whatever processing is necessary to "wrap things up" for that line of input.

Note 7:
We've finished processing a complete line of input, so we can "reset" any temporary variables to prepare them to be used for the next line of the file. Don't forget to reset temp, or we won't be able to re-enter the loop to read the next line.

Note 8:
I used double as the datatype of temp here because the "real" data it is extracting consists of doubles. But the sentinel is an int. When a double is compared with an int, the int is first converted to double, and comparing floating-point values to see if they are "equal" can give unexpected results due to rounding, so it's important to ensure that the value used as sentinel can be represented exactly in floating-point format. A small integer such as -1 can be, so it should be safe to use with any signed numerical data type.

Here is a data file that can be used with the sample program. Notice that this style of input will work correctly even with extraneous blank spaces in and at the ends of lines, and with extra blank lines in the middle or at the end of the file.
Also notice that the "labels" (the first entry on each line) are numbers, even though the datatype of my label variable is string. You should understand that in a text file, everything is actually characters (char), which include letters, digits, punctuation marks, blank spaces, tabs, and 'newlines', and the extraction operator >> "automatically" converts the characters into strings, or ints, or doubles, etc., depending on how you use it in your program.
1 10 5 7 -1
2 7 5 8 3  2 -1  
3 13 6 5   4 -1

4 23 -1
5 1 5 1 3  8 4 4 8 -1 
