Counting number of words in a text file...


16 Replies - 34527 Views - Last Post: 24 December 2009 - 07:15 PM

#1 x2x3i5x

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 26
  • Joined: 18-December 09

Counting number of words in a text file...

Posted 18 December 2009 - 11:29 PM

Hi,

I was attempting to write a program that counts the number of words in a given text file (sorry, I didn't have the program write a text file first before counting, so you'll probably need to make an output.txt file yourself). The problem I'm having is that the program does not report the right number of words in that text file. (NOTE: this is not a homework assignment, just something I'm doing myself since I'm in the process of learning C programming.) Thanks for all the help in advance!! :)


#include "stdafx.h"
#include <stdio.h>

int main()
{
	int count;
	FILE *f;
	char s;

	count = 0;

	f=fopen("output.txt","r");
	
	while ((s = fgetc(f)) != EOF) 
	{
		
		if ((s = fgetc(f)) != ' ')
			count++;
	}

	fclose(f);
	printf ("\n%d words in output.txt.\n", count);

	return 0;	
}

 

This post has been edited by x2x3i5x: 18 December 2009 - 11:37 PM



Replies To: Counting number of words in a text file...

#2 janotte

  • code > sword
  • member icon

Reputation: 990
  • View blog
  • Posts: 5,141
  • Joined: 28-September 06

Re: Counting number of words in a text file...

Posted 18 December 2009 - 11:35 PM

Welcome to DIC!

Please give us some more details of your problem.
( a ) Does your code compile?
( b ) Any errors or warnings? If there are then share them with us.
( c ) Is the program producing any output?
( d ) How is the actual output different to what you want / expect? Give details and, ideally, examples.

#3 x2x3i5x

Posted 18 December 2009 - 11:41 PM

Thank you for the quick response.

If you did not see the changes I made to the original post, please reread it.

1. My code does compile without any errors.
2. I am using Microsoft Visual C++ 2008 Express Edition to compile the code.
3. I made a simple output.txt file and it has only one word: text.
4. When I run my code from the command prompt (CMD) of Windows Vista, the program tells me that I have 2 words in my text file. Program failed, as everyone knows that a text file containing the word text only has one word.

This post has been edited by x2x3i5x: 18 December 2009 - 11:43 PM

#4 janotte

Posted 18 December 2009 - 11:53 PM

View Postx2x3i5x, on 18 Dec, 2009 - 10:41 PM, said:

Program failed, as everyone knows that a text file containing the word text only has one word.


So your test file "output.txt" contains only a single word?
Is that what this sentence means?

Please talk us through, in words, what these lines of code are doing (one line at a time)
    while ((s = fgetc(f)) != EOF) 
    {
        
        if ((s = fgetc(f)) != ' ')
            count++;
    }



What would happen differently if I changed this line
if ((s = fgetc(f)) != ' ')
to this?
if (s != ' ')

EDIT - Fix typo

This post has been edited by janotte: 19 December 2009 - 12:30 AM

#5 x2x3i5x

Posted 19 December 2009 - 07:51 PM

Quote

So your test file "output.txt" contains only a single word?
Is that what this sentence means?


Yes, that is correct. When you open the output.txt file I am testing with, you see ONE SINGLE WORD. (I've attached the output.txt file if you still want to see it.)

Using if ((s = fgetc(f)) != ' ') makes the program tell me that output.txt has 4 words, and using if (s != ' ') tells me that output.txt has 3 words. Both of which are incorrect.

As far as I understand, fgetc(f) returns a value for the character it is reading, and s receives it. I think that's right. Correct me if I'm wrong, but that's what I was thinking.
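
One detail worth checking here: fgetc() returns an int, not a char, so that EOF can be told apart from every real character. The sketch below is only an illustration, not code from the thread: count_chars is a made-up helper name, and it assumes the stream was already opened successfully with fopen(). It keeps the result in an int and calls fgetc() exactly once per pass of the loop.

```c
#include <stdio.h>

/* Counts the characters in an already-open stream.
   The result of fgetc() is kept in an int, not a char, so that EOF
   (a negative int) cannot be confused with a real character. */
long count_chars(FILE *fp)
{
    long n = 0;
    int c;                      /* int, because fgetc() returns int */

    while ((c = fgetc(fp)) != EOF)
        n++;                    /* exactly one fgetc() call per character */

    return n;
}
```

For a file holding the word text followed by a newline this returns 5, because the newline is a character too. Calling fgetc() again inside the loop body, as the first version of the program did, reads a second character on every pass, so only every second character is compared with ' '.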


So here is my code once again in full (I've added comments this time, so you can see what I was thinking when I typed the code)

#include "stdafx.h"
#include <stdio.h>

int main()
{
	int count;
	FILE *f; //file pointer declared, so that program can keep track of the file being accessed
	char s; //character array here to hold the characters

	count = 0;

	f=fopen("output.txt","r"); //program will now go to location open up output.txt in read only mode.
	
	while ((s = fgetc(f)) != EOF) //S is a character, and by using fgetc(f), the character array would have characters picked up from the txt file stored in memory and this will continue until EOF is reached.
	{
		
		if (s != ' ')
			count++; //program will check if the character at any one given point and Add one to the counter.
	}

	fclose(f);
	printf ("\n%d words in output.txt.\n", count); //program will print out final number stored in counter.

	return 0;	
}

 


I realize now that there is a problem: it seems that I am just counting the number of characters in the text file. That's not what I want. I wanted the number of words. So now I'm thinking, how do I fix the code so the program recognizes whether or not it is in the middle of a word and doesn't count it again? Meaning, if the file contains the words TEXT TEXT, it would know that the first T is the start of a word (so it's counted as a word), then it skips along until it gets to the third T, which it would recognize as the start of the second word and count once more.
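
The idea in the paragraph above (count a word only when its first character is seen) can be sketched with a small in-word/out-of-word flag. This is a minimal sketch rather than the thread's final answer: count_words and in_word are hypothetical names, the stream is assumed to be open already, and a "word" is assumed to mean any run of non-whitespace characters as judged by isspace().

```c
#include <ctype.h>
#include <stdio.h>

/* Counts words in an open stream: a word starts at every transition
   from whitespace (or the start of the file) to a non-whitespace character. */
long count_words(FILE *fp)
{
    long words = 0;
    int in_word = 0;            /* 0 = between words, 1 = inside a word */
    int c;                      /* int, so EOF can be detected reliably */

    while ((c = fgetc(fp)) != EOF) {
        if (isspace(c)) {
            in_word = 0;        /* any run of spaces or newlines ends the word */
        } else if (!in_word) {
            in_word = 1;        /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

Called on a file containing TEXT TEXT, with or without extra spaces and blank lines between the two words, this returns 2, and an empty file gives 0.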

Thanks for being patient with me and helping me out one step at a time. :)

-- 2x3i5x

This post has been edited by x2x3i5x: 19 December 2009 - 09:11 PM

#6 Fib

  • D.I.C Addict
  • member icon

Reputation: 161
  • View blog
  • Posts: 554
  • Joined: 12-March 09

Re: Counting number of words in a text file...

Posted 19 December 2009 - 08:51 PM

Hi x2x3i5x,

I think you may want to take a look here:

while ((s = fgetc(f)) != EOF) //S is a character, and by using fgetc(f), the character array would have characters picked up from the txt file stored in memory and this will continue until EOF is reached.
	{
	   
		if (s != ' ')
			count++; //program will check if the character at any one given point and Add one to the counter.
	}



In the nested if statement if (s != ' '), it looks to me like you are checking whether s is not a space and, if it's not a space, incrementing count. Wouldn't you want to do the opposite if you are checking for words? You would want to test whether s equals a space; if it does, you have reached the end of a word, and a new word will start on the next character.

So I think something like this would count the words:
while ((s = fgetc(f)) != EOF) //S is a character, and by using fgetc(f), the character array would have characters picked up from the txt file stored in memory and this will continue until EOF is reached.
	{
	   
		if (s == ' ')
			count++; //program will check if the character at any one given point and Add one to the counter.
	}



Now I didn't really test that, so I'm sorry if it doesn't work. But I hope you at least get the idea.

I hope that helps!

#7 x2x3i5x

Posted 19 December 2009 - 09:09 PM

No, it doesn't work. The program now tells me I have no words in my output.txt. But there is the one word "text" in the txt file, so the program gave me the wrong answer.

This post has been edited by x2x3i5x: 19 December 2009 - 09:15 PM

#8 Fib

Posted 19 December 2009 - 09:30 PM

Hmm, I think if you initialize count to 1 then it will be correct. If there is only one word in the text file, the loop never encounters a space; if there are multiple words, it encounters one space between each pair of words (assuming the words are separated by single spaces). So the number of words is the number of spaces plus one, and that's why I think count should be initialized to one.

Try putting more words in the text file and testing it. Then let me know.

#9 x2x3i5x

Posted 19 December 2009 - 10:07 PM

Here is the code right now
#include "stdafx.h"
#include <stdio.h>

int main()
{
	int count;
	FILE *f;
	char s;

	count = 0;

	f=fopen("output.txt","r");
	
	while ((s = fgetc(f))!= EOF) 
	{
		
		if (s  == ' ')
			count++;
	}

	fclose(f);
	printf ("\n# of words: %d\n", count);

	return 0;	
}




After double checking, setting count = 0 worked fine. I don't know why it didn't work before, but oddly now it does. I still have one problem: when I have extra spaces between words, the answer is off by one. If I write a sentence in the text file as normal, the program counts the number of words correctly. How would I prevent the program from miscounting if there is (accidentally) one space too many between words?

Thanks for all the help so far!! :)

This post has been edited by x2x3i5x: 20 December 2009 - 12:15 AM

#10 Omegaclass

  • D.I.C Head

Reputation: 0
  • View blog
  • Posts: 54
  • Joined: 01-November 09

Re: Counting number of words in a text file...

Posted 19 December 2009 - 11:49 PM

I did a problem like this, but with a string. Maybe you could read the file into a string and then use strtok() to do it. Here is my code; maybe it could help you.


//Lab9_p3.cpp : main project file.

#include "stdafx.h"
#include <iostream>
#include <cstring> // strtok
using namespace std;

int main()
{
	const int size = 255;
	int tok = 0;
	char buffer[size];
	char * token; // points to the current token inside buffer
	const char * delims = " !@#$%^&*()_+=-{}[]|\"':;<,>.?/~`"; // characters that separate words

	cout << "\nWrite something interesting and i will tell you the word count\n\n";
	cin.getline(buffer, size);
	token = strtok(buffer, delims); // first call takes the string to split

	while (token != NULL) // strtok returns NULL when no tokens are left
	{
		token = strtok(NULL, delims); // later calls pass NULL to continue in the same string
		tok++;
	}

	cout << "\n" << tok << " words in this string";

	cin.get();
	return 0;

/*Write a program that counts the number of words in a string. A word is encountered
whenever a transition from a blank space to a nonblank character is encountered. Assume
that the string contains only words separated by blank spaces.*/

}

#11 x2x3i5x

Posted 20 December 2009 - 12:12 AM

I'm using C programming NOT C++ ....

#12 x2x3i5x

Posted 23 December 2009 - 02:36 PM

Hi,

I have attempted writing a program that counts number of words in a text file as shown below (the previous one with just counting number of spaces isn't really that great as I found out).

#include "stdafx.h"
#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main()
{
	char string[1000]; //to hold the text read out of file
	char *tokenPtr; //points to the current token inside string
	int count_word; 
	FILE *f;
	
	f = fopen("output.txt", "r");
	count_word = 0;

	fgets(string,1000,f);
	
	tokenPtr = strtok(string, " !@#$%^&*()_+=-{}[]|\"':;<,>.?/~`");// string, delimiters used to specify the boundary between regions

	while(tokenPtr != NULL) // stop at NULL
	{
		tokenPtr = strtok(NULL, " !@#$%^&*()_+=-{}[]|\"':;<,>.?/~`"); 
		count_word++;	
	}
	
	printf("\n# of words: %d\n", count_word);
		
	fclose(f);
	return 0;
}




Two problems ...

1. How do I get the program to correctly print that the # of words is 0 when the text file is empty? I was thinking of either checking whether the first character is EOF (by way of fgetc) or checking whether string[0] is a null, but I somehow can't get it working....

As far as I can tell, the program always tells me (incorrectly) that the text file has 1 word when it is empty.

2. If I have two words separated by a long empty line (see the attached file if you don't understand what I mean), the program reports that I have only one word. How do I fix that?
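
Both problems can be approached by checking what fgets() returns and by calling it in a loop instead of once. A hedged sketch that keeps the strtok() approach: count_words_strtok and DELIMS are names made up for this example, the stream is assumed to be open already, and the delimiter list is the one from the post with tab, carriage return and newline added (an assumption about what should separate words).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical delimiter set: the punctuation list from the post plus
   tab, carriage return and newline, so the '\n' that fgets() keeps at
   the end of a line never becomes part of a token. */
static const char DELIMS[] = " \t\r\n!@#$%^&*()_+=-{}[]|\"':;<,>.?/~`";

long count_words_strtok(FILE *fp)
{
    char line[1000];
    long words = 0;

    /* fgets() returns NULL at end of file, so an empty file never
       enters the loop and the count stays 0. */
    while (fgets(line, sizeof line, fp) != NULL) {
        char *tok = strtok(line, DELIMS);
        while (tok != NULL) {
            words++;                        /* count the token just found */
            tok = strtok(NULL, DELIMS);     /* then look for the next one */
        }
    }
    return words;
}
```

Two caveats of this sketch: a line longer than 999 characters is read in pieces, so a word that straddles the cut is counted twice, and because the apostrophe is a delimiter, a word like don't counts as two.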

Thanks for all help in advance!! :)

This post has been edited by x2x3i5x: 23 December 2009 - 02:40 PM

#13 Guest_c.user*

Posted 23 December 2009 - 06:32 PM


#include <stdio.h>

/* counts words in the file */
int main(void) /* C89 ANSI */
{
	FILE *fp;
	int r;
	size_t n;
	const char *filename = "file";
	
	if ((fp = fopen(filename, "r")) == NULL) {
		fprintf(stderr, "error: file" "\n");
		return 1;
	}	
	
	n = 0;
	while ((r = fscanf(fp, "%*100s")) != EOF)
		n++;
	
	if (ferror(fp) != 0) {
		fprintf(stderr, "error: read file" "\n");
		fclose(fp);
		return 1;
	}	
	
	if (n == 1)
		printf("there is %lu word" "\n", (unsigned long) n);
	else
		printf("there are %lu words" "\n", (unsigned long) n);
	
	fclose(fp);
	return 0;
}


#14 x2x3i5x

Posted 23 December 2009 - 11:04 PM

Your program does work, but I have a few questions:
1. What is size_t n;? If I get rid of that and simply declare n after r in the int declaration, the code still works.
2. What does "%*100s" mean in r = fscanf(fp, "%*100s")?

Thanks!

#15 Guest_c.user*

Posted 24 December 2009 - 12:53 AM

x2x3i5x said:

what is size_t n;?

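an int is only guaranteed to hold values up to 32 767, so if the file can have more words than that, size_t gives you more room (it is an unsigned type; on a typical 32-bit system it goes up to 4294967295, though the exact limit depends on the platform)
you can also use the long int type, which is guaranteed to hold at least 4294967296/2 - 1 words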

x2x3i5x said:

what does "%*100s" mean

* - don't save the string
100 - the maximal length of a word (if you have a word that is longer than 100 characters it will be split into two words, but if you have a word of five characters it will take only those five characters and no more)
a variant with flags is also available, but I thought it could be harder for you
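
To see those two parts of "%*100s" in action, here is a small sketch. count_words_scanf is a hypothetical helper name, and the stream is assumed to be open already. Because of the *, fscanf() assigns nothing and returns 0 for every word it matches; it returns EOF once nothing but whitespace (or nothing at all) is left.

```c
#include <stdio.h>

/* Counts whitespace-separated words with fscanf().
   "%*100s": '*' discards the matched text, 100 caps one match at
   100 characters, 's' matches a run of non-whitespace characters. */
unsigned long count_words_scanf(FILE *fp)
{
    unsigned long n = 0;

    /* A successful match returns 0 (nothing is assigned);
       EOF is returned when the input runs out before a match. */
    while (fscanf(fp, "%*100s") != EOF)
        n++;

    return n;
}
```

A file containing alpha beta yields 2. A single unbroken run of 150 letters also yields 2, because the width of 100 cuts it into a 100-character match and a 50-character match.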

This post has been edited by c.user: 24 December 2009 - 12:54 AM
