Counting the number of words in a string is a common task in text processing and programming exercises. It helps beginners practice string handling, loops, and conditional logic in C. By learning this program, you will understand how to iterate through characters, identify word boundaries, and handle different spacing scenarios.
This tutorial provides a complete C program to count words in a string, explains each line in detail, and presents alternative approaches. By the end, you will understand both the logic and implementation required for counting words efficiently.
What Is A String?
A string in C is an array of characters ending with a null character, \0. Words are sequences of characters separated by spaces, tabs, or newline characters. To count words accurately, we need to detect transitions from whitespace to non-whitespace characters.
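As a small illustration of the null terminator, the sketch below walks a character array until it reaches \0. The function name manual_length is hypothetical and exists only for this example; in real code, strlen() from string.h does the same job.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: counts the characters that come
   before the terminating '\0', which is how C marks the end of a string. */
size_t manual_length(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

For the literal "Hi", manual_length() returns 2, while sizeof("Hi") is 3 because the array also stores the terminator.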
This program also teaches beginners to carefully handle edge cases, such as multiple spaces, tabs, or leading/trailing spaces. By mastering word counting, you gain foundational skills for text analysis, input processing, and file handling in C.
Understanding the Problem
The task is to read a string from the user and count the number of words. A word is defined as a continuous sequence of non-whitespace characters, delimited by spaces, tabs, or newline characters. Programs 4 and 5 refine this definition by ignoring tokens that consist only of punctuation.
Key points to consider include ignoring multiple consecutive spaces, correctly counting the first and last words, and safely handling input using fgets() to prevent buffer overflow.
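One way to package the safe-input step is sketched below. The helper name read_line and its FILE * parameter are assumptions made for this example, not part of the programs that follow. It relies on two documented behaviors: fgets() never writes more than size bytes, and it returns NULL on end-of-file or error.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf` (capacity `size`).
   Returns 1 on success and 0 on end-of-file, error, or a non-positive size. */
int read_line(FILE *in, char *buf, int size) {
    if (size <= 0) {
        return 0;
    }
    if (fgets(buf, size, in) == NULL) {
        buf[0] = '\0';               /* leave a valid empty string behind */
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';  /* drop the newline, if one was read */
    return 1;
}
```

A caller would typically write read_line(stdin, str, sizeof(str)) and treat a return value of 0 as empty input.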
Program 1: Using a Flag to Track Words
The simplest method is to use a flag that tracks whether the current character is inside a word.
#include <stdio.h>

int main(void) {
    char str[200];
    int i, words = 0, inWord = 0;

    printf("Enter a string: ");
    // fgets() returns NULL on end-of-file or error; treat that as empty input
    if (fgets(str, sizeof(str), stdin) == NULL) {
        str[0] = '\0';
    }

    for (i = 0; str[i] != '\0'; i++) {
        if (str[i] == ' ' || str[i] == '\n' || str[i] == '\t') {
            inWord = 0;   // whitespace ends the current word
        } else if (inWord == 0) {
            inWord = 1;   // first character of a new word
            words++;
        }
    }

    printf("Number of words: %d\n", words);
    return 0;
}
This program reads the input string using fgets(). It iterates over each character and uses a flag, inWord, to track whether the loop is currently inside a word. When a space, tab, or newline is encountered, the flag resets to 0. When a non-space character is found while inWord is 0, it marks the start of a new word and increments the word count. This method accurately handles multiple spaces or tabs between words.
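The same flag logic can be moved into a function so it is easy to try on edge cases. This is a minimal sketch; the name count_words_flag is hypothetical and chosen only for this example.

```c
/* Hypothetical reusable form of the flag method from Program 1.
   Only ' ', '\t', and '\n' are treated as separators. */
int count_words_flag(const char *s) {
    int words = 0, inWord = 0;
    for (int i = 0; s[i] != '\0'; i++) {
        if (s[i] == ' ' || s[i] == '\t' || s[i] == '\n') {
            inWord = 0;          /* a separator ends the current word */
        } else if (!inWord) {
            inWord = 1;          /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

With this function, an empty string and a string of only spaces both give 0, "  hello   world  \n" gives 2, and "one\ttwo\nthree" gives 3.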
Program 2: Using the isspace() Function
C provides the isspace() function in ctype.h to check for spaces, tabs, and newlines. Using this function simplifies the code.
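#include <stdio.h>
#include <ctype.h>

int main(void) {
    char str[200];
    int i, words = 0, inWord = 0;

    printf("Enter a string: ");
    if (fgets(str, sizeof(str), stdin) == NULL) {
        str[0] = '\0';   // no input: treat as an empty string
    }

    for (i = 0; str[i] != '\0'; i++) {
        // Cast to unsigned char: passing a negative char to isspace() is undefined
        if (isspace((unsigned char)str[i])) {
            inWord = 0;
        } else if (inWord == 0) {
            inWord = 1;
            words++;
        }
    }

    printf("Number of words: %d\n", words);
    return 0;
}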
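Here, isspace() automatically checks for ' ', '\t', '\n', and the other standard whitespace characters ('\v', '\f', and '\r'). This approach is cleaner and reduces the number of manual comparisons. Because isspace() expects a value representable as unsigned char (or EOF), a plain char should be cast to unsigned char before the call.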
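The difference from the manual comparisons shows up with characters such as '\r', '\v', and '\f'. The sketch below wraps the isspace() logic in a function; the name count_words_isspace is hypothetical, and the behavior described assumes the default "C" locale.

```c
#include <ctype.h>

/* Hypothetical function form of the isspace() method.
   The cast to unsigned char avoids undefined behavior for negative char values. */
int count_words_isspace(const char *s) {
    int words = 0, inWord = 0;
    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) {
            inWord = 0;          /* any whitespace character ends a word */
        } else if (!inWord) {
            inWord = 1;          /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

For input such as "a\rb", this version reports two words, whereas the three explicit comparisons in Program 1 would see a single word. That matters for text with Windows-style "\r\n" line endings.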
Program 3: Using strtok() for Word Tokenization
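C provides the strtok() function to split a string into tokens using delimiters such as spaces or tabs. This method is convenient, but it overwrites the delimiter that ends each token with '\0', so the input buffer is modified in place. It also keeps internal state between calls, so it is not reentrant.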
#include <stdio.h>
#include <string.h>

int main(void) {
    char str[200];
    char *token;
    int words = 0;

    printf("Enter a string: ");
    if (fgets(str, sizeof(str), stdin) == NULL) {
        str[0] = '\0';   // no input: treat as an empty string
    }

    // Remove trailing newline from fgets
    str[strcspn(str, "\n")] = '\0';

    token = strtok(str, " \t");
    while (token != NULL) {
        words++;
        token = strtok(NULL, " \t");
    }

    printf("Number of words: %d\n", words);
    return 0;
}
This program splits the input string using spaces and tabs as delimiters. Each token represents a word. The strtok() function modifies the original string, so it's important to keep that in mind if the original string is needed later. This approach is very readable and is convenient when each word needs further processing.
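If the original string must stay intact, one option is to scan it with strspn() and strcspn() instead of strtok(). This is a sketch of that swapped-in technique, not the method used in Program 3, and the function name count_words_const is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts delimiter-separated words without writing to `s`.
   strspn() measures a run of delimiters; strcspn() measures a run of non-delimiters. */
size_t count_words_const(const char *s, const char *delims) {
    size_t words = 0;
    s += strspn(s, delims);          /* skip leading delimiters */
    while (*s != '\0') {
        words++;
        s += strcspn(s, delims);     /* skip over the word itself */
        s += strspn(s, delims);      /* skip the delimiters after it */
    }
    return words;
}
```

Because the parameter is const char *, the function can also be called on string literals, which must never be passed to strtok().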
Program 4: Ignoring Standalone Punctuation
In natural text, punctuation marks such as ".", ",", "!", or "?" often appear separated by spaces. If we count them directly as tokens, they increase the word count incorrectly. For example, in the string "Hello , world !", both "," and "!" would be counted as words in Program 3.
To solve this, we extend the strtok() approach by adding a helper function that checks whether a token is only punctuation. If the token is made entirely of punctuation characters, it is ignored. Any token that contains at least one non-punctuation character, such as a letter or digit, is counted as a word.
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdbool.h>

// Function to check if a token is only punctuation
bool is_only_punctuation(const char *token) {
    for (int i = 0; token[i] != '\0'; i++) {
        if (!ispunct((unsigned char)token[i])) {
            return false; // Found a non-punctuation character
        }
    }
    return true; // All characters were punctuation
}

int main(void) {
    char str[200];
    char *token;
    int words = 0;

    printf("Enter a string: ");
    if (fgets(str, sizeof(str), stdin) == NULL) {
        str[0] = '\0'; // No input: treat as an empty string
    }

    // Remove trailing newline from fgets
    str[strcspn(str, "\n")] = '\0';

    token = strtok(str, " \t");
    while (token != NULL) {
        if (!is_only_punctuation(token)) {
            words++;
        }
        token = strtok(NULL, " \t");
    }

    printf("Number of words: %d\n", words);
    return 0;
}
The program still uses strtok() to split the input string into tokens by spaces or tabs. To avoid counting punctuation marks as words, a helper function is_only_punctuation() is introduced. This function checks each token and returns true only if every character is punctuation. In the main loop, tokens are counted as words only if they are not punctuation alone.
For example:
Input: Hello , world !
Output: Number of words: 2
This approach is useful when dealing with text where punctuation marks may appear separated but should not be treated as words. It provides a more accurate word count compared to Program 3.
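A slightly stricter rule is to count a token only when it contains at least one letter or digit. The sketch below shows that variant; the names contains_alnum and count_alnum_words are hypothetical, and the behavior described assumes the default "C" locale.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical alternative rule: a token is a word if it contains
   at least one letter or digit. */
static bool contains_alnum(const char *token) {
    for (; *token != '\0'; token++) {
        if (isalnum((unsigned char)*token)) {
            return true;
        }
    }
    return false;
}

/* Counts whitespace-separated tokens that contain a letter or digit.
   Like Program 4, this modifies the buffer through strtok(). */
int count_alnum_words(char *s) {
    int words = 0;
    for (char *t = strtok(s, " \t\n"); t != NULL; t = strtok(NULL, " \t\n")) {
        if (contains_alnum(t)) {
            words++;
        }
    }
    return words;
}
```

For "Hello , world !" both rules give 2. They differ for tokens that are neither punctuation nor alphanumeric in the "C" locale, such as a standalone non-ASCII dash encoded in UTF-8: is_only_punctuation() returns false for it, so Program 4 counts it, while this variant skips it.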
Program 5: Recursive Word Counting Ignoring Standalone Punctuation
This program counts words recursively, ignoring tokens that consist only of punctuation. First, the input string is split into tokens using strtok() and stored in an array. The recursive function countWordsRecursive() processes each token one by one. If a token contains only punctuation, it is skipped; otherwise, it is counted as a word. The function continues until all tokens are processed, returning the total word count.
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdbool.h>

#define MAX_TOKENS 100

bool is_only_punctuation(const char *token) {
    for (int i = 0; token[i] != '\0'; i++) {
        if (!ispunct((unsigned char)token[i])) return false;
    }
    return true;
}

int countWordsRecursive(char *tokens[], int totalTokens, int index) {
    if (index >= totalTokens) return 0;   // base case: no tokens left
    if (is_only_punctuation(tokens[index]))
        return countWordsRecursive(tokens, totalTokens, index + 1);
    return 1 + countWordsRecursive(tokens, totalTokens, index + 1);
}

int main(void) {
    char str[200];
    char *tokens[MAX_TOKENS];   // pointers into str: long tokens cannot overflow a fixed slot
    int tokenCount = 0;

    printf("Enter a string: ");
    if (fgets(str, sizeof(str), stdin) == NULL) {
        str[0] = '\0';          // no input: treat as an empty string
    }
    str[strcspn(str, "\n")] = '\0';

    char *ptr = strtok(str, " \t");
    while (ptr != NULL && tokenCount < MAX_TOKENS) {
        tokens[tokenCount++] = ptr;
        ptr = strtok(NULL, " \t");
    }

    int wordCount = countWordsRecursive(tokens, tokenCount, 0);
    printf("Number of words: %d\n", wordCount);
    return 0;
}
This recursive approach demonstrates an elegant alternative to iteration while maintaining accurate word counting.
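The recursion can also run directly on the string, without a token array. The following is a sketch of that idea rather than a replacement for Program 5; the function name count_words_rec is hypothetical, and it treats every isspace() character as a separator.

```c
#include <ctype.h>

/* Hypothetical sketch: counts words by recursing on the rest of the string.
   A token made only of punctuation contributes 0; any other token contributes 1. */
int count_words_rec(const char *s) {
    while (isspace((unsigned char)*s)) {
        s++;                                /* skip leading whitespace */
    }
    if (*s == '\0') {
        return 0;                           /* base case: nothing left */
    }

    int isWord = 0;
    while (*s != '\0' && !isspace((unsigned char)*s)) {
        if (!ispunct((unsigned char)*s)) {
            isWord = 1;                     /* token has a non-punctuation character */
        }
        s++;
    }
    return isWord + count_words_rec(s);     /* one recursive call per token */
}
```

The recursion depth equals the number of tokens, so for very long input an iterative loop is the safer choice.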
Performance Comparison
Method | Time Complexity | Space Complexity | Notes |
---|---|---|---|
Flag Method | O(n) | O(1) | Simple, handles multiple spaces efficiently |
isspace() | O(n) | O(1) | Cleaner, handles all whitespace characters |
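strtok() | O(n) | O(1) | Readable; overwrites delimiters in the input string and is not reentrant |
strtok() with punctuation filter | O(n) | O(1) | Same as strtok(), plus one extra pass over each token to skip standalone punctuation |
Recursive strtok() | O(n) | O(n) | Ignores standalone punctuation; stores the tokens in an array and uses one stack frame per token |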
All methods have linear complexity with respect to the string length. The choice depends on readability, whether the original string must be preserved, and whether recursion is acceptable for the input size.
Which Method Should You Use?
For beginners, the flag method is easiest to understand. It explicitly shows how word boundaries are detected using spaces, tabs, and newlines.
The isspace() method is recommended for cleaner and more maintainable code. It automatically considers all types of whitespace characters.
The strtok() method is ideal if you need to process words individually after counting, such as storing them in an array or performing further operations.
For most everyday uses, the isspace() method strikes the best balance between readability and correctness.
FAQs
1. Can this program handle empty strings?
Yes. If the input string is empty or contains only spaces, the word count will be zero.
2. Can it handle multiple spaces or tabs between words?
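Yes. The flag method, isspace(), and strtok() all treat a run of consecutive separators as a single boundary between words.
3. Which method is fastest?
All methods run in time linear in the string length, so the differences are small for typical input. The flag and isspace() methods make a single pass without modifying the string or storing tokens, so they have the least overhead. The recursive version uses the most memory.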
Conclusion
Counting words in a string is a foundational exercise in C programming. It teaches string handling, loop logic, and conditional checks, which are critical for text processing tasks.
You learned three main approaches: the flag method, the isspace() method, and strtok() tokenization, along with two strtok() variants that ignore standalone punctuation, one iterative and one recursive. Each has advantages depending on readability, maintainability, and specific use cases. By practicing these programs, you gain a better understanding of string processing in C.