Dealing with Ordinary Characters
In this lesson, we will understand how the scanf() function handles regular characters in the format string.
Let’s go back to the concept of the format string. The format string is the first argument that we pass to the scanf function, and it contains format specifiers. So far, we have seen the most common form of a scanf call, where the format specifiers are placed right next to each other, like this:
scanf("%d%f%d%d", &p, &q, &r, &s);
However, it’s important to note that this is not the only way to format the format string. We can also include whitespace characters (like tabs or newlines) and regular characters in the format string, similar to printf. But how does scanf handle these characters? Let’s explore this question now.
A whitespace character in the format string does not stand for one particular character in the input. Whenever scanf encounters a whitespace character in the format string, it keeps reading and discarding whitespace characters from the input until it finds a non-whitespace character.
Consider the following call to the scanf function:
scanf("%d %d", &p, &q);
And the user input is:
10   20
At the beginning, the buffer holds the input exactly as typed: 10, three whitespace characters, and 20.
First, scanf looks for an integer and stores it in the variable ‘p’. Since the user entered the number 10, scanf stores it in ‘p’. After fetching the input 10, the buffer holds the three whitespace characters followed by 20.
The next character in the format string is a whitespace character. So, scanf keeps reading whitespace characters from the input (in this example, there are a total of 3 whitespace characters) until it encounters a non-whitespace character, and it puts that non-whitespace character back into the buffer. This means that scanf discards all 3 whitespace characters in the input, and when it comes across the digit 2, it simply puts it back. After these steps, the buffer starts with 20.
Finally, after reading and discarding the whitespace characters, scanf reads the integer 20 from the buffer and stores it in the variable ‘q’.
It’s important to note that including a whitespace character in the format string doesn’t require the user to enter a whitespace character. In addition, %d skips any leading whitespace on its own. This means that scanf("%d %d", &p, &q); and scanf("%d%d", &p, &q); are equivalent.
In conclusion, a whitespace character in the format string matches any number of whitespace characters in the input, including none.
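This equivalence can be checked without typing any input by applying both format strings to the same text with sscanf, which follows the same matching rules as scanf but reads from a string. The following is only a minimal sketch; the helper names parse_spaced and parse_packed are made up for this illustration, and the return value is the number of values that were assigned.

```c
#include <stdio.h>

/* Hypothetical helper: a space separates the two format specifiers. */
int parse_spaced(const char *input, int *p, int *q)
{
    return sscanf(input, "%d %d", p, q);
}

/* Hypothetical helper: the two format specifiers are placed side by side. */
int parse_packed(const char *input, int *p, int *q)
{
    return sscanf(input, "%d%d", p, q);
}
```

Called with the input "10   20", both helpers return 2 and store 10 and 20. The same holds when the two numbers are separated by a tab or a newline.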
What about ordinary characters like ‘a’, ‘b’, ‘c’, ‘d’, ‘@’, ‘!’, ‘/’ etc.? When the scanf function encounters a regular character in the format string, it compares that character with the next character entered by the user. If there is a match, it discards that character and continues reading the input. If there is no match, it puts the user-entered character back into the buffer and stops without processing the rest of the format string.
Let’s consider an example:
scanf("%d@%d", &p, &q);
And the user enters:
10@20
Now, let’s look at the format string "%d@%d". The first format specifier is %d, so scanf reads an integer (10) and stores it in the variable ‘p’. The next character in the format string is ‘@’, so scanf matches the ‘@’ character with the next character in the input sequence. It’s a match. Therefore, it continues with the next item in the format string, which is %d. This means the next input must be an integer, and fortunately, it is the integer 20. scanf reads this integer and stores it in the variable ‘q’. In this way, the entire input sequence is read.
Note that the ‘@’ character is skipped by scanf. So, we can conclude that if the regular characters in the format string match the characters in the input sequence, scanf will skip those characters.
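To see the literal match in isolation, the same format string can be applied to a string with sscanf. This is a minimal sketch, not a definitive implementation; the helper name parse_at_pair is hypothetical, and the return value is the number of values that were assigned.

```c
#include <stdio.h>

/* Hypothetical helper: reads two integers separated by a literal '@'. */
int parse_at_pair(const char *input, int *p, int *q)
{
    /* The '@' in the format string must match a '@' in the input;
       a matched '@' is consumed and not stored anywhere. */
    return sscanf(input, "%d@%d", p, q);
}
```

For the input "10@20" the helper returns 2, with 10 stored through p and 20 stored through q.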
Now, let’s consider the same scanf call, but with a different input sequence:
10 @ 20
Hence, the buffer holds 10, a whitespace character, ‘@’, another whitespace character, and 20.
After reading the integer 10, scanf expects the ‘@’ character in the input sequence because it is the next character in the format string after ‘%d’. However, in the input sequence, after the value 10, the next character is a whitespace character, not ‘@’. Therefore, scanf puts the whitespace character back into the buffer and stops.
The buffer now starts with the whitespace character that precedes ‘@’, followed by ‘@’, another whitespace character, and 20.
This remaining input will be read by the next scanf call.
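The point where scanning stops can be made visible with the %n directive, which stores the number of characters consumed so far and does not count as an assigned value. The sketch below is an illustration under that assumption: it uses sscanf on a string instead of the real input buffer, it adds %n to the format string purely for inspection, and the helper name rest_after_first_int is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: applies "%d@%d" to `input` and returns a pointer to
   the first character that the leading %d did not consume.
   `*assigned` receives the return value of sscanf. */
const char *rest_after_first_int(const char *input, int *p, int *q, int *assigned)
{
    int consumed = 0; /* stays 0 if the first %d fails */

    /* %n records the position right after the first integer. */
    *assigned = sscanf(input, "%d%n@%d", p, &consumed, q);
    return input + consumed;
}
```

For the input "10 @ 20" the helper reports one assigned value and returns a pointer to " @ 20", which is the part a later call would see.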
In order to accept any number of whitespace characters after 10, we need to change the format string in the scanf call as follows:
scanf("%d @%d", &p, &q);
After ‘%d’, the next character is a whitespace character, indicating that zero or more whitespace characters can be accepted from the user. Then, the next character must be ‘@’, followed by an integer. Note that whitespace characters are also allowed after ‘@’ in the input sequence, because %d skips leading whitespace on its own; a whitespace character placed before %d in the format string therefore changes nothing. So, scanf("%d @%d", &p, &q); and scanf("%d @ %d", &p, &q); are equivalent.
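A quick way to try the relaxed format string against several spellings of the same input is again sscanf. This is a rough sketch only, and the helper name parse_loose is an assumption made for the example.

```c
#include <stdio.h>

/* Hypothetical helper: accepts optional whitespace on both sides of '@'. */
int parse_loose(const char *input, int *p, int *q)
{
    /* The space before '@' matches zero or more whitespace characters;
       the second %d skips any whitespace that follows '@' by itself. */
    return sscanf(input, "%d @%d", p, q);
}
```

The inputs "10@20", "10 @ 20" and "10   @20" all make the helper return 2, with 10 and 20 stored in each case.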
Let’s examine the following program:
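#include <stdio.h>

int main(void)
{
    int p, q;

    printf("Enter the input: ");
    scanf("%d@%d", &p, &q); /* reads p; stops early if '@' does not come right after it */
    scanf(" @%d", &q);      /* skips whitespace, expects '@', then reads q */
    printf("The values of p and q are %d and %d\n", p, q);
    return 0;
}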
Suppose the user provides the input “10 @ 30”. The buffer then holds 10, a whitespace character, “@”, a whitespace character, 30, and “\n”.
The “\n” represents the newline character, which is added because the user presses enter after entering the input.
Now, let’s consider the first call to scanf. The first item in the format string is the format specifier “%d”, so scanf successfully fetches the integer 10 and stores it in the variable p. After fetching the value 10, the buffer holds a whitespace character, “@”, a whitespace character, 30, and “\n”.
There is a whitespace character before the “@” in the buffer, and scanf reads that character. However, the next character in the format string is “@”, which does not match. Therefore, scanf puts the whitespace character back and stops. The buffer now starts with the single whitespace character that precedes “@”.
Now, the next scanf call (scanf(" @%d", &q);) takes over. It looks for zero or more whitespace characters in the buffer because the format string has a whitespace character before the “@”. There is a whitespace character in the buffer, so scanf picks it up and simply discards it. The buffer now starts with “@”.
The next character in the format string is “@”, and fortunately, after the whitespace character has been discarded, “@” is the next character in the buffer. The scanf function picks it up from the buffer and discards it. The buffer now holds a whitespace character, 30, and “\n”.
Next, scanf expects an integer because the next item in the format string is the format specifier “%d”. The next character in the buffer does not have to be a digit; it can be a whitespace character as well, because “%d” discards leading whitespace and continues until it encounters the integer. Therefore, the second scanf call successfully reads the integer 30 and stores it in the variable q.
So, the final output is
The values of p and q are 10 and 30
This program demonstrates how the second scanf call takes control of the buffer when the first scanf fails to read the complete input sequence. However, please note that if you remove the whitespace character before the “@” in the second scanf call, it will fail to read the integer 30. In that case, scanf expects “@” to be the first character in the buffer, but the first character in the buffer is a whitespace character. This is why a whitespace character is added before “@” in the format string of the second scanf call.
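The effect of the leading whitespace character in the second call can be reproduced on a string. The sketch below is an approximation rather than the program itself: it uses sscanf instead of scanf, it adds %n to the first format string to learn where the first call stopped, and it assumes the first call stops at the ‘@’ mismatch as in the example input. The helper name second_call_result and its leading_space parameter are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper that mimics the two scanf calls on a string.
   `leading_space` selects " @%d" (non-zero) or "@%d" (zero) for the
   second call. Returns the number of values the second call assigned. */
int second_call_result(const char *input, int leading_space, int *p, int *q)
{
    int consumed = 0;

    /* First call: %n records how many characters were consumed by the
       first %d, standing in for the position of the real input buffer. */
    if (sscanf(input, "%d%n@%d", p, &consumed, q) < 1)
        return 0; /* no leading integer, so the sketch gives up here */

    /* Second call: continue from the first unread character. */
    if (leading_space)
        return sscanf(input + consumed, " @%d", q);
    return sscanf(input + consumed, "@%d", q);
}
```

With the input "10 @ 30\n", the variant with the leading whitespace character returns 1 and stores 30 through q, while the variant without it returns 0 and leaves q untouched.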