Common Mistakes in C
Overview
C is a demanding language with strict syntax, and mistakes are easy to make while writing it. These common mistakes in C, also called errors, surface either at compile time or at runtime. No manual guarantees mistake-free code, but a handful of errors come up often enough that we can learn to look out for them and steer clear of them.
Introduction
C is a programming language created in the early 1970s at Bell Labs, where it was used to implement the UNIX operating system. Today, it remains one of the most popular programming languages in the tech world, and knowing how to write concise, correct code in it is a plus on any resume.
C finds application in database systems, graphics packages, word processors, spreadsheets, operating systems, compilers and interpreters, network drivers, assemblers, and more. With such a wide range of applications, C is one of the most influential programming languages ever created.
Despite its immense popularity, C is also notorious for its unforgiving syntax. Even seasoned programmers make these common mistakes, and mistakes that are not caught and fixed can introduce severe security risks into applications.
We will now discuss some common errors in C that we make while coding and how to rectify them.
What are Errors in C?
Errors are mistakes or flaws in a program that cause its behavior to be abnormal. Programming errors are often known as bugs or faults, and debugging is the act of eliminating them.
These errors usually stem from an unintentional mistake by the developer. Generally, errors are classified into five types:
- Syntax errors - where the code violates the grammar of the language, for example a missing semicolon or brace.
- Semantic errors - where a statement is grammatically valid but the compiler cannot make sense of it, for example using a variable that was never declared.
- Runtime errors - where the program compiles but cannot complete an operation while it runs, for example dividing by zero.
- Logical errors - where the meaning of the program isn't what you intended, resulting in undesired outputs.
- Linker errors - where the executable file isn't created correctly or isn't created at all, for example because a called function is never defined.
Errors are discovered during compilation, linking, or execution, and they must be eliminated for the program to run correctly.
Common Errors In C
Below is a curated list of common errors in C to look out for when your program misbehaves.
Mismatched Braces and Parentheses
Every opening brace ('{') must have a matching closing brace ('}'), and the same holds for parentheses and square brackets. This error is less common if you use a good code editor such as VSCode or Sublime Text, which inserts the closing brace automatically when you type '{'.
With a basic editor that lacks auto-indentation and bracket matching, such as Notepad, the chances of a missing or mismatched brace increase drastically.
Using an editor with bracket matching and consistent indentation helps detect and avoid this error.
--> Incorrect Way to Write Code
The above code will give the following error message, as there is a missing brace on Line 5:
--> Correct Way to Write Code
A missing brace is a compile-time error.
Forgetting Semicolon
If you're a coder like me, you too will have ended up in a situation where, after writing at least 30 lines of C, you realize you've forgotten the semicolons at the end of your statements!
Fortunately, code editors like VSCode and Sublime Text flag such errors and bring them to our attention, and some editors can even suggest the missing semicolon. Every statement must end with a semicolon; leaving one out is a compile-time error.
--> Incorrect Way to Write Code
--> Correct Way to Write Code
Using = instead of ==
This error happens in many other programming languages too. We need to remember that = is the assignment operator and == is the comparison operator. In C, '=' assigns a value to a variable. For example, in int c = 10;, the value 10 is stored in the variable c. This is one of the most common errors in C that beginner programmers get stuck on.
The '==' operator compares the value on its left with the value on its right. For example, in the statement if(a == b), the comparison operator checks whether the values of the variables a and b are equal. The result decides whether the lines following this statement are executed.
This is one of the harder errors to identify because the code still compiles: it is a semantic mistake (the statement is valid but does not mean what you intended). Compiler warnings can help (GCC and Clang, for example, can flag an assignment used as a condition when warnings such as -Wall are enabled); otherwise, the mistake only shows up when you trace how the code runs.
--> Incorrect way to write code:
The above code gives the following output:
This is because '=' is an assignment operator that assigns 'a' the value of 'b'. When the statement on line 7 runs, the assignment expression evaluates to the value that was assigned; since that value is nonzero, the if condition is treated as true and the respective code is run.
--> Correct way to write code:
Here, we have corrected the operator from assignment to comparison. The output of the above code is as follows:
Signed Integers in Loops
A signed int in C typically occupies 4 bytes (32 bits) and can hold values ranging from -2,147,483,648 to 2,147,483,647. If a signed int holds 2,147,483,647 and 1 is added to it, the result overflows. Signed overflow is undefined behavior in C; on many platforms the value wraps around to -2,147,483,648, but the standard does not guarantee this. An infinite loop may be created if you use a signed integer and expect it to cover the range of an unsigned integer. For example, if you're using an int variable to loop from 0 to 3000000000:
With 32 bits, the int can only hold values in the range [-2,147,483,648, 2,147,483,647]. Once the counter reaches the upper limit, the next increment overflows and typically wraps to -2,147,483,648. Hence, the value will never reach 3000000000, resulting in an infinite loop.
Not Terminating a String
Strings in C are arrays of characters, and the end of the sequence must be marked explicitly; this is called termination. The terminating character '\0', whose value is zero, is used for this purpose. Forgetting this terminating character leads to errors.
A character array without a terminator is just a collection of characters, not a string. Functions that operate on strings, such as strlen or strcmp, keep reading past the end of the array in search of the '\0', which is undefined behavior.
--> Incorrect Way to Write Code:
Although both arrays hold the same characters, the code may not give the expected output, because the strcmp function keeps reading s2 in search of a null character that doesn't exist.
--> Correct Way to Write Code:
The above code will give the output:
Forgetting a Loop's Exit Condition
Whenever we work with loops, especially while loops, it is important to check that there is a valid exit condition and that the loop has a way to reach it. Otherwise, we end up with an infinite loop that keeps the program running, and consuming CPU time, until it is stopped from outside.
Updating the loop variable inside the loop body is the top priority when working with while loops.
--> Incorrect Way to Write Code:
Since the loop never reaches an exit condition, the above code will give the output:
--> Correct Way to Write Code:
The above code will give the output:
Forgetting to Initialize a Pointer
Every variable in C, not only pointers, should be given a value before it is read. C lets you declare a variable in one step and assign to it in a later one, and a local variable holds an indeterminate value in between.
It would be ideal if all variables were set to zero or NULL automatically, but for local variables that isn't the case. Initializing a variable, and especially a pointer, is the responsibility of the programmer.
The main risk of using an uninitialized pointer is undefined behavior: the pointer holds a garbage address, so dereferencing it may crash the program, silently corrupt unrelated memory, or appear to work.
Let's take an example of the following statement:
A pointer that has not been initialized to anything (not even NULL) is called a wild pointer. It stores an indeterminate address and can produce unexpected results. So, it is advisable to initialize a pointer to NULL when you declare it, or to assign it a valid address before its first use. Just keep in mind: don't let your pointer go wild :)
Manipulating Pointers in Functions
Like every other argument in C, a pointer is passed to a function by value: the function receives a copy of the address. The function can change the data the pointer refers to, but changes to the pointer itself are not visible to the caller. This may seem like a strange notion, but understanding how it works will help you avoid trouble:
- A pointer holds a memory address. A function that receives it can read and write through that address, but reassigning or incrementing its copy leaves the caller's pointer unchanged.
- To change the caller's pointer, you must pass the address of the pointer, that is, a pointer to a pointer. This is the proper solution, but it adds some complexity to the code.
--> Incorrect way to Manipulate pointers in Functions:
The above program will give the output:
APPLEA
Let's understand what is happening here.
- We have a string message declared as a pointer in the main() function.
- The address stored in this pointer is passed to the display() function, which steps through it to display our message.
- The putchar() function displays the characters of our message on the terminal, one at a time.
- However, we see that the output is APPLEA instead of APPLE.
Why is this so?
The answer is simple. Only a copy of the address held in message is passed to the display() function in the above program. The pointer variable itself remains in the main() function. Any change display() makes to its copy is lost when it returns, so message still holds the address of the first character. Hence, when putchar() is called in main() after display() returns, it displays the A at the start of the message.
To avoid this problem, we need to be careful in manipulating pointers. One way to fix the above problem is as follows:
The above code will give the following output:
APPLE
Here, display() receives ptr, the address of the pointer message. Inside display(), *ptr is the pointer itself (a memory address, not a character), and **ptr is the character stored at that address. That is why the function uses **ptr to access each character of the message.
In the display() function, the (*ptr)++ expression advances the address stored in *ptr. The main difference between the erroneous code and this new code is that the caller's pointer is changed inside the display() function. When the function returns, the address stored in message refers to the \n character, which is then displayed in the output.
Writing Pointers to File
When a C program works with files, accessing a file through a file pointer can also cause errors in a few cases: reading a file that doesn't exist, writing to a restricted or read-only file, using a file without opening it, passing the wrong mode string when opening it, and so on. These errors show up at runtime. Here we will explore a few examples to get an insight into this kind of error.
--> Incorrect Way to Write Code:
The output of the above code is as follows:
A segmentation fault occurs when the program tries to access an illegal memory location. The same kind of error will appear if we try to open a file that doesn't exist and then use the resulting NULL pointer without checking it.
--> Correct Way to Write Code:
Here is another example.
--> Incorrect Way to Write Code:
The write operation in this code has no effect because the file is opened in read mode.
--> Correct Way to Write Code:
The above code gives the output in myFile.txt:
scanf() Blunders in C
scanf() stores what it reads into variables that belong to the caller. Because C passes arguments by value, scanf() must be given the address of each variable, and the ampersand (&) operator yields that address. For a variable named number, &number is the place in memory where scanf() stores the value it reads. Omitting the ampersand passes the variable's current value instead of its address, which is undefined behavior and often crashes the program.
For example,
--> Incorrect Way to Write Code:
The output of the above code is as follows:
--> Correct Way to Write Code:
Here, the ampersand(&) is placed in the correct position.
Reading Array Out of Bounds
Arrays are collections of elements stored in consecutive memory locations, and the program accesses these locations through indexing. C does not provide any protection against invalid indexes. So when a program accesses an invalid index, for example index 7 in an array of length five, anything may happen: the program may crash, read garbage, or corrupt other data. This falls under what is officially called 'Undefined Behaviour' (UB), which results from executing code whose behavior the C standard does not define.
--> Incorrect Way to Read Arrays:
The above code gives the output:
We can see that arr[10] reads a garbage value.
The only reliable way to avoid this error is to stay within the array's limits.
Conclusion
In this article, we have discussed:
- What mistakes and errors are
- Eleven common mistakes programmers tend to make in C.
- The ways to avoid/remove these errors.
The best way to avoid these common errors in C is through experience. Still, even that is no guarantee, as some of the best and most seasoned software developers also make the occasional mistake. Happy coding!