Overflow and Underflow in C
Overview
Integer overflow occurs when an integer data type cannot hold the actual value of a result. In C, integer overflow and integer underflow do not raise any errors; the program continues to execute (with incorrect values) as if nothing had happened. This makes overflow errors very subtle and dangerous. We will see several methods to detect these errors in this article.
What is Integer Overflow in C?
Like any other variable, integers are just some bytes of memory. All modern computers support 32-bit and 64-bit integers. There are also smaller data types, like short int, that occupy 16 bits. Since a fixed number of bits is allocated to store an integer, there is naturally a finite range of values it can represent correctly. These limits are defined in the header limits.h.
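As a small illustration, the sketch below prints a few of the limits that limits.h exposes. The function name print_int_limits is hypothetical, and the values printed depend on the platform.

```c
#include <limits.h>
#include <stdio.h>

/* Print the representable range of a few integer types.
   The macros come from limits.h; the numbers vary by platform. */
void print_int_limits(void)
{
    printf("short:     %d to %d\n", SHRT_MIN, SHRT_MAX);
    printf("int:       %d to %d\n", INT_MIN, INT_MAX);
    printf("long long: %lld to %lld\n", LLONG_MIN, LLONG_MAX);
}
```

Calling print_int_limits() from main shows the ranges for the machine the program is compiled on.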
Example of Overflow:
Output:
In the above example, we're trying to add 1 to INT_MAX. By definition, the sum would not fit in the int data type, resulting in Overflow.
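Because adding 1 to INT_MAX in the int type is not something the language defines, one way to see that the result does not fit is to do the addition in a wider type. The sketch below assumes long long is wider than int (true on mainstream platforms); the helper names fits_in_int and int_max_plus_one are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true when value lies inside the range of int. */
bool fits_in_int(long long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}

/* INT_MAX + 1, computed in long long so the addition itself is well defined. */
long long int_max_plus_one(void)
{
    return (long long)INT_MAX + 1;
}
```

Here fits_in_int(int_max_plus_one()) is false, which is exactly the situation the example describes.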
Definition
When we attempt to store a value that cannot be represented correctly by a data type, an integer overflow or underflow occurs. If the value is more than the maximum representable value, the phenomenon is called integer overflow. If the value is less than the least representable value of the data type, it is called integer underflow.
How do Integer Overflows Happen?
The C standard states that a computation involving unsigned operands can never overflow, because a result that the resulting unsigned integer type cannot represent is reduced modulo the number that is one greater than the largest value that the resulting type can represent.
The conclusion from the above statement is that unsigned integers wrap around the maximum value, so the value never crosses the maximum value. This is similar to counting on a clock: 2 hours after 11 pm is 1 am because we "wrap" the actual value (13) around 12. Unlike unsigned integers, signed integers have no defined behavior on overflow. Hence, signed overflow is categorized as undefined behavior.
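The wrap-around rule for unsigned integers can be sketched as follows. The function names are hypothetical; clock_add mirrors the clock analogy for hours numbered 1 to 12.

```c
#include <limits.h>

/* Unsigned addition and subtraction wrap modulo UINT_MAX + 1. */
unsigned int wrap_add(unsigned int a, unsigned int b)
{
    return a + b;
}

unsigned int wrap_sub(unsigned int a, unsigned int b)
{
    return a - b;
}

/* Clock analogy: advance an hour in the range 1..12 by delta hours. */
unsigned int clock_add(unsigned int hour, unsigned int delta)
{
    return (hour - 1 + delta) % 12 + 1;
}
```

For instance, wrap_add(UINT_MAX, 1u) gives 0, wrap_sub(0u, 1u) gives UINT_MAX, and clock_add(11, 2) gives 1.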
Widthness Overflows
Let us start with an example.
Output:
A 32-bit constant (0xcafebabe) is assigned to l, whose type (int) is also 32 bits wide, so no bits are lost here. But when we assign l to s, a 16-bit data type (short int), the value no longer fits. Only the last four hexadecimal digits (the low two bytes, 0xbabe) are kept, and the rest are "truncated". When we assign s to c, an 8-bit data type (char), the same thing happens again: only the last two hexadecimal digits (the low byte, 0xbe) are kept. This is a widthness overflow.
When we attempt to assign a value too large for a datatype, the value gets "truncated". As a result, an incorrect value is stored in the variable.
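The truncation can be sketched with fixed-width unsigned types, where narrowing is fully defined (the value is reduced modulo 2^16 or 2^8). Using uint32_t, uint16_t and uint8_t instead of int, short int and char is a substitution made for this sketch, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Keep only the low 16 bits of a 32-bit value. */
uint16_t low16(uint32_t value)
{
    return (uint16_t)value;
}

/* Keep only the low 8 bits of a 32-bit value. */
uint8_t low8(uint32_t value)
{
    return (uint8_t)value;
}
```

With the constant from the example, low16(0xcafebabe) is 0xbabe and low8(0xcafebabe) is 0xbe.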
- Incorrect type casting: While the line below looks like valid code because the result is stored in a long long, it still overflows because the right side is evaluated in the int type.
This can be prevented by making one of the operands a long long (for example, with a cast or an LL suffix). By doing so, the calculation on the right side is "promoted" to the long long type.
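A minimal sketch of the fix, assuming long long is wider than int: casting one operand makes the whole multiplication happen in long long. The function name mul_wide is hypothetical.

```c
/* Multiply two ints in long long. Casting one operand is enough:
   the other operand is converted to long long before multiplying. */
long long mul_wide(int a, int b)
{
    return (long long)a * b;
}
```

For example, mul_wide(1000000, 1000000) yields 1000000000000, whereas 1000000 * 1000000 evaluated in a 32-bit int would overflow before the assignment ever happens.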
Arithmetic Overflows
Arithmetic overflows occur when the result of a mathematical operation crosses the integer limits (either minimum or maximum).
- Addition: 2000000000 + 2000000000 exceeds INT_MAX. Similarly, (-2000000000) + (-2000000000) is less than INT_MIN.
- Subtraction: 2000000000 - (-2000000000) exceeds INT_MAX. Similarly, (-2000000000) - 2000000000 is less than INT_MIN.
- Multiplication etc...
Integer Overflow Risks
The following section can be skipped if you are a beginner in C language.
Let's look at a few case studies where Integer Overflow played a vital role.
SSH Root exploit: In 2001, researchers identified an integer overflow vulnerability that gives root privileges to the attacker. The severity of this attack is 99! More details here.
In the above snippet, notice the sneaky overflow at line 18. n is a 16-bit variable declared in line 7. Since n is only 16 bits wide, the attacker can send data such that the product is greater than INT16_MAX, and can thus control the argument of the xmalloc function.
20-Year-Old Vulnerability in Mars Rover: Lempel-Ziv-Oberhumer (LZO) is an extremely efficient data compression algorithm, most commonly used for image/video data. An integer overflow bug was found in it twenty years after its publication, affecting a lot of other software that depends on this algorithm. Unfortunately, the Mars Rover operating system is one of them. It is said that once an attacker notices the bug, getting access is relatively easy. More details here.
How to Prevent Integer Overflows
Depending on the language, we might get overflow detection or prevention by default. In the case of C, some external libraries perform safe calculations. GCC also provides a set of functions for the same purpose (discussed below). For now, we will discuss how we can detect overflow and underflow in C mathematically.
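- Addition: to detect overflow in the sum of two operands a and b, check before adding. If b is positive and a is greater than INT_MAX - b, the sum would exceed INT_MAX.
- Subtraction: to detect overflow in the difference a - b, the check is very similar to the above case. If b is negative and a is greater than INT_MAX + b, the difference would exceed INT_MAX.
- Multiplication: to detect overflow in the product of a and b, compare one operand against INT_MAX divided by the other. For two positive operands, the product overflows if a is greater than INT_MAX / b.
- Division: We might think division only reduces the magnitude in all cases. But there is one exception: INT_MIN / -1. The absolute value of INT_MIN is INT_MAX + 1, so the quotient cannot be represented. The product INT_MIN * -1 overflows for the same reason.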
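The checks above can be sketched as predicates that run before the operation, so the overflowing arithmetic is never executed. The function names and the operand names a and b are hypothetical, and only the upper limit (INT_MAX) is covered here.

```c
#include <limits.h>
#include <stdbool.h>

/* True if a + b would exceed INT_MAX. */
bool add_overflows(int a, int b)
{
    return b > 0 && a > INT_MAX - b;
}

/* True if a - b would exceed INT_MAX. */
bool sub_overflows(int a, int b)
{
    return b < 0 && a > INT_MAX + b;
}

/* True if a * b would exceed INT_MAX (the product must be positive). */
bool mul_overflows(int a, int b)
{
    if (a > 0 && b > 0)
        return a > INT_MAX / b;
    if (a < 0 && b < 0)
        return a < INT_MAX / b;   /* dividing by a negative flips the comparison */
    return false;
}

/* True if a / b would exceed INT_MAX; the only such case is INT_MIN / -1. */
bool div_overflows(int a, int b)
{
    return a == INT_MIN && b == -1;
}
```

Division by zero is a separate error and is not covered by div_overflows.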
What is Integer Underflow in C?
Integer Underflow occurs when we attempt to store a value that is "less" than the least representable integer. This is very similar to Overflow but in the opposite direction.
Example of underflow
Output
How do Integer Underflows happen?
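Similar to integer overflows, integer underflows typically "wrap" around the minimum value. For example, 1 hour back from 1 am is 12 am, right? That explains why INT_MIN - 1 returned INT_MAX (2147483647) in the above example. Note that for signed integers this wrap is only what commonly happens in practice; the C standard leaves the behavior undefined.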
Integer Underflow Risks
According to a widely repeated (and since disputed) story about the video game series Civilization, all the leaders have a score for their "aggressiveness". The game developers used 8-bit unsigned integers to represent this score. Mahatma Gandhi is the least aggressive leader in the game, with an "aggressiveness" of 1.
However, if the government in the game changed to democracy, the aggressiveness was decreased by 2. Since an unsigned integer was used to represent this score, 1 wrapped around to 255, and Gandhi hilariously became the "most aggressive leader" in the game.
This behavior could have been prevented by clamping the score to minimum/maximum values as below. Incrementing the largest value (or) decrementing the smallest value should not change the variable's actual value. This technique is called saturation arithmetic.
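A minimal sketch of saturation arithmetic on an 8-bit unsigned score, in the spirit of the example above. The function names are hypothetical and are not taken from any game's source code.

```c
#include <stdint.h>

/* Subtract amount from value, clamping at 0 instead of wrapping. */
uint8_t sat_sub_u8(uint8_t value, uint8_t amount)
{
    return value > amount ? (uint8_t)(value - amount) : 0;
}

/* Add amount to value, clamping at UINT8_MAX instead of wrapping. */
uint8_t sat_add_u8(uint8_t value, uint8_t amount)
{
    return amount > UINT8_MAX - value ? UINT8_MAX : (uint8_t)(value + amount);
}
```

With these helpers, sat_sub_u8(1, 2) is 0, whereas the plain conversion (uint8_t)(1 - 2) wraps to 255.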
How to Prevent Integer Underflows
We can modify the (above) existing conditions to work with Integer Underflows.
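- Addition: to detect underflow in the sum of two operands a and b, check before adding. If b is negative and a is less than INT_MIN - b, the sum would fall below INT_MIN.
- Subtraction: to detect underflow in the difference a - b, the check is very similar to the above case. If b is positive and a is less than INT_MIN + b, the difference would fall below INT_MIN.
- Multiplication: to detect underflow in the product of a and b, compare one operand against INT_MIN divided by the other. The product can only fall below INT_MIN when the operands have opposite signs.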
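The underflow conditions can be sketched in the same pre-check style, this time against INT_MIN. As before, the function names and the operand names a and b are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* True if a + b would fall below INT_MIN. */
bool add_underflows(int a, int b)
{
    return b < 0 && a < INT_MIN - b;
}

/* True if a - b would fall below INT_MIN. */
bool sub_underflows(int a, int b)
{
    return b > 0 && a < INT_MIN + b;
}

/* True if a * b would fall below INT_MIN (operands of opposite sign). */
bool mul_underflows(int a, int b)
{
    if (a > 0 && b < 0)
        return b < INT_MIN / a;
    if (a < 0 && b > 0)
        return a < INT_MIN / b;
    return false;
}
```

Each predicate only performs arithmetic that stays in range, so the check itself cannot underflow.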
How can Integer Overflows or Underflows be Exploited?
The following section can be skipped if you are not familiar with dynamic memory allocation.
Integer overflows are very subtle and often go unspotted in tests. In addition, overflows do not raise any errors, so the program keeps on using the incorrect value. This makes integer overflows and underflows a very dangerous class of bug. Let's look at a few examples of how integer overflows can be exploited.
Coupled with buffer overflow: Integer overflow is often used along with buffer overflow. A buffer is a place in memory where data is stored. All programs should be careful not to write more data than the buffer can hold, because if the data "overflows" the buffer, data outside the buffer is corrupted as well. By overflowing the buffer with crafted data, an attacker can carefully control what exactly gets "corrupted". Effective buffer overflow attacks can lead to Remote Code Execution (RCE).
myfunction accepts an existing array (pointer) and its length as parameters and copies the array into another location. Pretty natural, huh? If len is sufficiently large, the product len * sizeof(int) can overflow, which means an attacker can control how much memory is allocated. If less memory is allocated than required, the for loop writes past the end of the allocation, which might allow arbitrary code to be executed on the victim's machine.
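One way to guard such a copy routine is to reject any length whose byte count cannot be represented before calling malloc. The sketch below is a hypothetical checked variant, not the original myfunction; the name copy_ints_checked and the use of size_t for the length are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Copy len ints from src into a newly allocated array.
   Returns NULL if len * sizeof(int) would overflow size_t or if
   allocation fails. The caller frees the returned pointer. */
int *copy_ints_checked(const int *src, size_t len)
{
    if (len > SIZE_MAX / sizeof(int))
        return NULL;                      /* byte count would wrap around */

    int *dst = malloc(len * sizeof(int));
    if (dst == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
    return dst;
}
```

The division-based test runs before the multiplication, so the size passed to malloc is never a wrapped value.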
Incorrect arithmetic: This is the simplest form of exploit. It can be prevented using safe libraries or mathematically, as discussed below. This attack may not lead to a severe compromise of machines, but it is a serious threat to critical software such as banking systems, space control systems, etc.
Output
In the above snippet, we're simulating a bill generation function with get_total_bill(). Everything seems correct until the item_count is 671299, which makes the bill (2147485501) greater than INT_MAX. Hence, we get an incorrect result as -2147481795, which is quite surprising and rewarding for a retail user.
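A checked version of such a bill calculation might look like the sketch below. It assumes a unit price of 3199, which is the value implied by the figures above (671299 * 3199 = 2147485501), a 32-bit int, and a wider long long; the name get_total_bill_checked is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

#define ITEM_PRICE 3199   /* assumed: inferred from the totals quoted above */

/* Compute the bill in long long and store it in *total only if it
   fits in an int. Returns false when the bill would overflow. */
bool get_total_bill_checked(int item_count, int *total)
{
    long long bill = (long long)item_count * ITEM_PRICE;
    if (bill > INT_MAX || bill < INT_MIN)
        return false;
    *total = (int)bill;
    return true;
}
```

With 671298 items the function stores 2147482302 and returns true; with 671299 items it returns false instead of producing a negative bill.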
Detecting Overflow and Underflow in C
The following C function, int ovfAdd(int* result, int x, int y), reports whether there is an overflow when adding two numbers x and y.
There can be an overflow only if the signs of the two numbers are identical and the sign of the sum is opposite to the signs of the numbers.
If both numbers are positive and the sum is negative, there is an overflow, so we return -1. If both numbers are negative and the sum is positive, there is also an overflow, so we return -1. Otherwise, there is no overflow.
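Computing x + y first and then inspecting its sign relies on signed overflow, which C leaves undefined. The sketch below keeps the same-sign reasoning and the ovfAdd signature but tests the operands before adding. Returning 0 when there is no overflow is an assumption, since only the -1 case is described above.

```c
#include <limits.h>

/* Store x + y in *result and return 0, or return -1 (leaving *result
   untouched) if the sum cannot be represented in an int. */
int ovfAdd(int *result, int x, int y)
{
    /* Both positive: the sum overflows if it would pass INT_MAX. */
    if (x > 0 && y > 0 && x > INT_MAX - y)
        return -1;
    /* Both negative: the sum underflows if it would pass INT_MIN. */
    if (x < 0 && y < 0 && x < INT_MIN - y)
        return -1;

    *result = x + y;
    return 0;
}
```

Operands with opposite signs can never overflow, so they fall straight through to the addition.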
Output:
The following section can be skipped if you are not familiar with GCC Compiler.
In the case of C, the GCC compiler provides a set of functions to detect and prevent overflow/underflow errors. These functions do not differentiate between overflows and underflows.
In the above snippet, we try to add A and B and store the sum in C. If the sum crosses the integer limits, the function returns true (and C holds the wrapped result). Otherwise, the sum is stored in C, and false is returned. For the full set of functions, refer to the GCC manual.
Conclusion
- There is a limit to almost all fixed-size data types in programming languages. In C, crossing those limits causes undefined behaviour for signed integers and wrap-around for unsigned integers.
- Integer Overflow occurs when we attempt to store a value greater than the data type's largest value. Similarly, Integer Underflow occurs when we attempt to store a value that is less than the least value of the data type.
- We can detect these overflows and underflows either mathematically or programmatically.
- GCC has a few built-in functions that perform safe arithmetic and detect overflows.