Thread-Safe Code in Java
Overview
In Java, multithreading helps improve the performance of applications by allowing them to perform multiple tasks concurrently. However, in a multithreaded environment, ensuring thread safety is crucial. A thread-safe program guarantees that the output is predictable, regardless of the number of threads executing the program and the order in which they do so.
Example of Thread-safe Code in Java
Let us understand thread safety with the classic example of an application that adds money to a bank account. Let us first design an Account class with a balance field and an addMoney method.
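A minimal sketch of such an Account class could look like the following. The names balance and addMoney come from the discussion in this article; the constructor and the getBalance accessor are assumptions added only so that the result can be inspected. The class is deliberately not thread-safe.

```java
// Deliberately NOT thread-safe: this is the starting point for the discussion.
class Account {
    private int balance;

    // Hypothetical constructor: the opening balance is an assumption.
    Account(int openingBalance) {
        this.balance = openingBalance;
    }

    // Read-modify-write on a shared field; this is the critical section.
    void addMoney(int amount) {
        this.balance += amount;
    }

    // Hypothetical accessor, added so the result can be inspected.
    int getBalance() {
        return this.balance;
    }
}
```

Used from a single thread, this class behaves exactly as expected; the problems only appear once several threads share one instance.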
Now let's say two concurrent transfers are made to the same account, one adding 100 and the other adding 500.
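The two concurrent transfers might be set up roughly as below. This is only a sketch: the class name RaceDemo and the helper runTransfers are illustrative, and the nested Account repeats the unsynchronized design so that the example is self-contained.

```java
class RaceDemo {
    // Unsynchronized account: addMoney is a plain read-modify-write.
    static class Account {
        private int balance;
        void addMoney(int amount) { this.balance += amount; }
        int getBalance() { return this.balance; }
    }

    // Starts two threads that deposit 100 and 500 into one shared account
    // and returns the balance after both have finished.
    static int runTransfers() {
        Account account = new Account();
        Thread a = new Thread(() -> account.addMoney(100), "A");
        Thread b = new Thread(() -> account.addMoney(500), "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return account.getBalance();
    }

    public static void main(String[] args) {
        // Often prints 600, but 100 and 500 are also legal outcomes.
        System.out.println("Final balance: " + runTransfers());
    }
}
```

Because each thread performs only one deposit, the faulty outcomes may be rare on a given machine. That rarity is what makes this kind of bug hard to reproduce.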
We would expect the final balance of the account to be 600; however, this is not always the case. Assuming the account starts at 0, the final balance could be 100, 500, or 600. How, you ask?
This is because the addMoney operation is not atomic.
The addition of money involves a sequence of 3 operations:
- get the current balance
- add amount to the balance value fetched in step 1
- update the current balance
For simplicity, assume that there are two threads A and B in the same JVM, which perform the additions of 100 and 500 respectively.
Since steps 1 to 3 are not atomic, the steps of the two threads can interleave in any order. If thread B reads the balance before thread A writes its update (or vice versa), one of the two updates is lost and the final balance is wrong.
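To make one such interleaving concrete, the sketch below replays it by hand in a single thread, so the outcome is deterministic. It assumes the balance starts at 0; the class and variable names are illustrative only.

```java
class LostUpdateDemo {
    // Replays one unlucky interleaving of threads A and B step by step.
    static int replayInterleaving() {
        int balance = 0;             // shared balance, assumed to start at 0

        int readByA = balance;       // step 1, thread A: reads 0
        int readByB = balance;       // step 1, thread B: also reads 0
        int newByA = readByA + 100;  // step 2, thread A: computes 100
        int newByB = readByB + 500;  // step 2, thread B: computes 500
        balance = newByA;            // step 3, thread A: balance is now 100
        balance = newByB;            // step 3, thread B: overwrites it with 500

        return balance;              // thread A's deposit has been lost
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving()); // prints 500
    }
}
```

Swapping the last two writes gives 100 instead, and only an interleaving in which one thread finishes all three steps before the other starts gives 600.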
How can we avoid this? We need to either make the addMoney operation (this.balance += amount) atomic, or ensure that only one thread at a time can execute it, even if concurrent threads are scheduled.
This ensures that if thread A performs step 1 first, thread A also completes step 3 before thread B enters the critical section of the code. (A critical section is the part of a program that can lead to non-deterministic output if thread safety is not ensured.)
In this article, we shall discuss several ways to achieve atomicity or synchronization between threads.
Using Synchronization
Java has a synchronized keyword which ensures the following:
- Shareable resources are operated upon by only one thread at a time
- A synchronized block of code is executed by only one thread at a time, among the threads that synchronize on the same lock
There are two ways to achieve synchronization. We will take the same example of a bank account to explain the concept:
Method 1: Synchronize on the shared resource using a synchronized block
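One way to read Method 1 is that every caller locks the shared Account object before touching it. The sketch below assumes that reading; SyncBlockDemo, runTransfers and depositRepeatedly are illustrative names, and each thread repeats its deposit many times so that a missing lock would be likely to show up.

```java
class SyncBlockDemo {
    // The Account itself is unchanged and still has no locking of its own.
    static class Account {
        private int balance;
        void addMoney(int amount) { this.balance += amount; }
        int getBalance() { return this.balance; }
    }

    // Thread A deposits 100 and thread B deposits 500, each `rounds` times.
    static int runTransfers(int rounds) {
        final Account account = new Account();
        Thread a = new Thread(() -> depositRepeatedly(account, 100, rounds), "A");
        Thread b = new Thread(() -> depositRepeatedly(account, 500, rounds), "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return account.getBalance();
    }

    private static void depositRepeatedly(Account account, int amount, int rounds) {
        for (int i = 0; i < rounds; i++) {
            synchronized (account) { // only one thread holds this lock at a time
                account.addMoney(amount);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(runTransfers(10_000)); // prints 6000000
    }
}
```

The drawback of this style is that every caller must remember to take the lock; a single unsynchronized call to addMoney reintroduces the race.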
Method 2: Make the method that contains the critical section synchronized
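For Method 2, a minimal sketch is to declare addMoney itself as synchronized, so that the lock is taken inside the Account class rather than by each caller. The harness around it (SyncMethodDemo, runTransfers) is illustrative and only there to exercise the class.

```java
class SyncMethodDemo {
    static class Account {
        private int balance;

        // The whole method is the critical section; the lock is the Account instance.
        synchronized void addMoney(int amount) {
            this.balance += amount;
        }

        // Synchronized as well, so readers always see the latest balance.
        synchronized int getBalance() {
            return this.balance;
        }
    }

    // Thread A deposits 100 and thread B deposits 500, each `rounds` times.
    static int runTransfers(int rounds) {
        final Account account = new Account();
        Thread a = new Thread(() -> {
            for (int i = 0; i < rounds; i++) account.addMoney(100);
        }, "A");
        Thread b = new Thread(() -> {
            for (int i = 0; i < rounds; i++) account.addMoney(500);
        }, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return account.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(runTransfers(10_000)); // prints 6000000
    }
}
```

Compared with a synchronized block at each call site, this keeps the locking decision in one place, at the cost of locking the entire method body.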
Synchronization works because a thread that tries to access a shared resource or enter a critical section of code must first acquire a lock on that resource. Only once the lock is released can other threads access the shared resource.
Using the volatile Keyword
Although threads in Java share the heap, each thread is allowed to work with its own cached copy of a shared variable, for example in a CPU register or cache. A write made by one thread is therefore not guaranteed to become visible to other threads, which may keep reading a stale copy. This can lead to the same non-deterministic scenarios that we discussed above for code without synchronization.
The volatile keyword makes threads read a shared variable from, and write it to, the program's main memory. This ensures that multiple threads reading and writing the same variable see the same, most recent value.
We should note that the volatile keyword does not make an operation atomic. Operations that follow a read/modify/write cycle, such as balance += amount, are still not thread-safe on a volatile variable, because another thread can update the variable between the read and the write.
In the bank account example, volatile would be sufficient only if there is a single writer thread and several reader threads. Since only one thread can modify the account balance, the write operation is thread-safe. Once the writer thread modifies the value, the new value is visible to the other threads immediately instead of lingering in the writer's cache. Thus all reader threads report a consistent value of the account balance.
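The single-writer case can be sketched as follows. The example assumes exactly one writer thread and a configurable number of reader threads; VolatileDemo and readAfterWrite are illustrative names. Each reader spins until it sees a non-zero balance, which is guaranteed to terminate only because the field is volatile.

```java
class VolatileDemo {
    static class Account {
        // volatile: a read always sees the most recent write by any thread.
        private volatile int balance;

        // Safe here only because a single writer thread calls it;
        // += on a volatile field is still not atomic.
        void addMoney(int amount) { this.balance += amount; }

        int getBalance() { return this.balance; }
    }

    // One writer deposits 600; every reader records the first non-zero balance it sees.
    static int[] readAfterWrite(int readerCount) {
        final Account account = new Account();
        final int[] observed = new int[readerCount];
        Thread[] readers = new Thread[readerCount];
        for (int i = 0; i < readerCount; i++) {
            final int slot = i;
            readers[i] = new Thread(() -> {
                int seen;
                while ((seen = account.getBalance()) == 0) {
                    Thread.yield(); // wait for the writer's update to become visible
                }
                observed[slot] = seen;
            });
            readers[i].start();
        }
        Thread writer = new Thread(() -> account.addMoney(600), "writer");
        writer.start();
        try {
            writer.join();
            for (Thread reader : readers) reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return observed;
    }

    public static void main(String[] args) {
        for (int value : readAfterWrite(3)) System.out.println(value); // prints 600 three times
    }
}
```

If a second writer thread were added, this design would no longer be safe, and one of the other techniques in this article would be needed.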
Using Atomic Variables
Java provides data types that are atomic out of the box in the java.util.concurrent.atomic package. Let us solve the synchronization problem in our bank example using AtomicInteger.
With the balance stored in an AtomicInteger, addMoney can be implemented as a single atomic call such as addAndGet, so we do not have to worry about thread safety during concurrent modifications.
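A minimal sketch of the atomic version, assuming the balance fits in an int, might look like this. AtomicInteger and its addAndGet and get methods are part of the standard library; AtomicDemo and runTransfers are illustrative names.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicDemo {
    static class Account {
        private final AtomicInteger balance = new AtomicInteger(0);

        // addAndGet performs the read, the addition and the write as one atomic step.
        void addMoney(int amount) {
            balance.addAndGet(amount);
        }

        int getBalance() {
            return balance.get();
        }
    }

    // Thread A deposits 100 and thread B deposits 500, each `rounds` times.
    static int runTransfers(int rounds) {
        final Account account = new Account();
        Thread a = new Thread(() -> {
            for (int i = 0; i < rounds; i++) account.addMoney(100);
        }, "A");
        Thread b = new Thread(() -> {
            for (int i = 0; i < rounds; i++) account.addMoney(500);
        }, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return account.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(runTransfers(10_000)); // prints 6000000
    }
}
```

No lock is taken anywhere in this version, which is why atomic variables are a good fit when the shared state is a single counter or value.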
Using the final Keyword
The final keyword in Java declares a variable that cannot be reassigned. Once a final variable has been assigned a value, any later attempt to assign to it results in a compile-time error. Since a final variable can never be given a new value, it is thread-safe when it comes to concurrent reads: throughout the life cycle of the application, the variable and any thread's cached copy of it hold the same value.
A key point to understand about final variables is that only the reference is immutable, not the object it refers to. Thus if we declare a final collection such as an ArrayList, we can still add items to or remove items from the list.
Thus concurrent modifications to final collections are still possible. Hence we still need to ensure thread safety for final collections, either through synchronization or by using Java's thread-safe data structures such as ConcurrentHashMap.
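The sketch below illustrates both points: a final list whose contents still change, and a final ConcurrentHashMap that tolerates concurrent updates. The class name, method names and the key "acct-1" are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FinalCollectionDemo {
    // final fixes the reference only: the list contents can still change.
    static int mutateFinalList() {
        final List<Integer> deposits = new ArrayList<>();
        deposits.add(100);
        deposits.add(500);
        deposits.remove(Integer.valueOf(100)); // removes the element 100
        // deposits = new ArrayList<>();       // would not compile: deposits is final
        return deposits.size();
    }

    // Two threads update the same key; merge on ConcurrentHashMap is atomic.
    static int concurrentDeposits(int rounds) {
        final Map<String, Integer> balances = new ConcurrentHashMap<>();
        Thread a = new Thread(() -> {
            for (int i = 0; i < rounds; i++) balances.merge("acct-1", 100, Integer::sum);
        }, "A");
        Thread b = new Thread(() -> {
            for (int i = 0; i < rounds; i++) balances.merge("acct-1", 500, Integer::sum);
        }, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return balances.getOrDefault("acct-1", 0);
    }

    public static void main(String[] args) {
        System.out.println(mutateFinalList());          // prints 1
        System.out.println(concurrentDeposits(10_000)); // prints 6000000
    }
}
```

Replacing the ConcurrentHashMap with a plain HashMap would keep the code compiling, since the reference is still final, but the concurrent merge calls would no longer be safe.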
Important Points About Thread-Safety in Java
- Any resource which cannot be accessed by multiple threads is inherently thread-safe. Thus local variables have no issues with thread safety, as long as the objects they refer to are not shared with other threads
- Conversely, static variables can be very dangerous with respect to thread safety, as their scope is global and they are accessible by all threads
- If none of the approaches to thread safety discussed above works, or they are not easy to implement, many applications handle thread safety at the application layer. We can write code to hold a lock over a resource and only release it once we are certain the operation we were trying to execute is completed
- Java has several thread-safe data structures, such as Hashtable and ConcurrentHashMap. One of the most popular data types, String, is also thread-safe because it is immutable
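The application-level locking mentioned in the list above can be sketched with java.util.concurrent.locks.ReentrantLock, which is one of several possible choices and not necessarily the one a given application would use. The lock is acquired before the operation and released in a finally block once the operation is complete; LockDemo and runTransfers are illustrative names.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockDemo {
    static class Account {
        private final ReentrantLock lock = new ReentrantLock();
        private int balance;

        void addMoney(int amount) {
            lock.lock();             // hold the lock for the whole operation
            try {
                this.balance += amount;
            } finally {
                lock.unlock();       // always release, even if an exception is thrown
            }
        }

        int getBalance() {
            lock.lock();
            try {
                return this.balance;
            } finally {
                lock.unlock();
            }
        }
    }

    // Thread A deposits 100 and thread B deposits 500, each `rounds` times.
    static int runTransfers(int rounds) {
        final Account account = new Account();
        Thread a = new Thread(() -> {
            for (int i = 0; i < rounds; i++) account.addMoney(100);
        }, "A");
        Thread b = new Thread(() -> {
            for (int i = 0; i < rounds; i++) account.addMoney(500);
        }, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return account.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(runTransfers(10_000)); // prints 6000000
    }
}
```

An explicit lock is more verbose than synchronized, but it leaves room for features such as timed or interruptible acquisition when an application needs them.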
Conclusion
- We use threads in Java to achieve concurrency. However, when multiple threads execute the same code on shared data, concurrent operations can produce non-deterministic output
- We can use Java's synchronized keyword over shareable resources or around critical sections of code
- The volatile keyword prevents threads from working with stale cached copies of a variable and forces every read and write of it to go through the program's main memory
- Atomic variables in Java give us atomic operations, which are thread-safe by design
- The final keyword makes a reference non-reassignable, so every thread sees the same reference; the object it points to can still be mutated
- If none of the above approaches suits our use case, we need to write application-level code to implement a locking mechanism over resources that are not thread-safe