Concept of Threads
Overview
Node.js runs JavaScript code on a single thread, which means only one statement can be executed at a time. However, Node.js itself is multithreaded: through the libuv library it provides hidden threads that handle operations such as network requests or reading a file from disk. Node.js also offers the worker_threads module, which lets us create threads that execute tasks in parallel. Worker threads let us divide CPU-bound tasks among several workers so that they do not block the main thread.
Pre-Requisites
- Node.js must be installed on your system (Windows, macOS, or Ubuntu).
- A multi-core system with four or more cores. You can still follow the tutorial on a dual-core system, but the CPU-bound examples need four or more cores to show a performance improvement.
- A good understanding of callbacks, the event loop and promises in JavaScript.
- Basics of synchronous and asynchronous programming in JavaScript.
- Some knowledge of how to set up a Node.js server using the Express framework.
Introduction to Threads in JS
Threads are like small processes: each has its own instruction pointer and can run one JavaScript task at a time. Threads live within a process's memory and do not have memory of their own. The execution of threads is similar to that of processes.
JavaScript is single-threaded by default, which means only one thread is available to perform all operations. This was the main drawback of JavaScript, and it led to the implementation of server-side asynchronous I/O, which avoids the race conditions that arise among threads in a multithreaded environment.
Let us now see how JavaScript has evolved. To work around this limitation, JavaScript introduced Web Workers. Web Workers (threads in js) are mainly used to perform CPU-intensive JavaScript tasks. They can perform long-running tasks without affecting or blocking the main execution thread.
Using a constructor, we can create a worker object that runs a js file. This file contains the code the worker thread runs; workers run in a global context that is different from the current window. However, Node.js's built-in asynchronous I/O operations are more efficient than workers can be, so workers are not of much help with I/O-intensive work.
How Do Worker Threads Work?
Worker threads in js were introduced as an experimental feature in Node.js v10 and became stable in Node.js v12. Worker threads provide us with APIs to deal with CPU-intensive tasks without blocking the main thread; hence they keep the application responsive. Unlike cluster or child_process, worker threads can share memory, either by transferring ArrayBuffer instances or by sharing SharedArrayBuffer instances.
A worker thread takes over CPU-intensive tasks while the main thread stays available for new user requests. Each new worker thread gets its own event loop, so the main thread's event loop remains free to accept further requests. Unlike threads in most multithreaded languages, worker threads do not share program state by default; they coordinate through message passing rather than through locks. Node.js runs on Chrome's V8 engine, and each worker thread is isolated from the others. V8 lets us create isolated runtimes: a V8 Isolate is an independent instance with its own JavaScript heap and micro-task queue.
Let us take a minimal example to see how worker threads in js can take the load off the main thread and keep your app responsive.
Example: Suppose the file name is index.js
Output:
Explanation:
In the above code, we created a new worker thread using the Worker constructor. The worker starts executing as soon as it is created. parentPort is the worker's communication port back to the main thread, and it also acts as an event emitter. Any data the worker passes to parentPort.postMessage() is delivered to the main thread as a message event. We attached an event handler to the message event, and this handler prints the data carried by the event.
From the output, it is apparent that the main event loop was not blocked, because the output from the main thread appeared immediately. The message passed back from the worker thread shows that cpuIntensiveFunction blocks only the event loop of the worker thread, which is separate from the main event loop.
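The flow described above can be sketched in a single file as follows. This is a minimal sketch, not the original listing: the body of cpuIntensiveFunction (a plain summing loop), the helper runWorker and the inlined workerSource string are assumptions made for illustration, and eval: true is used only so that the worker code does not need a separate file.

```javascript
const { Worker } = require('worker_threads');

// Worker code, inlined as a string so the sketch fits in one file.
// The loop body is an assumed stand-in for real CPU-bound work.
const workerSource = `
  const { parentPort } = require('worker_threads');

  function cpuIntensiveFunction() {
    let total = 0;
    for (let i = 0; i < 1e7; i++) total += i;
    return total;
  }

  // Send the result back to the main thread as a 'message' event.
  parentPort.postMessage(cpuIntensiveFunction());
`;

// Hypothetical helper: starts the worker and resolves with its first message.
function runWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.on('message', resolve);
    worker.on('error', reject);
  });
}

const pending = runWorker().then((result) => {
  console.log('Message from worker:', result);
  return result;
});

// This line is printed first: the main event loop is not blocked
// while the worker is busy.
console.log('Main thread is still responsive');
```

The main-thread log line appears before the worker's message because the message is delivered asynchronously, after the current synchronous code has finished.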
What Distinguishes Worker Threads in js?
Worker threads in js are different from normal threads in many ways :
- Memory is transferred from one thread to another with the help of ArrayBuffers.
- SharedArrayBuffer lets threads share memory, limited to binary data. The same memory is accessible from either thread.
- Atomics provides atomic operations on shared memory, so that several threads can work on it concurrently and safely. It also makes it possible to implement condition variables in JavaScript.
- MessagePort is one end of a communication channel that lets threads exchange data. It can transfer structured data, memory regions, and other MessagePorts from one worker to another.
- MessageChannel represents an asynchronous, two-way communication channel that is used to communicate between threads.
- workerData is used to pass startup data. It is an arbitrary JavaScript value that contains a clone of the data passed to this thread’s Worker constructor.
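A short sketch of two of these building blocks, under stated assumptions: the worker body, the helper countWithWorkers and the counts used are made up for illustration. The first part lets two workers update one SharedArrayBuffer through Atomics; the second part transfers an ArrayBuffer across a MessageChannel, after which the sender's copy is detached.

```javascript
const { Worker, MessageChannel } = require('worker_threads');

// Hypothetical worker body: adds 1 to a shared counter 1000 times.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < 1000; i++) Atomics.add(counter, 0, 1);
  parentPort.postMessage('done');
`;

// Start workerCount workers that all update the same SharedArrayBuffer.
function countWithWorkers(workerCount) {
  const shared = new SharedArrayBuffer(4); // room for one 32-bit integer
  const jobs = [];
  for (let i = 0; i < workerCount; i++) {
    jobs.push(new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, { eval: true, workerData: { shared } });
      worker.on('message', resolve);
      worker.on('error', reject);
    }));
  }
  // Read the final value once every worker has reported back.
  return Promise.all(jobs).then(() => Atomics.load(new Int32Array(shared), 0));
}

const sharedTotal = countWithWorkers(2);
sharedTotal.then((value) => console.log('Shared counter:', value)); // 2000

// Transferring (not copying) an ArrayBuffer through a MessageChannel.
const { port1, port2 } = new MessageChannel();
const buffer = new ArrayBuffer(8);
port2.on('message', (received) => {
  console.log('Received bytes:', received.byteLength); // 8
  port2.close(); // closing one end lets the process exit
});
port1.postMessage(buffer, [buffer]);
console.log('Bytes left in sender:', buffer.byteLength); // 0
```

Because Atomics.add is atomic, the counter reaches 2000 regardless of how the two workers interleave; a plain counter[0]++ would not give that guarantee.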
APIs
- const { Worker, parentPort } = require('worker_threads'); The Worker class represents an independent JavaScript execution thread, and parentPort is an instance of MessagePort.
- new Worker(filename) OR new Worker(code, { eval: true }) These are the two ways of starting a worker thread (passing the filename or the code that you want to execute). It is advisable to use the filename in production.
- parentPort.on('message'), parentPort.postMessage(data) Messages sent by the parent thread using worker.postMessage() are available in the worker thread through parentPort.on('message'), and messages sent by the worker using parentPort.postMessage() are available in the parent thread through worker.on('message').
- worker.on('message'), worker.postMessage(data) These are used in the parent thread to send messages to a worker thread and to listen for messages coming back from it.
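The calls listed above can be combined into a small round trip. This is only a sketch: the doubling worker and the helper name double are hypothetical, and the worker code is inlined with eval: true to keep the example in one file.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical worker: doubles every number it receives from the parent.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (n) => parentPort.postMessage(n * 2));
`;

function double(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', (reply) => {
      resolve(reply);
      // The worker's 'message' listener would otherwise keep it alive.
      worker.terminate();
    });
    worker.once('error', reject);
    // Delivered to parentPort.on('message') inside the worker.
    worker.postMessage(n);
  });
}

const reply = double(21);
reply.then((value) => console.log('Reply from worker:', value)); // 42
```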
Node.js Worker Threads
When a Node.js process is executed, it runs:
- One process: A single operating-system process, described by the global process object, which can be accessed from anywhere and has details about what is being executed at a time.
- One thread: JavaScript, being single-threaded, has only one main thread for the execution of tasks at a given time.
- One event loop: This plays an important role in the asynchronous nature and non-blocking I/O of Node.js, by offloading tasks to the system kernel whenever possible through promises, callbacks and async/await.
- One JS Engine Instance: This is a computer program that executes js code.
- One Node.js Instance: This is a computer program that executes Node.js code.
In simple words, Node runs on one main thread, and only a single task can execute at a time in the event loop. However, there is a downside: long-running synchronous code, such as complex calculations on a large dataset or a synchronous read of a large file, blocks the main thread and hence keeps other tasks from being executed.
A function is known as blocking if the main event loop must wait until it has finished before executing the next command. A non-blocking function allows the main event loop to continue as soon as it begins, and typically alerts the main loop once it has finished by calling a callback.
It’s important to differentiate between CPU operations and I/O (input/output) operations, as only I/O operations run concurrently, because they are executed asynchronously. So the main goal of worker threads in js is to enhance the performance of CPU-intensive operations, not I/O operations.
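The difference between a blocking and a non-blocking call can be seen in a few lines. In this sketch the function blockingSum and the order array are illustrative names; the timer callback cannot run until the synchronous loop has returned control to the event loop.

```javascript
const order = [];

// Blocking: the event loop cannot do anything else until this returns.
function blockingSum(limit) {
  let total = 0;
  for (let i = 0; i < limit; i++) total += i;
  return total;
}

// Non-blocking: setTimeout returns immediately and the callback runs later.
setTimeout(() => {
  order.push('timer callback');
  console.log(order);
}, 0);

blockingSum(1e7);
order.push('blocking work finished');
```

Even though the timer was scheduled first with a delay of 0 ms, its callback is recorded second, because the loop occupied the only thread.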
Hence worker threads in js have the following properties:
- One process
- Multiple threads
- One event loop per thread
- One JS Engine Instance per thread
- One Node.js Instance per thread
Hence worker threads give us a single process with multiple threads. Generally, the number of threads created should be equal to the number of CPU cores in your system. Each worker thread is separated from the others, which means each worker thread runs its js code entirely in isolation from the other worker threads. The V8 engine helps in creating this isolation. As a result, each worker thread has its own V8 instance, event loop and Node.js instance, independent of the other worker threads. This results in the parallel execution of js code.
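The isolation described above can be observed directly. In this sketch, which assumes a cap of four workers and a made-up global named counter, each worker sees a fresh global object, so the value set in the main thread is not visible to any worker.

```javascript
const os = require('os');
const { Worker } = require('worker_threads');

// Hypothetical worker: touches its own global and reports what it sees.
const workerSource = `
  const { parentPort, threadId } = require('worker_threads');
  globalThis.counter = (globalThis.counter || 0) + 1;
  parentPort.postMessage({ threadId, counter: globalThis.counter });
`;

// This value lives only in the main thread's V8 isolate.
globalThis.counter = 100;

function startWorkers(count) {
  const jobs = [];
  for (let i = 0; i < count; i++) {
    jobs.push(new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, { eval: true });
      worker.on('message', resolve);
      worker.on('error', reject);
    }));
  }
  return Promise.all(jobs);
}

// One worker per core, capped at four to keep the sketch small.
const workerCount = Math.max(1, Math.min(os.cpus().length, 4));
const reports = startWorkers(workerCount);
reports.then((list) => {
  console.log(list); // every entry has counter: 1 and a distinct threadId
  console.log('Main thread counter:', globalThis.counter); // 100
});
```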
Creating and Executing New Workers
In this example, we will execute a loop with one million (10 lakh) iterations. It is a CPU-intensive task, and this loop would block the main thread of our Node.js application. To prevent this blocking of the main thread, we will execute the loop on a worker thread.
We will separate the implementation of the main thread and the worker thread into two files, main.js and worker.js.
- main.js : This file contains the code which is executed by the main thread, and it is also used to create the worker thread.
- worker.js : This file contains the code which is run by the worker thread. The code inside this file is CPU-intensive. In our case, we have taken a function timeTakingFunction that is assumed to take a considerably long time to execute.
The code inside the main.js:
Worker thread objects are created with the help of the Worker class from the worker_threads module.
The Worker constructor takes the following arguments.
filename: <string> | <URL> It is the path of the file containing the code that should be executed by the worker thread. Hence the path of worker.js has to be given here.
options: It is an object that can contain many parameters.
- argv : A list of arguments which are stringified and appended to process.argv in the worker.
- env : If set, specifies the initial value of process.env inside the worker thread. As a special value, worker.SHARE_ENV may be used to specify that the parent thread and the child thread should share their environment variables.
- eval : If it is true and the first parameter is a string, the first parameter to the constructor is interpreted as a script that is executed as soon as the worker is online.
- stdin : If it is set to true, worker.stdin provides a writable stream whose contents appear as process.stdin inside the worker. By default, no data is provided.
- stdout : If it is set to true, worker.stdout is not automatically piped through to process.stdout in the parent.
- stderr : If it is set to true, worker.stderr is not automatically piped through to process.stderr in the parent.
- workerData : Any JavaScript value that is cloned and made available as require('node:worker_threads').workerData.
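A few of these options can be tried together. The sketch below is an illustration under assumptions: the argument values, the MODE variable and the helper inspectWorker are made up, and the worker simply reports back what it received.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical worker: echoes its arguments, environment and startup data.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  parentPort.postMessage({
    args: process.argv.slice(-2), // the two values appended through argv
    mode: process.env.MODE,
    workerData,
  });
`;

function inspectWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,                        // first argument is code, not a path
      argv: ['--fast', 42],              // stringified and appended to process.argv
      env: { MODE: 'demo' },             // initial process.env inside the worker
      workerData: { valueRequired: 10 }, // cloned into the worker
    });
    worker.on('message', resolve);
    worker.on('error', reject);
  });
}

const info = inspectWorker();
info.then((seen) => console.log(seen));
```

Note that the number 42 arrives in the worker as the string '42', since argv entries are stringified.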
In the above example, we only used workerData, which passes data from the main thread to the worker thread. workerData is available as soon as the worker thread starts running. Through it, we pass a value that the worker.js file requires.
As we know, the main thread and the worker thread communicate through message channels. To receive messages from the worker thread, we have registered the following listeners :
- message : This event is emitted when the worker thread posts a message to the main thread.
- error : This event is emitted when the worker thread throws an uncaught exception during execution.
- exit : This event is emitted when the worker thread has stopped. If the worker was stopped with worker.terminate(), the exit code is 1; if it stopped by calling process.exit(), the exit code is the code passed to process.exit() (0 by default).
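The three events can be observed with a small helper. This is a sketch, not the tutorial's main.js: the helper observe and the one-line worker bodies are hypothetical, chosen so that each way of stopping a worker is exercised once.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical helper: runs a worker from source and records its events.
function observe(source, terminateWhenOnline = false) {
  return new Promise((resolve) => {
    const events = [];
    const worker = new Worker(source, { eval: true });
    worker.on('message', (data) => events.push('message: ' + data));
    worker.on('error', (err) => events.push('error: ' + err.message));
    worker.on('exit', (code) => {
      events.push('exit: ' + code);
      resolve(events);
    });
    if (terminateWhenOnline) {
      worker.once('online', () => worker.terminate());
    }
  });
}

// Posts one message, then stops on its own.
const finished = observe("require('worker_threads').parentPort.postMessage('hello')");
// Throws an uncaught exception inside the worker.
const failed = observe("throw new Error('boom')");
// Stops itself with an explicit exit code.
const exited = observe('process.exit(3)');
// Would run forever, so the main thread terminates it.
const terminated = observe('setInterval(() => {}, 1000)', true);

Promise.all([finished, failed, exited, terminated]).then((logs) => console.log(logs));
```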
Now let us handle the execution of CPU-intensive tasks in worker.js:
Here, we have used timeTakingFunction, a function that takes a required input, valueRequired, and takes a considerably long time to finish executing. We have used workerData from the worker_threads module to receive data from the main thread. Through this object, we can access the valueRequired parameter sent from the main thread. We have also used parentPort from the worker_threads module so that the worker thread can post a message to the main thread.
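The interplay between the two files can be sketched in one file as follows. The names timeTakingFunction and valueRequired come from the text above, but the function body (adding valueRequired one million times), the helper runTask and the inlined workerSource are assumptions; in the two-file layout the string would be the contents of worker.js and the Worker would be created with its path.

```javascript
const { Worker } = require('worker_threads');

// Stand-in for worker.js, inlined so the sketch is a single file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');

  // Assumed body: a loop of one million iterations.
  function timeTakingFunction(valueRequired) {
    let total = 0;
    for (let i = 0; i < 1000000; i++) total += valueRequired;
    return total;
  }

  // Read the input from workerData and post the result to the main thread.
  parentPort.postMessage(timeTakingFunction(workerData.valueRequired));
`;

// Stand-in for main.js: creates the worker and listens for its events.
function runTask(valueRequired) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { valueRequired },
    });
    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error('Worker stopped with exit code ' + code));
    });
  });
}

const result = runTask(3);
result.then((value) => console.log('Result from worker:', value)); // 3000000
```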
Now to run the above file :
Output :
Using Worker Threads in JS for CPU Intensive Operations
Let us say we are building an app that allows users to upload a profile picture, and the app then generates multiple sizes from the original image (e.g. 50 x 50 or 100 x 100) for different use cases within the app.
Resizing an image is CPU-intensive, and resizing it into several sizes would block the main thread. This task can be given to a worker thread, while the main thread handles other lightweight tasks.
worker.js file :
main.js file :
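A minimal single-file sketch of this division of work is shown here. It is not the original pair of listings: a real application would normally rely on an image library, whereas this stand-in uses a hand-written nearest-neighbour resize on raw RGBA pixels so that it runs without dependencies. The function names, the dummy 200 x 200 image and the inlined workerSource are all assumptions.

```javascript
const { Worker } = require('worker_threads');

// Stand-in for worker.js: resizes raw RGBA pixels to each requested size.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');

  // Nearest-neighbour resize to a size x size square image.
  function resize(src, width, height, size) {
    const out = new Uint8Array(size * size * 4);
    for (let y = 0; y < size; y++) {
      const srcY = Math.floor((y * height) / size);
      for (let x = 0; x < size; x++) {
        const srcX = Math.floor((x * width) / size);
        const from = (srcY * width + srcX) * 4;
        out.set(src.subarray(from, from + 4), (y * size + x) * 4);
      }
    }
    return out;
  }

  const { pixels, width, height, sizes } = workerData;
  for (const size of sizes) {
    parentPort.postMessage({ size, pixels: resize(pixels, width, height, size) });
  }
`;

// Stand-in for main.js: hands the image to a worker and collects the results.
function resizeInWorker(pixels, width, height, sizes) {
  return new Promise((resolve, reject) => {
    const results = [];
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { pixels, width, height, sizes },
    });
    worker.on('message', (image) => results.push(image));
    worker.on('error', reject);
    worker.on('exit', () => resolve(results));
  });
}

// Dummy 200 x 200 image: every channel of every pixel set to 255.
const original = new Uint8Array(200 * 200 * 4).fill(255);
const thumbnails = resizeInWorker(original, 200, 200, [50, 100]);
thumbnails.then((images) => {
  for (const { size, pixels } of images) {
    console.log(size + ' x ' + size + ': ' + pixels.length + ' bytes');
  }
});
```

While the worker loops over pixels, the main thread remains free to serve other requests; the resized images arrive as message events, one per requested size.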
Worker Threads in js are useful in these cases :
- Search algorithms.
- Sorting a large amount of data.
- Video Compression.
- Image Resizing.
- Factorization of Large Numbers.
- Generating primes in a given range.
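As an illustration of the last case, a range can be split into chunks and each chunk handed to its own worker. The sketch below uses trial division and four workers purely as assumptions; countPrimes and the inlined workerSource are hypothetical names.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical worker: counts primes in [from, to) by trial division.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');

  function isPrime(n) {
    if (n < 2) return false;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) return false;
    }
    return true;
  }

  let count = 0;
  for (let n = workerData.from; n < workerData.to; n++) {
    if (isPrime(n)) count++;
  }
  parentPort.postMessage(count);
`;

// Split [from, to) into workerCount chunks and add up the partial counts.
function countPrimes(from, to, workerCount) {
  const chunk = Math.ceil((to - from) / workerCount);
  const jobs = [];
  for (let i = 0; i < workerCount; i++) {
    const start = from + i * chunk;
    const end = Math.min(start + chunk, to);
    jobs.push(new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { from: start, to: end },
      });
      worker.on('message', resolve);
      worker.on('error', reject);
    }));
  }
  return Promise.all(jobs).then((counts) => counts.reduce((a, b) => a + b, 0));
}

const primeCount = countPrimes(0, 10000, 4);
primeCount.then((count) => console.log('Primes below 10000:', count)); // 1229
```

Equal-sized chunks are the simplest split; since larger numbers take longer to test, a real workload might balance the chunks differently.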
Conclusion
- Web Workers (threads) are mainly used to perform CPU-intensive JavaScript tasks.
- Worker threads in js are broadly similar to Web Workers in the browser, and they help in the parallel execution of js code.
- Worker threads are used to improve the performance of CPU-intensive work, and they will not help much with I/O-intensive work.
- They can perform long-running tasks without affecting or blocking the main execution thread.
- Worker thread objects are created with the help of the Worker class from the worker_threads module.
- Each worker thread is isolated from other threads. The V8 engine allows us to create isolated V8 runtimes.
- Each worker thread in js has its own V8 instance, event loop and Node.js instance, which are independent of other worker threads.
- Memory is transferred from one thread to another with the help of ArrayBuffers.
- workerData is used to pass data from the main thread to the worker thread.
- Worker threads are used for image resizing, video compression, sorting large amounts of data, and similar tasks.