Implementing Checksum Using Java
Overview
A checksum is a technique for detecting errors in a message. Checksums are used for error detection in both the network layer and the transport layer of the TCP/IP protocol suite. The procedure that generates the checksum of a given input is called the checksum function or checksum algorithm. A checksum helps us detect whether a message was received completely and without corruption. It can verify the integrity of the data, but it cannot verify the authenticity of the data.
Introduction
Before learning about implementing a checksum using Java, let us get a brief introduction to Java and the TCP/IP model, as both are prerequisites for learning about checksums.
Java is one of the most popular high-level programming languages and was first released in 1995. Java is currently owned by Oracle. We can use Java to develop mobile applications, web applications, desktop applications, games, web servers, and much more. Java is platform independent, which makes it useful in many environments. It is a fast, secure, and powerful programming language. Like C++ and Python, Java is an object-oriented programming language.
The Transmission Control Protocol (TCP) is a **connection-oriented transport layer protocol** that defines the standard for establishing and maintaining the connection that applications use to exchange data. TCP is one of the most important and widely used protocols of the Internet Protocol suite. What is meant by the Internet Protocol suite? The IP suite, or Internet Protocol suite, is the standard network model and stack of communication protocols used on the Internet. Hence, for reliable data transmission over a communication network, we use TCP.
One of the prime reasons for using TCP over other protocols such as UDP is that TCP ensures reliable transmission and delivery of our data packets. TCP can deal with the various issues that can occur in data transmission, such as packet duplication, packet corruption, packet reordering, and packet loss. TCP runs on top of the Internet Protocol (IPv4 or IPv6).
Let us now learn about checksums in Java. In the network and transport layers of the TCP/IP protocol suite, headers (and, in some protocols, trailers) are added to the original data before it is transferred. The data is also encoded and decoded along the way. During this process, there is a chance that data is lost or that some bits are changed. To detect such errors, a checksum field is sent along with the data. In this article, we will learn about implementing a checksum using Java.
A checksum therefore makes it easy to detect whether a message was received intact.
A checksum is computed both at the sending end and at the receiving end to check whether the data is correct. The checksum can verify the integrity of our data, but it cannot verify the **authenticity of the data**.
Checksums and Common Algorithms
A checksum is a compact representation of a binary stream of data. The procedure that generates the checksum of a given input is called the checksum function or checksum algorithm. Several algorithms are commonly used for creating a checksum, for example Adler32 and CRC32. These algorithms convert a sequence of data into a much smaller, fixed-size value, often displayed as a short string of hexadecimal digits, which is easy to transmit alongside the data.
The algorithms are designed so that even a small change in the input data normally produces a different checksum, and with an algorithm such as CRC32 usually a very different one.
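To make the effect of a small input change concrete, here is a minimal sketch that computes the CRC32 and Adler32 values of two strings differing in a single character. It uses the `CRC32` and `Adler32` classes from `java.util.zip`; the class name `ChecksumComparison`, the helper `checksumOf`, and the sample strings are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

class ChecksumComparison {

    // Feeds the UTF-8 bytes of the text into the given algorithm and returns the result.
    static long checksumOf(Checksum algorithm, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        algorithm.reset();
        algorithm.update(bytes, 0, bytes.length);
        return algorithm.getValue();
    }

    public static void main(String[] args) {
        String original = "hello";
        String altered = "hellp"; // only the last character differs

        for (String text : new String[] {original, altered}) {
            System.out.printf("%s  CRC32=%08x  Adler32=%08x%n",
                    text,
                    checksumOf(new CRC32(), text),
                    checksumOf(new Adler32(), text));
        }
    }
}
```

Both algorithms report a different value for the altered string. For an input this short, the Adler32 values differ in only a few digits, which is one reason CRC32 is often preferred for short messages.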
We should also be familiar with the basic concepts of socket programming. Please refer to the article Socket Programming in Computer Network to learn more about these concepts.
How Can We Generate a Checksum for String or Byte Array Data?
To generate a checksum for string or byte array data, we first need to pass the data to a checksum algorithm such as CRC32. For string data, the getBytes() method returns the contents of the string as a byte array. This approach loads every byte of the data into memory at once, so its memory use grows with the size of the data.
Example:
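The original listing is not available here, so the following is a minimal sketch of the conversion step. It assumes UTF-8 encoding, passed explicitly so that the result does not depend on the platform default; the class name `StringToBytes` and the helper `toBytes` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class StringToBytes {

    // Encodes the text with an explicit charset (an assumption of this sketch).
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = toBytes("hello");
        System.out.println(Arrays.toString(data)); // [104, 101, 108, 108, 111]
        System.out.println(data.length);           // 5
    }
}
```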
Now that we have the data in the form of a byte array, we can use the methods of the Checksum interface (which CRC32 implements), such as update() and getValue(), to calculate the checksum of the data.
Example:
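As a hedged illustration of this step, the sketch below feeds a byte array into a `CRC32` object through the `Checksum` interface and reads the result with `getValue()`. The class name `ByteArrayChecksum`, the helper `crc32Of`, and the sample input are assumptions; the input "123456789" is chosen because its CRC-32 value is a widely published check value.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

class ByteArrayChecksum {

    // Computes the CRC32 checksum of the whole array in one call to update().
    static long crc32Of(byte[] data) {
        Checksum crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("CRC32 = %08x%n", crc32Of(data)); // CRC32 = cbf43926
    }
}
```

Note that `getValue()` returns a `long`, so the 32-bit result is always non-negative.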
How Can We Generate a Checksum from an InputStream?
If we have to deal with a large data set, the approach above is not suitable, because building the byte array takes a lot of memory. Instead of passing a byte array to the checksum object directly, we can wrap the InputStream in a CheckedInputStream, which updates a checksum as the data is read. With this approach, we can also choose how many bytes are processed at a time through the size of the read buffer.
Example:
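The sketch below shows one way this could look, assuming CRC32 as the algorithm. It wraps a stream in `java.util.zip.CheckedInputStream` and reads it in chunks; the class name `StreamChecksum`, the helper `crc32Of`, and the `bufferSize` parameter are hypothetical. An in-memory stream stands in for a large file to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

class StreamChecksum {

    // Reads the stream in chunks of bufferSize bytes; only one chunk is held in memory at a time.
    static long crc32Of(InputStream source, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        try (CheckedInputStream checked = new CheckedInputStream(source, new CRC32())) {
            byte[] buffer = new byte[bufferSize];
            while (checked.read(buffer, 0, buffer.length) >= 0) {
                // Nothing to do here: the stream updates the checksum as bytes pass through.
            }
            return checked.getChecksum().getValue();
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.printf("CRC32 = %08x%n", crc32Of(in, 4)); // CRC32 = cbf43926
    }
}
```

For a real file, the same helper could be given a `FileInputStream` with a larger buffer, for example a few kilobytes.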
Implementation of Checksum in Java
As discussed above, a checksum is a small piece of data derived from a larger block of digital data, and it is used to detect errors introduced during the transmission or storage of that data.
Java supports the CRC32 algorithm through the java.util.zip.CRC32 class, but we should not use this algorithm for security-sensitive operations such as hashing a password, because it is designed to catch accidental errors rather than deliberate tampering.
Let us now look at the code for implementing checksum using Java.
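The original listing is not reproduced here. What follows is a minimal sketch that is consistent with the explanation in the next section, not the author's exact code. It assumes a 16-bit one's-complement checksum computed over the UTF-8 bytes of the message, taken two bytes at a time. The function names `generateCheckSum`, `generateComplement`, and `receive` follow the explanation; the class name `ChecksumSketch`, the helper `fold`, and the sample strings are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class ChecksumSketch {

    // One's complement of a 16-bit value, i.e. FFFF (hex) minus the value.
    static int generateComplement(int value) {
        return 0xFFFF - (value & 0xFFFF);
    }

    // Adds the carry out of bit 15 back into the low 16 bits (end-around carry).
    static int fold(int value) {
        return (value & 0xFFFF) + (value >>> 16);
    }

    // Sender side: sum the message as 16-bit words, then return the complement of the sum.
    static int generateCheckSum(String message) {
        byte[] data = message.getBytes(StandardCharsets.UTF_8);
        int sum = 0;
        for (int i = 0; i < data.length; i += 2) {
            int high = data[i] & 0xFF;
            int low = (i + 1 < data.length) ? (data[i + 1] & 0xFF) : 0; // pad an odd tail with zero
            sum = fold(sum + ((high << 8) | low));
        }
        return generateComplement(sum);
    }

    // Receiver side: recompute the sum, add the received checksum, and complement.
    // A syndrome of 0 means that no error was detected.
    static boolean receive(String message, int checksum) {
        int sum = generateComplement(generateCheckSum(message)); // undo the final complement
        int syndrome = generateComplement(fold(sum + checksum));
        return syndrome == 0;
    }

    public static void main(String[] args) {
        String message = "hello";
        int checksum = generateCheckSum(message);
        System.out.println("Checksum = " + Integer.toHexString(checksum));            // Checksum = bc2d
        System.out.println("Intact message accepted: " + receive(message, checksum)); // true
        System.out.println("Altered message accepted: " + receive("hellp", checksum)); // false
    }
}
```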
Please refer to the next section for more explanation.
Explanation
In the above code, we have a Test class. First, we get the input string for which the checksum is to be generated. We have a function called generateCheckSum() that converts the input into a checksum, and we call it with the input string. We then have the data and the checksum that are to be sent to the receiver.
Next, we pass the input string and the generated checksum to the receive() function. Let us now look at how the generateCheckSum() function and the receive() function work.
"generateCheckSum(string)" Function
In this function, we first initialize an empty hexadecimal string for storing the checksum. We then iterate through the input string and, for each character, generate a checksum value and add it to the resulting checksum.
Finally, we generate the complement of the checksum and return it.
"receive(string, int)" Function
The receive() function calls the generateCheckSum() function on the received data and combines the result with the received checksum to compute the syndrome.
How can we know whether the message received at the receiving end is correct? We can use the syndrome itself. If the syndrome's value is 0, the message was received correctly; otherwise, we have received an incorrect message.
We have another function called generateComplement(). This function generates the complement of the input checksum: FFFF (hexadecimal) is parsed into an integer and the checksum is subtracted from it, which gives the one's complement of a 16-bit value.
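The arithmetic behind the complement and the syndrome can be followed on a single 16-bit word. The sketch below is only an illustration under that assumption; the class name `SyndromeArithmetic`, the helpers `complement16` and `addWithCarry`, and the sample values are hypothetical.

```java
class SyndromeArithmetic {

    // One's complement within 16 bits: FFFF (hex) minus the value.
    static int complement16(int value) {
        return 0xFFFF - (value & 0xFFFF);
    }

    // Adds two 16-bit values and wraps a carry out of bit 15 back into the result.
    static int addWithCarry(int a, int b) {
        int total = a + b;
        return (total & 0xFFFF) + (total >>> 16);
    }

    public static void main(String[] args) {
        int sum = 0x4142;                 // the 16-bit word for the characters "AB"
        int checksum = complement16(sum); // sent along with the data

        int intact = complement16(addWithCarry(sum, checksum));
        int corrupted = complement16(addWithCarry(0x4143, checksum)); // one bit flipped in transit

        System.out.println(Integer.toHexString(checksum));  // bebd
        System.out.println(intact);                         // 0
        System.out.println(Integer.toHexString(corrupted)); // fffe
    }
}
```

The intact word and its checksum add up to FFFF, whose complement is 0, while the corrupted word leaves a non-zero syndrome.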
Conclusion
- A checksum is a technique for detecting errors in a message. The checksum can verify the integrity of our data, but it cannot verify the authenticity of the data.
- We generally use the checksum in the network layer as well as in the transport layer of the TCP/IP protocol suite for error detection.
- We compute the checksum both at the sending end and at the receiving end to check whether the data is correct.
- The procedure that generates the checksum of a given input is called the checksum function or checksum algorithm.
- A checksum algorithm converts a sequence of data into a much smaller, fixed-size value, often displayed as hexadecimal digits, which is easy to transmit alongside the data.
- Several algorithms are commonly used for creating a checksum, for example Adler32 and CRC32.
- Java supports the CRC32 algorithm, but we should not use it for security-sensitive operations such as hashing a password.
- To generate a checksum for string or byte array data, we pass the data to a checksum algorithm such as CRC32.
- If we have to deal with a large data set, the byte array approach is not suitable, because building the byte array takes a lot of memory; reading the data through a CheckedInputStream avoids this.