Transmission Control Protocol (TCP)
Overview
The transmission control protocol (TCP) is a connection-oriented transport layer protocol that defines how a conversation between applications is established and maintained so that they can exchange data. It is one of the most important and widely used protocols of the IP suite. The TCP ensures the reliable transmission and delivery of our data, and it can deal with the issues that occur in transit, such as packet duplication, packet corruption, packets arriving out of order, and packet loss.
What is the Transmission Control Protocol (TCP)?
Before learning about the transmission control protocol, let us first learn about computer networks, the OSI model, and the transport layer.
A computer network is a set of devices (such as computers) connected to exchange information and share resources such as files. The main goals of computer networks are:
- Resource sharing (such as software sharing, program sharing, etc.)
- High Reliability (if one network link fails another can transfer the data).
- Cost Reduction (we can buy only required services from cloud services like GCloud, AWS, Azure, etc.).
- Communication (acts as a communication medium between sender and receiver).
- Load Sharing (a program may run on multiple machines).
OSI stands for Open Systems Interconnection. The OSI model is also known as the ISO-OSI model because it was developed by ISO (International Organization for Standardization). It is a conceptual reference model that describes the entire flow of information from one computer to another. The OSI model has seven layers, so it is also known as a 7-layered architecture model. The basic idea behind a layered architecture is to reduce design complexity by dividing the design into smaller pieces, which is why most networks are organized as a series of layers. The network layer and the transport layer are two of the seven layers of the OSI model.
Now, let us discuss the transport layer briefly so that we can get a better understanding of the transmission control protocol which is one of the most widely used protocols of the transport layer.
The transport layer is the fourth layer of the OSI model and is responsible for the process-to-process delivery of data. It provides two types of services, namely connection-oriented and connection-less. A connection-oriented service such as the TCP also ensures that the data is received in the same sequence in which it was sent by the sender.
The functions provided by the transport layer are as follows:
- The transport layer maintains the order of data.
- It receives the data from the upper layer and converts it into smaller parts known as segments.
- One of the major tasks of the transport layer is port addressing (adding a port number to the header of the data). The port number is added so that the data is delivered to the correct process on the destination host.
- The transport layer on the receiver's end reassembles the segments to form the actual data.
- The transport layer also deals with flow control and error control.
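The segmentation, port addressing, and reassembly functions listed above can be sketched as follows. This is a minimal illustration, not how a real TCP stack is written: the `Segment` class, the function names, and the tiny segment size of 4 bytes are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    src_port: int   # identifies the sending process
    dst_port: int   # identifies the receiving process
    seq: int        # offset of the first payload byte in the original data
    payload: bytes


def segment_data(data, src_port, dst_port, max_size=4):
    """Split data into segments of at most max_size bytes (hypothetical size)."""
    return [
        Segment(src_port, dst_port, offset, data[offset:offset + max_size])
        for offset in range(0, len(data), max_size)
    ]


def reassemble(segments):
    """Rebuild the original data, whatever order the segments arrived in."""
    ordered = sorted(segments, key=lambda s: s.seq)
    return b"".join(s.payload for s in ordered)


segments = segment_data(b"hello world", src_port=5000, dst_port=80)
print(len(segments))                   # number of segments created
print(reassemble(reversed(segments)))  # original data, even if received in reverse
```

Sorting by sequence number is what lets the receiving side rebuild the data regardless of arrival order.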
Refer to the image below to see the basic transmission of data and working of the transport layer.
The various protocols used in this layer are:
- TCP (Transmission Control Protocol),
- UDP (User Datagram Protocol), etc.
The unit of data at this layer is called a segment. The various devices that commonly operate at this layer are:
- Load Balancers,
- Firewalls, etc.
Let us now learn about the transmission control protocol or TCP.
The transmission control protocol is a transport layer connection-oriented protocol that defines the standard of establishing and maintaining the conversation (or connection) that will be used by the applications to exchange the data. The transmission control protocol is one of the most important and widely used protocols of the IP suite.
Note:
- There are two types of protocols, namely connection-oriented and connection-less. In a connection-oriented protocol, we first need to establish a connection with the receiver before sending our data.
- The Internet Protocol Suite is the standard network model and stack of communication protocols that are used on the Internet.
One of the prime reasons for using the transmission control protocol over the other protocol(s) like UDP is that the TCP ensures the reliable transmission and delivery of our data packets. The transmission control protocol can deal with the various issues that can occur in the data transmission such as packet duplication, packet corruption, packet disordering, packet loss, etc.
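The transmission control protocol runs on top of the Internet Protocol (IPv4 or IPv6), which is why the pair is often written as TCP/IP. ICMP is a separate protocol that IP uses to report errors; it does not carry TCP data. Let us now learn the working of the transmission control protocol.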
Working of TCP
The transmission control protocol divides the data into smaller units known as segments (often loosely called packets) and assigns a sequence number to each of them. It then transmits the segments to the receiver's end. As we have discussed earlier, the transmission control protocol is connection-oriented, so it needs to establish a connection before sending any data.
So, the entire process of establishing a connection, sending data packets, and then removing the connection comes under the working of the transmission control protocol. Let us learn the process in detail with a diagram.
Step 1: Establishing the connection
Whenever two computer systems want to exchange data using the transmission control protocol, they first establish a connection using a three-way handshake. The three-way handshake creates a connection between the host or client and the server. As the name suggests, it is a three-step process. In the first step, the client (which wants to establish a connection) sends a SYN segment (Synchronize Sequence Number segment), which tells the server that the client wants to start the communication and carries the client's initial sequence number. In the second step, the server responds with a SYN-ACK segment (SYN Acknowledgement). The SYN-ACK signifies that the server has received the client's request and carries the server's own initial sequence number. In the third and last step, the client sends an ACK segment to the server, and both sides now have a reliable connection that will be used to transfer the data packets.
The three-way handshake is also known as SYN, SYN-ACK, ACK. Refer to the diagram below for more clarity.
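The sequence and acknowledgment numbers exchanged in the three steps can be traced with a small simulation. This is only a sketch: the function name and the tuple layout are assumptions for illustration, and real initial sequence numbers are chosen by the operating system.

```python
import random

MOD = 2 ** 32  # sequence numbers are 32-bit values and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as (direction, flags, seq, ack)."""
    if client_isn is None:
        client_isn = random.randrange(MOD)
    if server_isn is None:
        server_isn = random.randrange(MOD)
    return [
        # Step 1: the client announces its initial sequence number.
        ("client->server", "SYN", client_isn, None),
        # Step 2: the server announces its own number and acknowledges the SYN.
        ("server->client", "SYN-ACK", server_isn, (client_isn + 1) % MOD),
        # Step 3: the client acknowledges the server's SYN.
        ("client->server", "ACK", (client_isn + 1) % MOD, (server_isn + 1) % MOD),
    ]


for step in three_way_handshake(client_isn=100, server_isn=300):
    print(step)
```

Each side acknowledges the other's initial sequence number plus one, because a SYN consumes one sequence number even though it carries no data.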
Step 2: Sending of data packets
In the second step, the data packets along with their sequence numbers are sent from the first computer system (client). The second computer (server) responds to these packets by sending an acknowledgment or ACK. The acknowledgment number in each ACK increases as more data is received, because it always names the next byte the receiver expects. This acknowledgment number helps to keep track of three things:
- the successfully received packets,
- the lost packets, and
- the packets that were accidentally sent twice.
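A receiver that tracks these three cases can be sketched as below. The `Receiver` class is hypothetical, and it assumes that segments never partially overlap, which a real TCP implementation cannot assume.

```python
class Receiver:
    """Minimal cumulative-ACK receiver (illustrative only)."""

    def __init__(self, initial_seq=0):
        self.next_expected = initial_seq  # next byte we are waiting for
        self.buffer = {}                  # out-of-order segments, keyed by seq
        self.delivered = bytearray()      # data handed to the application

    def receive(self, seq, payload):
        """Process one segment and return the acknowledgment number."""
        # Segments below next_expected are duplicates and are ignored.
        if seq >= self.next_expected and seq not in self.buffer:
            self.buffer[seq] = payload
        # Deliver every segment that is now contiguous.
        while self.next_expected in self.buffer:
            chunk = self.buffer.pop(self.next_expected)
            self.delivered += chunk
            self.next_expected += len(chunk)
        return self.next_expected


r = Receiver()
print(r.receive(0, b"abc"))  # in order: ACK moves forward
print(r.receive(6, b"ghi"))  # gap (bytes 3-5 missing): same ACK is repeated
print(r.receive(0, b"abc"))  # duplicate: ignored, same ACK again
print(r.receive(3, b"def"))  # gap filled: ACK jumps past the buffered data
```

A repeated acknowledgment number tells the sender that something after that point is missing, while a duplicate segment leaves the delivered data unchanged.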
Step 3: Closing the connection
As we have discussed, the client initiates the connection with the server by sending a SYN. In the case of closing the connection, however, either the server or the client can start the process. The first computer system (either the server or the client) initiates the closing of the connection by sending a packet with the FIN bit or finish bit set. The other computer responds with an ACK. Because each direction of a TCP connection is closed separately, the second computer then sends its own FIN once it has no more data to send, and the first computer answers with a final ACK. After this exchange the connection is closed.
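In the TCP specification each direction of the connection is shut down with its own FIN and ACK, so a complete close normally exchanges four segments. The sketch below lists them together with the state each side is in after the segment has been received. The state names come from the standard TCP state machine; the function name and tuple layout are assumptions for illustration.

```python
def close_sequence(initiator="client"):
    """Return (sender, flag, initiator_state, responder_state) for each step."""
    responder = "server" if initiator == "client" else "client"
    return [
        (initiator, "FIN", "FIN_WAIT_1", "CLOSE_WAIT"),  # initiator is done sending
        (responder, "ACK", "FIN_WAIT_2", "CLOSE_WAIT"),  # responder confirms the FIN
        (responder, "FIN", "TIME_WAIT", "LAST_ACK"),     # responder is done sending too
        (initiator, "ACK", "TIME_WAIT", "CLOSED"),       # initiator confirms, then waits
    ]


for step in close_sequence("server"):
    print(step)
```

Between the second and third step the responder may still send data, which is why the two FINs are independent of each other.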
Note: The TCP breaks the data into packets so that the entire message can reach the target location intact. The TCP at the destination end reassembles the packets into the original message or data.
Features of TCP
Let us now discuss the various features of the transmission control protocol.
Connection-oriented
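The transmission control protocol is connection-oriented, hence the data is only transferred after a connection has been established. Note that the connection by itself is not encrypted; security is added by protocols such as TLS that run on top of the TCP.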
Reliable
The transmission control protocol offers reliable data transfer. Using sequence numbers and acknowledgments (ACKs), both computers can keep track of the packets that were sent and the packets that were lost. Once the sender knows which packets were lost, it sends them again to the receiver. In this way, all the data packets get transferred to the receiving end.
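The retransmission idea can be illustrated with a simple simulation in which the sender repeats a packet until it is acknowledged. This is a sketch under strong assumptions: it sends one packet at a time (stop-and-wait), and the `drop_attempts` set that decides which transmissions are lost is a made-up stand-in for an unreliable network.

```python
def send_reliably(segments, drop_attempts):
    """Simulate sending each segment until it is acknowledged.

    drop_attempts is a set of (segment_index, attempt_number) pairs that the
    simulated network loses. Returns a log of (index, attempt, outcome).
    """
    log = []
    for index, _payload in enumerate(segments):
        attempt = 1
        # No ACK arrives for a lost transmission, so the sender tries again.
        while (index, attempt) in drop_attempts:
            log.append((index, attempt, "lost, retransmitting"))
            attempt += 1
        log.append((index, attempt, "acked"))
    return log


for entry in send_reliably([b"a", b"b", b"c"], drop_attempts={(1, 1), (1, 2)}):
    print(entry)
```

Real TCP keeps many segments in flight at once, but the principle is the same: data stays with the sender until an acknowledgment confirms it arrived.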
Flow Control
The transmission control protocol provides the mechanism of flow control. Using the flow control mechanism, the TCP controls and limits the rate of data transfer from the sender's end. The receiving end (receiver) continuously advertises how much data it can currently accept (its window size), and the TCP uses this value to adjust the data transfer rate of the sender.
The data transfer rate or flow control is maintained so that the data packets do not get lost.
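The effect of the receiver's advertised window on the sender can be written as a one-line calculation. The function and parameter names below are assumptions chosen for this sketch.

```python
def bytes_allowed(last_byte_sent, last_byte_acked, advertised_window):
    """How many more bytes the sender may transmit right now."""
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet acknowledged
    return max(0, advertised_window - in_flight)


# 2000 bytes are unacknowledged and the receiver can hold 4096 bytes.
print(bytes_allowed(last_byte_sent=3000, last_byte_acked=1000, advertised_window=4096))
# The window is full, so the sender must wait for an ACK.
print(bytes_allowed(last_byte_sent=5096, last_byte_acked=1000, advertised_window=4096))
```

When the result is zero the sender pauses, which is how a slow receiver avoids being flooded.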
Error Control
The transmission control protocol provides the mechanism of error-checking and error recovery. Using the checksum carried in every segment, the receiver detects corrupted packets and discards them. The TCP does not correct the errors at the receiver's end; instead, the sender notices that the packet was never acknowledged and sends it again (error recovery by retransmission).
Sequencing of packets
The data packets are sent along with a sequence number so that both ends can keep track of them. The receiver sends back an acknowledgment number, which tells the sender how much of the data has been received correctly so far.
Congestion Control
Congestion occurs when more data is sent into the network than it can carry, which leads to delays and packet loss. The transmission control protocol accounts for the level of congestion and adjusts the rate at which it sends packets accordingly.
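The text does not name a specific algorithm, so the sketch below uses one classic scheme as an assumption: slow start followed by additive increase and multiplicative decrease. The congestion window `cwnd` and threshold `ssthresh` are counted in segments, and the update is applied once per round trip. Real implementations differ in many details.

```python
def next_cwnd(cwnd, ssthresh, loss):
    """Return the new (cwnd, ssthresh) after one round trip."""
    if loss:
        # Loss is taken as a sign of congestion: cut the window in half.
        ssthresh = max(cwnd // 2, 2)
        return ssthresh, ssthresh
    if cwnd < ssthresh:
        # Slow start: grow quickly until the threshold is reached.
        return min(cwnd * 2, ssthresh), ssthresh
    # Congestion avoidance: probe for more bandwidth one segment at a time.
    return cwnd + 1, ssthresh


cwnd, ssthresh = 1, 8
for rtt, loss in enumerate([False, False, False, False, False, True, False]):
    cwnd, ssthresh = next_cwnd(cwnd, ssthresh, loss)
    print(rtt, cwnd, ssthresh)
```

The window grows while the network copes and shrinks sharply when a loss suggests that it does not.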
Stream-oriented data transfer
The transmission control protocol creates a virtual circuit or tube in which the data is exchanged in the form of a stream of bytes.
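The byte-stream behaviour can be observed with Python's standard `socket` module on the local machine. In this sketch the client sends several separate messages, and the server simply reads whatever bytes arrive until the connection is closed. The helper name `stream_demo` is an assumption, and the code expects the loopback address 127.0.0.1 to be available.

```python
import socket
import threading


def stream_demo(messages):
    """Send messages over a loopback TCP connection and return the bytes read."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = bytearray()

    def serve():
        conn, _addr = server.accept()
        with conn:
            while True:
                chunk = conn.recv(1024)
                if not chunk:  # an empty read means the peer closed its side
                    break
                received.extend(chunk)

    worker = threading.Thread(target=serve)
    worker.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        for message in messages:
            client.sendall(message)
    worker.join()
    server.close()
    return bytes(received)


print(stream_demo([b"hello ", b"world"]))
```

The server may receive the data in one chunk or several, because the TCP preserves the order of the bytes but not the boundaries between the individual `sendall` calls.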
Full duplex
The transmission control protocol is full-duplex, which means the data can be transferred in both directions simultaneously (at the same time).
TCP Header
Let us now discuss one of the most important sections i.e. the header of the transmission control protocol.
The header of the transmission control protocol is of minimum 20 bytes and a maximum of 60 bytes. Refer to the diagram below to visualize the header and the various header fields of the transmission control protocol.
Let us briefly discuss the various fields of the transmission control protocol:
- Source Port: As the name suggests, the source port number identifies the sending process on the source host. The source port field is 16 bits long.
- Destination Port: As the name suggests, the destination port number identifies the receiving process on the destination host. The destination port field is also 16 bits long.
- Sequence Number: It is a 32-bit field that contains the sequence number of the first data byte carried in the segment.
- Acknowledgment Number: It is a 32-bit field that contains the next sequence number the receiver expects, so it works as an acknowledgment for all the data received before it. Let us take an example to understand the sequence number and acknowledgment number better. Suppose the receiver has received every byte up to sequence number x; it then responds to the sender with the acknowledgment number x+1.
- HLEN: HLEN stands for Header Length. It is a 4-bit value that specifies the length of the header in 32-bit words. The minimum value for HLEN is 5 and the maximum is 15. The length of the header in bytes is calculated by multiplying the HLEN value by 4, so the minimum header length is 5 x 4 = 20 bytes and the maximum header length is 15 x 4 = 60 bytes.
- Reserved: It is reserved for future use and is 6 bits long in the original TCP specification (later revisions reassign two of these bits as the additional flags CWR and ECE).
- Flags: Each flag is a 1-bit value. There are six main flags, which are also known as control bits:
  - URG: URG represents urgent, so, if the URG bit is set to 1 then the urgent pointer field is valid and the segment contains data that should be processed urgently.
  - ACK: ACK denotes acknowledgment, so, if the bit is set to 1 then the acknowledgment number field is valid, and if it is set to 0 then the segment does not contain an acknowledgment.
  - PSH: PSH represents PUSH, so, if the PSH bit is set to 1 then the receiver is requested to pass the data to the application without buffering it.
  - RST: RST stands for reset, so, if the bit is set to 1 then the connection is aborted immediately and has to be established again before more data can be sent.
  - SYN: SYN bit as discussed earlier is used to establish a connection.
  - FIN: FIN represents finish, so, if its value is set to 1 then the sender has no more data to send and the connection is to be closed.
- Window Size: The window size is a 16-bit field that denotes the amount of data that the receiver can accept. As it contains the data size of the receiver, it helps in the flow control mechanism. The window size field is determined by the receiver only.
- Checksum: The checksum is also a 16-bit field. It is computed over the header, the data, and a pseudo-header that contains fields from the IP header such as the source and destination addresses.
- Urgent Pointer: It is a 16-bit field. If the value of URG is set to 1 then the urgent pointer field points to the end of the urgent data in the segment.
- Options and Padding: As the name suggests, the options field contains options that are not in the regular header. The options field can be up to 40 bytes long, which is the difference between the maximum header length (60 bytes) and the minimum header length (20 bytes). Because the header length is counted in 32-bit words, padding (adding extra 0 bits) is used to make the options end on a 32-bit word boundary.
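The fixed 20-byte part of the header described above can be decoded with Python's standard `struct` module. This is a minimal sketch that follows the six-flag layout used in this article and ignores any options; the function name and the keys of the returned dictionary are assumptions.

```python
import struct

# Flag names and their bit masks within the low 6 bits of the flags field.
FLAG_BITS = [("URG", 0x20), ("ACK", 0x10), ("PSH", 0x08),
             ("RST", 0x04), ("SYN", 0x02), ("FIN", 0x01)]


def parse_tcp_header(raw):
    """Decode the first 20 bytes of a TCP header into a dictionary."""
    # "!" selects network (big-endian) byte order; H = 16 bits, I = 32 bits.
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    hlen = offset_flags >> 12    # top 4 bits: header length in 32-bit words
    flags = offset_flags & 0x3F  # low 6 bits: control flags
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_bytes": hlen * 4,
        "flags": [name for name, bit in FLAG_BITS if flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# A hand-built SYN-ACK header with HLEN = 5 (no options).
raw = struct.pack("!HHIIHHHH", 12345, 80, 1000, 501, (5 << 12) | 0x12, 65535, 0, 0)
print(parse_tcp_header(raw))
```

The `header_bytes` value shows the HLEN arithmetic in practice: an HLEN of 5 gives a 20-byte header.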
TCP Segment Structure
As discussed above, the TCP segment or data packet contains two things: the header field and the data field.
The data field contains the actual data that has to be transmitted to the destination. The header is between 20 bytes and 60 bytes long and contains 10 fixed fields, each with a different purpose, followed by the optional options field. Refer to the above section to learn more about the TCP header structure.
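The checksum stored in the header protects both parts of the segment. It uses the 16-bit one's complement sum that is common to the Internet protocols, which can be sketched as follows. Note the simplification: a real TCP checksum is computed over a pseudo-header plus the header and data, while this hypothetical helper only sums the bytes it is given.

```python
def internet_checksum(data):
    """16-bit one's complement checksum of data (bytes)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carry bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)
print(hex(checksum))
# The receiver repeats the sum over data plus checksum; 0 means no error found.
print(internet_checksum(payload + checksum.to_bytes(2, "big")))
```

If any bit of the data changes in transit, the receiver's sum is very likely to be non-zero and the segment is discarded.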
TCP Connection Management
The TCP connection management is a three-step process:
- Connection establishment.
- Data transfer.
- Closing the connection.
Whenever two computer systems want to exchange data using the transmission control protocol, they first establish a connection. In the second step, the data packets along with the sequence number are sent. In the third and final step, the established connection is closed.
How Does TCP Ensure Reliable Data Transfer?
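The TCP uses sequence numbers, acknowledgments (ACKs), checksums, and retransmission timers to ensure reliable data transfer. The SYN, SYN-ACK, and ACK segments of the handshake only set up the connection and the initial sequence numbers. The transmission control protocol also offers an error control and flow control mechanism to ensure that the data is transferred at an adequate rate with little or no loss.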
Let us discuss two major aspects of reliable data transfer i.e.
- how does the TCP handle and detect the lost packets? and
- how does the TCP handle the order of the packets?
Detecting Lost Packets
If the sender receives three duplicate acknowledgments for the same data, or if the retransmission timer expires before an acknowledgment arrives, then the sender assumes that the packet has been lost and sends it again. The sender also treats every lost packet as an indication of network congestion.
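The duplicate-acknowledgment rule can be expressed as a short function that scans the acknowledgment numbers arriving at the sender. The function name and the list-based input are assumptions for this sketch, and the timer-based case is not modelled.

```python
def detect_loss(acks, threshold=3):
    """Return the sequence number presumed lost, or None.

    acks is the list of acknowledgment numbers in arrival order. A loss is
    reported once the same number has been repeated `threshold` times after
    its first appearance.
    """
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                return ack  # the data starting at this number never arrived
        else:
            last_ack = ack
            duplicates = 0
    return None


print(detect_loss([100, 200, 200, 200, 200]))  # 200 repeated three extra times
print(detect_loss([100, 200, 200, 300]))       # progress resumed: no loss reported
```

The receiver keeps repeating the same acknowledgment number because later packets arrived while the expected one did not, which is the signal the sender relies on.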
Handling Out-Of-Order Packets
To handle the order of the packets, the sequence number and the ACK number are used. Because every packet carries a sequence number, the receiver can put packets that arrive out of order back into their original position and combine the data correctly before passing it to the application.
Conclusion
- The transmission control protocol is a transport layer connection-oriented protocol that defines the standard of establishing and maintaining the conversation that will be used by the applications to exchange the data.
- The TCP ensures the reliable transmission and delivery of our data packets.
- The TCP breaks the data into packets so that the entire message can reach the target location intact. The TCP at the destination end reassembles the packets into the original message.
- Whenever two computer systems want to exchange data using the TCP, they first establish a connection, and then the data packets along with the sequence number are sent. Finally, the established connection is closed.
- The transmission control protocol can deal with the various issues that can occur in the data transmission such as packet duplication, packet corruption, packet disordering, packet loss, etc.
- The transmission control protocol runs on top of the Internet Protocol (IPv4 or IPv6).
- The TCP segment or data packet contains two things: Header field and Data Field. The header of the transmission control protocol is of minimum 20 bytes and a maximum of 60 bytes.
- The transmission control protocol also offers an error control and flow control mechanism to ensure that the data is transferred at an adequate rate with less or no loss.