P2P (Peer-to-Peer) File Sharing
Overview
Peer-to-peer (P2P) is a decentralized network architecture, and one of the simplest, in which every computer system can communicate with every other computer system. In a peer-to-peer network, each node has equal permissions and equal responsibility for processing data. Each computer in a peer-to-peer network acts as an independent workstation and maintains its own security. The main focus of the peer-to-peer network model is connectivity among the computer systems.
Introduction
Before learning about P2P file sharing and peer-to-peer networks, let us first briefly review computer networks.
A computer network is a set of devices (computers) connected to exchange information and resources such as files. The main goals of computer networks are:
- Resource sharing (such as software sharing, program sharing, etc.)
- High Reliability (if one network link fails another can transfer the data).
- Cost Reduction (we can buy only required services from cloud services like GCloud, AWS, Azure, etc.).
- Communication (acts as a communication medium between sender and receiver).
- Load Sharing (a program may run on multiple machines).
There are mainly two types of computer network architecture:
- Peer-to-peer (P2P) network architecture.
- Client-server network architecture.
Peer-to-peer is a decentralized network architecture, and one of the simplest, in which every computer system (node) can communicate with every other computer system (node). A peer-to-peer network does not use a centralized server, because every computer system can communicate with every other computer system directly.
In a peer-to-peer network, each node has equal permissions and equal responsibility for processing data. Each computer acts as an independent workstation and maintains its own security. Every computer system stores data on its own disk and can share that data with the other computer systems in the network. So, in a peer-to-peer network, each computer system can act as both a server and a client: it can request services and can also provide them. The main focus of the peer-to-peer network model is connectivity among the computer systems.
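The dual client/server role described above can be illustrated with a small in-memory sketch. This is only a toy model, not a real network implementation: the `Peer` class, its `serve` and `request` methods, and the file names are all hypothetical names chosen for this example.

```python
class Peer:
    """A node that can both serve files and request them (toy model)."""

    def __init__(self, name):
        self.name = name
        self.files = {}  # filename -> content stored on this peer's own disk

    def serve(self, filename):
        """Server role: return the file content, or None if it is not stored here."""
        return self.files.get(filename)

    def request(self, other, filename):
        """Client role: ask another peer directly and keep a local copy on success."""
        content = other.serve(filename)
        if content is not None:
            self.files[filename] = content
        return content


alice = Peer("alice")
bob = Peer("bob")
alice.files["notes.txt"] = "hello"
bob.files["song.mp3"] = "la la"

# Here alice acts as the server and bob as the client.
got_notes = bob.request(alice, "notes.txt")

# The roles swap: bob is now the server and alice the client.
got_song = alice.request(bob, "song.mp3")
```

No central machine is involved in either transfer; each peer simply answers from its own storage.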
:::{.tip} Note: A Client is a computer system that accesses the services provided by a server. On the other hand, a server is a powerful centralized hub that stores various information and handles the requests of the client(s).
Refer to the image below to see the basic overview of a peer-to-peer network system architecture.
In a small-scale network (one with only a few computer systems), communication is easier to manage because the computer systems are directly interconnected. Hence, a peer-to-peer network can easily be set up at home. The peer-to-peer network architecture is also preferred by small-scale businesses that have no need for a centralized server.
:::{.tip} Note: The early internet was based on the peer-to-peer network architecture, so all computer systems were given equal privileges and most interactions were bi-directional.
As discussed earlier, the peer-to-peer network model is suitable for a smaller set of computer systems.
Some of the areas where the peer-to-peer network model is used:
- Napster (1999), a music file-sharing service, used the peer-to-peer network model to let users upload, download, and exchange music files.
- Windows 10 updates can also be transferred using the peer-to-peer network model alongside Microsoft's servers.
- Some online gaming platforms also use the peer-to-peer network model.
- Games like StarCraft II, Diablo III, and World of Warcraft are distributed using the peer-to-peer network model.
- The peer-to-peer network architecture is also used in the field of blockchain.
How Does P2P Work?
Let us now learn how a P2P network works. A P2P network does not need a central server; it allows the nodes to communicate with one another directly. When two devices are communicating, one device acts as the client and the other acts as the server, and the roles can be reversed.
Whenever a peer requests a file, copies of that file may be held by multiple peers connected to the network. This raises a problem: the requesting peer needs a way to find the IP addresses of the peers that hold a copy. The architecture of the P2P system solves this problem, and there are three types of P2P system architecture (covered in detail in the next section):
- Centralized Directory
- Query Flooding
- Exploiting Heterogeneity
With the help of these strategies, each peer learns the address of the destination peer, and the file transfer, data transfer, and other communication then take place directly between the two peers.
Different P2P Architectures
Let us now learn about the various P2P architectures in detail and see how they solve the problem of peer identification.
1. Centralized Directory
The centralized directory architecture is very similar to the client-server architecture, because it relies on a large central server that provides directory services to the peers.
In this network, every peer reports its IP address to the central server, along with the files it has available for sharing. The central server queries the peers at regular intervals to check whether they are still connected. In simpler terms, the central server maintains a large database of all the files present in the network along with the IP addresses of the connected peers.
Let us now briefly discuss how the centralized directory works. Whenever a peer wants a file, it first sends a query to the central server. Because the central server holds all the information about the peers, it returns the IP addresses of the peers that have the file. The requesting peer can then transfer the file directly from one of those peers.
The centralized directory approach was used by Napster for sharing MP3 files.
Please refer to the image provided below for more clarity.
One of the major issues with this approach is that the directory is centralized rather than distributed, so the central server is a single point of failure and a potential bottleneck. The peers also have to stay registered with the central server, so whenever a peer disconnects the server has to be informed and its directory has to be updated.
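A minimal sketch of the directory service described above follows. It assumes an in-memory index and omits all networking; the `CentralDirectory` class, its method names, and the IP addresses and file names are hypothetical and chosen only for illustration.

```python
class CentralDirectory:
    """Toy central server: maps each shared file to the peers that hold it."""

    def __init__(self):
        self.index = {}  # filename -> set of peer IP addresses

    def register(self, ip, filenames):
        """A peer announces its IP address and the files it shares."""
        for name in filenames:
            self.index.setdefault(name, set()).add(ip)

    def unregister(self, ip):
        """Remove a disconnected peer from every entry in the directory."""
        for name in list(self.index):
            self.index[name].discard(ip)
            if not self.index[name]:
                del self.index[name]

    def lookup(self, filename):
        """Return the IP addresses of peers that hold the file."""
        return sorted(self.index.get(filename, set()))


directory = CentralDirectory()
directory.register("10.0.0.1", ["a.mp3", "b.mp3"])
directory.register("10.0.0.2", ["b.mp3"])

holders_before = directory.lookup("b.mp3")   # both peers hold b.mp3
directory.unregister("10.0.0.1")             # peer 10.0.0.1 disconnects
holders_after = directory.lookup("b.mp3")    # only 10.0.0.2 remains
```

The directory only answers "who has this file?"; the file itself would still be transferred directly between the two peers. Every lookup goes through the one `directory` object, which is why its failure would stop all searches.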
2. Query Flooding
The Query Flooding approach differs from the centralized approach in that it does not use a central server; it is fully distributed. In this system, connections exist only between the peers themselves, not between a central system and a peer. The peers are known as nodes, and a connection between two nodes is known as an edge. Together, the nodes and edges form a graph.
Let us now briefly discuss how the query flooding network system works and how it solves the problem of peer identification.
Whenever a peer wants a file, it sends the request to all the neighboring peers (nodes) connected to it. If a neighbor does not have the requested file, it forwards the request to its own neighbors. The request query spreads through the neighbors like a flood, which is why this system is known as the Query Flooding network system.
When a peer that has the requested file is reached, the flood of the query stops, and the match is reported back to the requesting peer along the reverse path of the request so that the file can be transferred.
The first query flooding or decentralized peer-to-peer network was Gnutella.
Please refer to the image provided below for more clarity.
This method also has one disadvantage: the query has to be forwarded to neighbor after neighbor until a match is found, so this network system generates increased network traffic.
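The flooding behaviour and its traffic cost can be sketched as a breadth-first search over the peer graph. This is a simplified simulation, not a protocol implementation: the function name `flood_query`, the example graph, and the `ttl` hop limit are assumptions introduced for this example (the text above does not describe a hop limit), and the simulation stops at the first match rather than modelling queries still in flight.

```python
from collections import deque


def flood_query(graph, files, start, filename, ttl=3):
    """Flood a query from `start`; return (path to first holder, messages sent)."""
    visited = {start}
    queue = deque([(start, [start])])
    messages = 0
    while queue:
        node, path = queue.popleft()
        if node != start and filename in files.get(node, set()):
            return path, messages           # match found, flood stops
        if len(path) - 1 >= ttl:
            continue                        # hop limit reached, do not forward
        for neighbor in graph.get(node, []):
            messages += 1                   # every forwarded copy costs a message
            if neighbor not in visited:     # duplicates are received but dropped
                visited.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None, messages


# Hypothetical overlay: nodes are peers, list entries are edges.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"E": {"song.mp3"}}

path, messages = flood_query(graph, files, "A", "song.mp3")
reverse_path = list(reversed(path))  # route the response takes back to A
```

Even in this five-node graph, several messages are sent for a single lookup, because neighbors forward the query without knowing who holds the file.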
3. Exploiting Heterogeneity
Exploiting Heterogeneity combines the advantages of both of the systems discussed above. It behaves like a distributed system, with no central server, but it does not treat all connected peers equally: peers with higher bandwidth and better network connectivity are given higher priority. These higher-priority peers are treated as group leaders, or super nodes.
The other, lower-priority peers are assigned to the super nodes. The super nodes are interconnected, and each ordinary peer informs its group leader about its connectivity, its IP address, and the files it has available for sharing.
Let us now briefly discuss how the Exploiting Heterogeneity network system works and how it solves the problem of peer identification.
In this system, queries are processed in two ways.
- In the first, super nodes contact other super nodes and merge their databases, so that a single super node has information about a large number of peers.
- In the second, a query is forwarded from super node to connected super node until a match is found. This approach still involves some query flooding, but less than the previous approach because the number of super nodes is limited.
Please refer to the image provided below for more clarity.
KaZaA technology uses the Exploiting Heterogeneity approach of the peer-to-peer network.
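The second query-processing path above (forwarding a query between super nodes) can be sketched as follows. This is an illustrative model only, not KaZaA's actual protocol: the `SuperNode` class, its methods, and the addresses are hypothetical, and networking is replaced by direct method calls.

```python
class SuperNode:
    """Toy group leader: indexes the files of the ordinary peers assigned to it."""

    def __init__(self, name):
        self.name = name
        self.index = {}      # filename -> set of IPs of peers in this group
        self.neighbors = []  # other super nodes this one is connected to

    def register(self, ip, filenames):
        """An ordinary peer reports its IP address and shared files to its leader."""
        for fname in filenames:
            self.index.setdefault(fname, set()).add(ip)

    def lookup(self, filename, _seen=None):
        """Answer from the local index, else forward to unvisited super nodes."""
        seen = _seen if _seen is not None else set()
        seen.add(self.name)
        if filename in self.index:
            return sorted(self.index[filename])
        for node in self.neighbors:
            if node.name not in seen:
                result = node.lookup(filename, seen)
                if result:
                    return result
        return []


sn1, sn2, sn3 = SuperNode("sn1"), SuperNode("sn2"), SuperNode("sn3")
sn1.neighbors = [sn2]
sn2.neighbors = [sn1, sn3]
sn3.neighbors = [sn2]

sn1.register("10.0.1.7", ["notes.txt"])
sn3.register("10.0.2.5", ["movie.avi"])

local_hit = sn1.lookup("notes.txt")    # answered from sn1's own index
remote_hit = sn1.lookup("movie.avi")   # forwarded sn1 -> sn2 -> sn3
```

Only the super nodes take part in the search, so the query touches far fewer machines than a flood across every peer would.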
Risks of P2P
Let us now discuss some of the risks or limitations of the peer-to-peer network.
- The peer-to-peer network architecture is not very secure: data and other shared resources can be easily discovered and used by unauthorized users.
- Since each computer system acts as an independent server and client, the user of each computer system must be trained to perform administrative tasks.
- As there is no centralized server in the network, there cannot be a central backup of files and folders.
- Each computer system needs its own anti-virus scanner and backup schedule, so there is an extra software cost for each computer system.
- The peer-to-peer network model is mainly suitable for small-scale businesses and homes.
- The P2P network model becomes less stable as the number of peers (computer systems) increases.
Client-Server Model vs. P2P Model
Before getting into the difference between the client-server model and the p2p model, let us now briefly learn about the client-server model.
A client is a computer system that accesses the services provided by a server. On the other hand, a server is a powerful centralized hub that stores various information and handles the requests of the client(s).
The client-server network model is one of the most widely used networking models. In a client-server network, files are not stored on the hard drive of each computer system. Instead, they are centrally stored and backed up on a specialized computer known as a server, which is designed to provide data efficiently to remote clients. A large-scale network can have more than one server. Let us discuss the various types of servers:
- File Server: A file server is used to transfer files to the client(s).
- Email Server: An email server is used to deal with the internal email system.
- Web Server: A web server stores web pages and delivers them to the client(s), such as web browsers, on request.
- Print Server: A print server is used to deal with all of the printing requests from the client(s).
In a client-server network, there is a specific server and specific clients connected to the server. Refer to the image below to see the basic overview of a client-server network system architecture.
The central computer system or the server is used to provide communication and resource sharing between other computer systems present on the network which are known as clients. A client does not share any of its resources, but it requests data or services from a server. The main focus of the client-server network model is on data sharing.
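The request/response relationship described above can be sketched with a toy file server and two clients. The `FileServer` and `Client` classes, the `online` flag, and the file name are hypothetical names introduced for this illustration; real servers communicate over a network rather than through direct method calls.

```python
class FileServer:
    """Toy central server: the only place where files are stored."""

    def __init__(self):
        self.files = {}     # filename -> content, stored centrally
        self.online = True  # assumption: models the server being switched on/off

    def handle(self, filename):
        """Serve a client request, or fail if the server is switched off."""
        if not self.online:
            raise ConnectionError("server is offline")
        return self.files.get(filename)


class Client:
    """Toy client: stores nothing itself and only requests data from the server."""

    def __init__(self, server):
        self.server = server

    def fetch(self, filename):
        return self.server.handle(filename)


server = FileServer()
server.files["report.pdf"] = "quarterly numbers"
client_a, client_b = Client(server), Client(server)

first = client_a.fetch("report.pdf")
second = client_b.fetch("report.pdf")   # both clients depend on the same server

server.online = False                   # the server goes down
try:
    client_a.fetch("report.pdf")
    outage_seen = False
except ConnectionError:
    outage_seen = True                  # no client can reach the file any more
```

Unlike the peer-to-peer case, the clients never exchange data with each other, so an outage of the single server makes the file unavailable to everyone.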
:::{.tip} Note:
- The client-server model is also known as the network computing model, as all the services and requests are delivered over the network.
- A system administrator is responsible for managing the data present on the server.
A server is normally kept ON at all times, so client machines can access files and resources without depending on whether any other client machine is ON. One of the major drawbacks of the client-server model is that if the server is turned OFF (for any reason), the resources present on the server are not available to the clients.
Now that we have discussed the peer-to-peer network model and the client-server network model, let us look at the differences between them.
Peer-to-peer Model | Client-server Model |
---|---|
The peer-to-peer network model is distributed and decentralized in nature. | The client-server network model is also distributed in nature but it is centralized. |
The main focus of the peer-to-peer network model is on the connectivity among the computer systems. | The main focus of the client-server network model is on data sharing. |
In the peer-to-peer network model, each computer system can act as a client and a server. | In the client-server network model, there is a centralized server. |
In the P2P network model, each computer system stores individual files and data. | In the client-server network model, there is a central backup of the files and data. |
It is more reliable than the client-server model as in case of failure of one system, the entire network does not get affected. | The client-server network model entirely depends upon the central server, so in case of server failure, the entire network gets affected. |
The P2P network model is more reliable. | The client-server model is more robust. |
In the P2P network model, the response time is low as the computer systems are directly connected. | In the client-server model, the access time may be slow if there are several requests made on the server. |
The P2P network model is cheaper as there is no need to implement the centralized server. | The client-server network model is costlier as there is a need for implementation of the centralized server. |
The P2P network model is less secure as there is no need for authentication before communication. | The client-server network model is more secure as every device needs authentication before communication. |
The P2P network model becomes less stable as the number of peers (computer systems) increases. | The client-server network model is more stable than the P2P network model. |
The P2P network model is suitable for small-scale businesses and houses. | The client-server network model is suitable for both small-scale businesses and large networks. |
Areas where the P2P network model is used: Napster (music file sharing), the rollout of Windows 10 updates, and the distribution of games (such as StarCraft II, Diablo III, and World of Warcraft). | Areas where the client-server network model is used: web browsers requesting webpages, database servers returning query results, mail servers, etc. |
:::
Conclusion
- Peer-to-peer is a decentralized network architecture, and one of the simplest, in which every computer system can communicate with every other computer system.
- The main focus of the peer-to-peer network model is on the connectivity among the computer systems.
- In the peer-to-peer network architecture, there is no use of a centralized server as every computer system can communicate with every other computer system directly.
- In a P2P network, when two devices are communicating, one device acts as the client and the other acts as the server, and the roles can be reversed.
- The centralized directory network architecture is similar to the client-server architecture, as it has a large central server that provides directory services to the peers.
- In the Query Flooding approach, the connection is not between the central system and the peer but the connection is actually between the peers.
- Exploiting Heterogeneity system behaves like a distributed system but there is no central server.
- The Exploiting Heterogeneity system does not treat all the connected peers equally but the peers having higher bandwidth and network connectivity are provided higher priority.
- The peer-to-peer network architecture is not very secure: data and other shared resources can be easily discovered and used by unauthorized users.
- Since each computer system acts as an independent server and client, the user of each computer system must be trained to perform administrative tasks.