Create an NFS in AWS
Overview
A file is data stored on a computer or server. In the 1980s, engineers looked for a way to share files between machines quickly and reliably, and the result was the Network File System (NFS). The main purpose of NFS is to enable file sharing between computers connected over a network. Over the years, NFS has played an important role in the transformation of the storage industry. Many corporate networks, from startups to large enterprises, rely on the NFS protocol to give authorized users access to shared files.
What is the Network File System (NFS)?
There are many file system protocols available in the computing world. The Network File System is an open standard protocol that is widely used across industries to share files over a network. The files themselves live on the server's storage devices, such as hard disks, solid-state disks, and tape drives. NFS was developed in 1984 by Sun Microsystems. Like many other network protocols, NFS is built on remote procedure calls (RPC).
Before going further, it helps to understand remote procedure calls and distributed file systems.
What is a Remote Procedure Call?
In layman's terms, a remote procedure call (RPC) lets a program on one machine (the client) invoke a procedure on another machine (the server) as if it were a local call. The client and server are two different devices, on the same network or on different networks.
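To make the idea concrete, here is a minimal sketch of a remote procedure call using Python's standard `xmlrpc` modules. Note the assumptions: NFS itself uses ONC RPC rather than XML-RPC, and the `FILES` store and the `read_file` procedure are hypothetical names chosen for illustration only.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical in-memory "file store" held by the server.
FILES = {"notes.txt": "hello from the server"}


def read_file(name):
    """Procedure that runs on the server on behalf of a remote client."""
    return FILES.get(name, "")


def start_server():
    # Port 0 asks the OS for any free port; logRequests=False keeps output quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(read_file)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_read(port, name):
    # The client calls read_file like a local function;
    # the proxy turns the call into a network request.
    with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
        return proxy.read_file(name)


if __name__ == "__main__":
    server = start_server()
    port = server.server_address[1]
    print(remote_read(port, "notes.txt"))
    server.shutdown()
    server.server_close()
```

The client never touches `FILES` directly. It only sees the result of the call, which is the same division of labor an NFS client and server follow.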
What is a Distributed File System?
Using a common file system protocol, distributed file systems let people share files or data across computers and networks, regardless of location.
- Back to NFS: its main purpose is to let devices share files over a network.
- With an NFS client, users can remotely view, store, and edit files as if they were on their local system.
- The NFS client and server communicate through remote procedure calls, so an interruption in the connection can stall in-flight operations and, with some mount options (for example, soft mounts), may result in data loss.
- NFS is a widely adopted protocol standardized by the Internet Engineering Task Force (IETF), with mature support for sharing, caching, and security.
- Several versions of the NFS protocol exist; the most common is version 3, often written NFSv3.
- Newer versions include NFSv4, NFSv4.1, and NFSv4.2.
How to Create an NFS File Share in AWS?
Step 1: Log in to the AWS Management Console, enter "storage gateway" in the search bar, and select Storage Gateway from the drop-down menu.
Step 2: In the AWS Storage Gateway console, select the file share option in the left navigation panel.
Step 3: Click the "Create File Share" button from the console.
Step 4: Select an existing gateway and fill in the remaining settings.
Gateway: Choose an existing Amazon S3 File Gateway from the drop-down list.
File Share Type: NFS
S3 bucket: Choose an existing bucket, or create a new one by clicking the "Create new S3 bucket" button next to the option.
The default configuration includes the following:
- S3 Standard storage class.
- The file share connects directly to the S3 bucket, without going through a VPC. (Note: You can't edit this setting after you finish creating your file share.)
- Access for any NFS client (no access control restrictions).
- IAM role created by Storage Gateway.
Step 5: Select the "Create File Share" button to create the file share.
Step 6: The NFS file share is now created in the AWS environment, backed by the storage gateway and the S3 bucket.
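The console steps above can also be scripted. The sketch below only builds the keyword arguments for the Storage Gateway `CreateNFSFileShare` API, which boto3 exposes as `create_nfs_file_share`. It is a sketch under assumptions: the ARNs, bucket name, and CIDR range are placeholders, and the function name `nfs_file_share_request` is hypothetical. Unlike the console, the API expects you to pass an existing IAM role.

```python
import uuid


def nfs_file_share_request(gateway_arn, role_arn, bucket_name, client_cidrs=None):
    """Build keyword arguments for Storage Gateway's CreateNFSFileShare call."""
    request = {
        "ClientToken": str(uuid.uuid4()),  # idempotency token for the request
        "GatewayARN": gateway_arn,         # the existing S3 File Gateway
        "Role": role_arn,                  # IAM role the gateway assumes to reach S3
        "LocationARN": f"arn:aws:s3:::{bucket_name}",
        "DefaultStorageClass": "S3_STANDARD",
    }
    if client_cidrs:
        # Optionally restrict which NFS clients may connect.
        request["ClientList"] = list(client_cidrs)
    return request


if __name__ == "__main__":
    # Placeholder identifiers; replace with values from your own account.
    request = nfs_file_share_request(
        "arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-12A3456B",
        "arn:aws:iam::111122223333:role/ExampleGatewayRole",
        "example-bucket",
        ["10.0.0.0/24"],
    )
    print(request)
    # With boto3 installed and credentials configured, the call would be:
    #   import boto3
    #   boto3.client("storagegateway").create_nfs_file_share(**request)
```

Keeping the request construction separate from the API call makes it possible to review the settings before anything is created.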
To test the file share, mount it from a client. The exact mount command depends on the client operating system (Linux, Microsoft Windows, or macOS).
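As an illustration, the helper below assembles a mount command for each operating system. The command templates are modeled on the ones in the AWS Storage Gateway documentation and should be checked against the current version of that page before use. The function name `gateway_mount_command`, the IP address, the bucket name, and the mount targets are all placeholders.

```python
def gateway_mount_command(os_name, gateway_ip, share_path, target):
    """Return a mount command string for a Storage Gateway NFS file share."""
    remote = f"{gateway_ip}:/{share_path}"
    templates = {
        "linux": f"sudo mount -t nfs -o nolock,hard {remote} {target}",
        "macos": f"sudo mount_nfs -o vers=3,nolock,rwsize=65536,hard -v {remote} {target}",
        "windows": f"mount -o nolock -o mtype=hard {remote} {target}",
    }
    key = os_name.lower()
    if key not in templates:
        raise ValueError(f"unsupported operating system: {os_name}")
    return templates[key]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; use your gateway's IP.
    for name, target in [("Linux", "/mnt/share"), ("macOS", "/Volumes/share"), ("Windows", "Z:")]:
        print(gateway_mount_command(name, "192.0.2.10", "example-bucket", target))
```

On Linux and macOS the target is an existing directory; on Windows it is a drive letter.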
Reference:
How to mount an NFS file share on the client:
Mount your AWS NFS file share on your client - AWS Storage Gateway (amazon.com)
NFS Configuration in the AWS Environment
Customers have three options for configuring NFS in the AWS environment:
- Create a shared NFS location on one EC2 instance and allow other clients to access that instance.
- Build an NFS client-server architecture by using the storage gateway and S3.
- Use Amazon Elastic File System (EFS), the highly available file-sharing service managed by AWS. Customers can launch multiple EC2 instances and attach the same EFS to all of them.
The walkthrough that follows uses the third option.
Requirements: one Elastic File System and two EC2 instances.
Create an Elastic File System
Step 1: Go to the AWS EFS dashboard and select the "Create file system" button.
Step 2: Enter the following settings to create the EFS.
Name: Demo-EFS
VPC: Default
Storage Class: Standard
Step 3: The EFS is now created, with three private IPs (mount targets) in three Availability Zones.
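For readers who prefer the API, the same file system can be described as a request to the EFS `CreateFileSystem` call, which boto3 exposes as `create_file_system`. This is a sketch under assumptions: `efs_create_request` is a hypothetical helper, and the Availability Zone name is a placeholder.

```python
def efs_create_request(name, one_zone_az=None):
    """Build keyword arguments for the EFS CreateFileSystem call."""
    request = {
        "CreationToken": name,                     # makes the request idempotent
        "PerformanceMode": "generalPurpose",
        "Tags": [{"Key": "Name", "Value": name}],  # shown as the name in the console
    }
    if one_zone_az:
        # Setting an Availability Zone name requests a One Zone file system.
        request["AvailabilityZoneName"] = one_zone_az
    return request


if __name__ == "__main__":
    print(efs_create_request("Demo-EFS"))
    # With boto3 installed and credentials configured, the call would be:
    #   import boto3
    #   boto3.client("efs").create_file_system(**efs_create_request("Demo-EFS"))
    # Mount targets are then created per subnet with create_mount_target.
```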
Create an EC2 Instance
Step 1: Log in to the AWS Console and select the "Launch an EC2 instance" button.
Step 2: Enter the following details and launch the instances.
Name: NFS-EC2.
Number of instances: 2.
AMI: Amazon Linux 2.
VPC: Default VPC or custom VPC.
Subnet: Public Subnet.
Security Group: Default Inbound and outbound.
Mount the EFS in EC2 Instances
Step 1: Mount the file system with the NFS client, using either the EFS DNS name or the IP address of a mount target.
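The two variants differ only in the host part of the command. The sketch below builds both, using the mount options AWS recommends for EFS. The file system ID, Region, and IP address are placeholders, and the helper names are hypothetical. The mount point directory (here `efs`) must already exist on the instance.

```python
# Mount options recommended by AWS for Amazon EFS.
EFS_OPTIONS = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"


def efs_dns_name(file_system_id, region):
    """DNS name of an EFS file system, e.g. fs-12345678.efs.us-east-1.amazonaws.com."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


def efs_mount_command(host, mount_dir="efs"):
    """Build the mount command; host is either the DNS name or a mount target IP."""
    return f"sudo mount -t nfs4 -o {EFS_OPTIONS} {host}:/ {mount_dir}"


if __name__ == "__main__":
    # Mount via DNS (placeholder file system ID and Region).
    print(efs_mount_command(efs_dns_name("fs-12345678", "us-east-1")))
    # Mount via IP (placeholder mount target address).
    print(efs_mount_command("10.0.1.25"))
```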
Step 2: Connect the EFS to the EC2 instance. Once logged in to instance 1, create a directory named efs using the mkdir command; this directory is the mount point.
Step 3: Create an empty file named efs-demo-file in the efs directory on instance 1, then exit.
Step 4: Log in to instance 2, mount the EFS in the same way, and list the contents of the efs directory to see the file available via EFS.
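The check in Steps 3 and 4 amounts to creating a file through one mount and listing it through another. A minimal sketch of that logic follows. It uses a temporary local directory as a stand-in for the EFS mount point, and the function names are hypothetical; on real instances, `mount_dir` would be the efs directory on each machine.

```python
import tempfile
from pathlib import Path


def create_demo_file(mount_dir, name="efs-demo-file"):
    """Create an empty file in the mount directory (what instance 1 does)."""
    path = Path(mount_dir) / name
    path.touch()
    return path


def list_files(mount_dir):
    """Return the file names visible in the mount directory (what instance 2 does)."""
    return sorted(p.name for p in Path(mount_dir).iterdir())


if __name__ == "__main__":
    # A temporary directory stands in for the shared EFS mount point.
    with tempfile.TemporaryDirectory() as mount_dir:
        create_demo_file(mount_dir)
        print(list_files(mount_dir))
```

If the file created on instance 1 does not appear in the listing on instance 2, the two instances are not looking at the same file system.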
Troubleshooting the Connection
- Check whether NFS traffic (TCP port 2049) is allowed between the security groups of the EC2 instances and the EFS.
- Check whether the EC2 instances and the Elastic File System are in the same VPC, and whether the file system has a mount target the instances can reach.
- Check on the EC2 instance whether there are any port restrictions at the firewall level.
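A quick way to test the first and third points from an instance is to try opening a TCP connection to the NFS port. The sketch below does this with the standard `socket` module. The function name `nfs_port_open` is hypothetical and the host in the example is a placeholder. A successful connection shows only that the port is reachable, not that the mount itself will succeed.

```python
import socket


def nfs_port_open(host, port=2049, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder DNS name; use your file system's DNS name or mount target IP.
    host = "fs-12345678.efs.us-east-1.amazonaws.com"
    print(f"{host}:2049 reachable: {nfs_port_open(host)}")
```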
Important Considerations While Configuring NFS Over EFS
Supported Operating Systems:
- Amazon EFS is designed for Linux clients and can be used with the Linux Amazon Machine Images (AMIs) available in the AWS Marketplace. For Windows-based EC2 instances, customers use Amazon FSx for Windows File Server, the managed file share service for the Windows operating system.
Portability and Flexibility:
- Amazon Elastic File System can be mounted on on-premises servers. For that to work properly, customers need a reliable and redundant VPN or Direct Connect connection from their on-premises network to AWS.
AWS Multi-AZ Features:
- AWS Regions have a minimum of three Availability Zones. EC2 instances deployed in different Availability Zones can therefore access the file share data through a mount target in their own zone if the Elastic File System is provisioned in Standard mode.
Note: EFS offers Standard and One Zone file systems. Standard stores the data redundantly across multiple Availability Zones in the Region, whereas One Zone stores it in a single Availability Zone.
NFS Performance during EFS Backup:
- EFS is a highly available, petabyte-scale, serverless file storage service. Because one file system can be attached to multiple EC2 instances, they can all access its data simultaneously. If a customer initiates an EFS backup, the EC2 instances attached to that file system may see reduced performance while the backup operation consumes throughput, so schedule backups for off-peak hours where possible.
Cost-effective:
- Amazon EFS can be a cost-effective option for shared file storage: you pay only for the storage you use, and one Elastic File System can be attached to any number of instances, from one to thousands, instead of provisioning a separate Elastic Block Store (EBS) volume for each instance. Compare the current per-GB prices of EFS and EBS for your Region before deciding.
Other recommendations by AWS while using EFS are mentioned below.
- AWS recommends that customers mount the file system through the DNS option. By doing so, the DNS name resolves to the mount target in the same Availability Zone as the EC2 instance.
- Mounting through a mount target in a different Availability Zone incurs additional costs for data transfer between Availability Zones.
- Recommended mount settings:

| Attribute | Value |
|---|---|
| rsize | 1048576 |
| wsize | 1048576 |
| timeo | 600 |
| retrans | 2 |
- Amazon EFS does not support the following:
  - The "nconnect" mount option.
  - The Kerberos security variants (krb5, krb5i, krb5p).
- Changing the default read or write buffer size (rsize, wsize) will result in reduced performance.
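These recommendations can be checked mechanically. The sketch below parses a comma-separated NFS option string and reports any value that differs from the recommended settings, along with the options EFS does not support. The function names and message wording are hypothetical, and the check covers only the points listed above.

```python
# Recommended values and unsupported options, taken from the list above.
RECOMMENDED = {"rsize": "1048576", "wsize": "1048576", "timeo": "600", "retrans": "2"}
UNSUPPORTED_KEYS = ("nconnect",)
KERBEROS_FLAVORS = ("krb5", "krb5i", "krb5p")


def parse_options(option_string):
    """Split 'a=1,b,c=2' into {'a': '1', 'b': '', 'c': '2'}."""
    options = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        options[key] = value
    return options


def check_mount_options(option_string):
    """Return a list of human-readable problems; an empty list means no findings."""
    options = parse_options(option_string)
    problems = []
    for key, expected in RECOMMENDED.items():
        actual = options.get(key)
        if actual is not None and actual != expected:
            problems.append(f"{key}={actual} differs from recommended {expected}")
    for key in UNSUPPORTED_KEYS:
        if key in options:
            problems.append(f"{key} is not supported by Amazon EFS")
    if options.get("sec") in KERBEROS_FLAVORS:
        problems.append(f"sec={options['sec']} (Kerberos) is not supported by Amazon EFS")
    return problems


if __name__ == "__main__":
    for problem in check_mount_options("rsize=65536,nconnect=4,sec=krb5"):
        print(problem)
```

Options that are absent from the string are not flagged, since the NFS client then falls back to its defaults.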
NFS File Share Use Cases
Data-Intensive Application:
- Data-intensive applications running on distributed systems often use NFS to improve data read and write performance.
- Cache and throughput performance can be tailored to the capacity of the storage device.
Modern DevOps Uses:
- A DevOps pipeline needs a secure code repository, and reliable file management is critical to the teams that depend on it.
- Code can be stored in a secure location and shared over a secured network file share, so that only those who have access to that network can read the files or code.
Content Management System:
- NFS-based file-sharing devices provide persistent storage and help customers increase performance at a lower cost.
- OTT platforms such as Netflix, Prime Video, and HBO Max run large content management workloads, the kind of distributed application architecture in which NFS is commonly one of the protocols used.
Data Science:
- NFS plays an important role in the back end of machine learning and big data analytic workloads.
- Data science workloads such as mathematical model generation and quantum network simulation require fast and persistent connections across devices. NFS file sharing lets them access and process the data faster.
Conclusion
- NFS is a protocol often used in distributed file systems.
- NFSv3 is the version most commonly used by companies.
- The NFS client allows customers to connect to and access files on a remote server via remote procedure calls (RPC).
- AWS offers three ways to run NFS: an NFS server on an EC2 instance, an NFS file share through the storage gateway and S3, and the Elastic File System.
- The Elastic File System is the serverless NFS solution in AWS, which means AWS manages the availability and durability of the data and files.
- The NFS protocol plays an important role in various fields of cloud computing, such as data science and applied machine learning.