AWS Data Exchange
Overview
Data analysis starts with data, but gathering data that fits the needs of a particular domain is rarely easy. AWS Data Exchange is an Amazon Web Services offering that makes it easy for AWS customers to find and use third-party data in the AWS Cloud. Data sets obtained through AWS Data Exchange can then support data-driven decisions.
About AWS Data Exchange
AWS Data Exchange is a service that helps AWS users easily find, subscribe to, and use third-party data in the AWS Cloud.
- It provides access to data securely.
- It makes third-party data products simple to consume.
- For a subscriber, it provides many products from different data providers.
- The Data Exchange console helps create, manage, and access the data sets across different AWS analytics services.
- For providers, it offers a secure and reliable channel to reach AWS customers. It also helps providers grant subscriptions to their existing customers more efficiently.
AWS Data Exchange uses a few key terms that are worth knowing before working with it.
- Asset - It is a data object, i.e., a single file. An ID uniquely identifies every asset.
- Revision - When one or more assets are grouped, a revision is created. It is the point-in-time view/update of any data set. An ID uniquely identifies each revision.
- Data Set - When one or more revisions are grouped, a data set is created. It is the logical grouping of data for AWS users. An ID uniquely identifies each data set.
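The containment between these three terms can be pictured with a small data model. This is only an illustrative sketch: the class names, field names, IDs, and file names below are hypothetical and are not the service's own API objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    """A single piece of data, for example one file."""
    asset_id: str
    name: str


@dataclass
class Revision:
    """A point-in-time grouping of one or more assets."""
    revision_id: str
    assets: List[Asset] = field(default_factory=list)


@dataclass
class DataSet:
    """A logical grouping of one or more revisions, stored oldest first."""
    data_set_id: str
    name: str
    revisions: List[Revision] = field(default_factory=list)

    def latest_revision(self) -> Revision:
        # Revisions are appended in publication order, so the last is the newest.
        if not self.revisions:
            raise ValueError("data set has no revisions yet")
        return self.revisions[-1]


# Hypothetical IDs and file names, used only for illustration.
prices = DataSet("ds-001", "Daily commodity prices")
prices.revisions.append(Revision("rev-001", [Asset("a-001", "prices-2024-01-01.csv")]))
prices.revisions.append(Revision("rev-002", [Asset("a-002", "prices-2024-01-02.csv")]))

print(prices.latest_revision().assets[0].name)  # prints prices-2024-01-02.csv
```

Each new revision leaves the earlier ones in place, which is what makes a revision a point-in-time view of the data set.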
AWS Data Exchange provides hundreds of data products from different third-party data providers. AWS users can subscribe to the data, export it to Amazon S3, and then analyze the data sets with other AWS services.
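The export step can also be scripted with the AWS SDK for Python. The following is a minimal sketch, assuming a boto3 `dataexchange` client is passed in as `client` and that the bucket already exists; the function names are illustrative and not part of the SDK, and only the first page of results is read.

```python
import time


def list_entitled_data_sets(client):
    """Return (id, name) pairs for data sets the caller is subscribed to.

    Reads only the first page of results; follow NextToken for more.
    """
    response = client.list_data_sets(Origin="ENTITLED")
    return [(ds["Id"], ds["Name"]) for ds in response["DataSets"]]


def export_revision_to_s3(client, data_set_id, revision_id, bucket, poll_seconds=5):
    """Export all assets of one revision to an S3 bucket and wait for the job."""
    job = client.create_job(
        Type="EXPORT_REVISIONS_TO_S3",
        Details={
            "ExportRevisionsToS3": {
                "DataSetId": data_set_id,
                "RevisionDestinations": [
                    {"RevisionId": revision_id, "Bucket": bucket}
                ],
            }
        },
    )
    client.start_job(JobId=job["Id"])

    # Poll until the job reaches a terminal state.
    while True:
        state = client.get_job(JobId=job["Id"])["State"]
        if state in ("COMPLETED", "ERROR", "CANCELLED", "TIMED_OUT"):
            return state
        time.sleep(poll_seconds)


# Usage (requires boto3 and AWS credentials; the IDs and bucket are placeholders):
#   import boto3
#   client = boto3.client("dataexchange", region_name="us-east-1")
#   print(list_entitled_data_sets(client))
#   export_revision_to_s3(client, "<data-set-id>", "<revision-id>", "<bucket>")
```

Passing the client in as a parameter keeps the functions easy to test with a stub and leaves Region and credential handling to the caller.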
AWS Data Exchange Product
An AWS Data Exchange product is the unit of exchange that a provider publishes and that subscribers then subscribe to. When a provider publishes a product, it is reviewed against AWS guidelines and terms and conditions. Once approved, the product is listed on AWS Marketplace and in the AWS Data Exchange product catalog. Each product is identified by its product ID.
A product has the following parts:
- Product Details
  - Product details include the name of the product, descriptions, a logo or image, and contact information.
  - The provider fills in these details.
- Product Offers
  - Product offers define the terms and conditions that a subscriber agrees to in order to get the product.
  - Providers must define a public offer to make the product available in the AWS Data Exchange console.
  - An offer contains prices and durations, the data subscription agreement, and the option for custom offers.
- Data Sets
  - A data set is dynamic and is versioned through revisions.
  - A product contains one or more data sets.
  - The provider decides which revisions within a data set are published. Subscribers get access to these data sets when they subscribe to the product.
Malware Protection Facility in AWS Data Exchange
AWS builds security into all of its services. In AWS Data Exchange, security and compliance are a responsibility shared between AWS and the user. Every S3 object uploaded by a provider is scanned for malware and viruses before it is made available to subscribers. If the scan detects malware, the S3 object is removed to keep the environment safe, secure, and trustworthy.
AWS Data Exchange provides different options to secure our data sets. They are:
- Encryption at Rest - Data products stored in the service are encrypted at rest. This applies by default whenever we use AWS Data Exchange.
- Encryption in Transit - AWS Data Exchange uses Transport Layer Security (TLS) and client-side encryption for data in transit. Communication uses HTTPS, so the data is always encrypted in transit. This is also configured by default.
- Restrict Access to Content - We can restrict access to an appropriate subset of users by granting IAM permissions only to the specific users and groups that need them.
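As an illustration of restricting access with IAM, the sketch below builds a read-only policy document for a single data set. It assumes that the IAM action names mirror the AWS Data Exchange API operations and that revisions and assets sit under the data set's ARN; the ARN in the example is a placeholder, and `read_only_policy` is a hypothetical helper. Verify the actions against the service's IAM documentation before attaching such a policy.

```python
import json


def read_only_policy(data_set_arn):
    """Build an IAM policy document granting read-only access to one data set."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "dataexchange:GetDataSet",
                    "dataexchange:ListDataSetRevisions",
                    "dataexchange:GetRevision",
                    "dataexchange:ListRevisionAssets",
                    "dataexchange:GetAsset",
                ],
                # The data set itself plus the revisions and assets beneath it.
                "Resource": [data_set_arn, data_set_arn + "/*"],
            }
        ],
    }


# Placeholder ARN, for illustration only.
policy = read_only_policy(
    "arn:aws:dataexchange:us-east-1:123456789012:data-sets/EXAMPLE"
)
print(json.dumps(policy, indent=2))
```

The resulting JSON can be attached to an IAM group so that only its members can read the data set.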
Supported Data Sets in AWS Data Exchange
AWS Data Exchange aims to make data transactions easier and more transparent. AWS reviews the types of data in each product and permits only those that comply with its guidelines. Products that are not permitted are restricted and cannot be published.
To publish a data set, a provider must follow the publishing guidelines set by AWS. AWS Data Exchange scans every uploaded object to ensure that it follows these guidelines. Anyone who suspects that an AWS Data Exchange product is being used for illegal or abusive purposes can report it using the Report Amazon AWS abuse form.
How to Access AWS Data Exchange
We can access AWS Data Exchange from two perspectives: as a subscriber or as a provider.
For Subscribers
Subscribers are users who need data sets for their specific field.
The following options are available for subscribers to access the products which are part of AWS Data Exchange:
Data Exchange Console
1. To access the console page, log in to the AWS account and search for AWS Data Exchange in the search bar. It will show the screen given below.
2. Select a Region in which AWS Data Exchange is supported.
3. Click on Explore available data products. This opens a page that lists data products in various categories. Click on a product to view it.
4. The product details and other information appear on the next screen, as shown in the figure.
5. Click on Continue to subscribe to start using the data product.
AWS Marketplace Catalog
1. The data products of AWS Data Exchange are also available on AWS Marketplace. Go to the AWS Marketplace catalog home page.
2. Click on View all products. This lists data products in various categories.
3. Subscribe to any specific product as required.
For Providers
Providers are the ones who publish data products for subscribers. A provider must first register as an AWS Marketplace Seller. After that, an existing provider can access their products using the options listed below:
AWS Data Exchange Console (Publish Data)
A provider can directly access the AWS Data Exchange products using the data exchange console.
- In the AWS Data Exchange console, which is part of the AWS Management Console, click on Products under the Publish data section.
- Any products you have published show up here. On first use, the page instead shows the Register as an AWS Marketplace Seller option, as shown in the above figure.
Programmatically
Providers can also access AWS Data Exchange by using the following APIs:
- AWS Data Exchange API - This API can be used to carry out CRUD (create, read, update, and delete) operations on the data sets and revisions. In addition, it can also be used for importing and exporting assets between revisions.
- AWS Marketplace Catalog API - The products present on the AWS Data Exchange products catalog and AWS Marketplace can be viewed or updated using this API.
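The create operations of the AWS Data Exchange API might be combined as follows. This is a hedged sketch rather than a complete publishing workflow: it assumes a boto3 `dataexchange` client is passed in as `client`, that the source files already sit in an S3 bucket, and that `stage_revision` and `finalize_revision` are hypothetical helper names. Publishing the data set as a product is a separate step and is not shown.

```python
def stage_revision(client, name, description, bucket, keys, comment=""):
    """Create a data set and a revision, then start importing S3 objects into it.

    Returns (data_set_id, revision_id, job_id). The import job runs
    asynchronously, so the caller should wait for it before finalizing.
    """
    data_set = client.create_data_set(
        AssetType="S3_SNAPSHOT", Name=name, Description=description
    )
    revision = client.create_revision(DataSetId=data_set["Id"], Comment=comment)
    job = client.create_job(
        Type="IMPORT_ASSETS_FROM_S3",
        Details={
            "ImportAssetsFromS3": {
                "DataSetId": data_set["Id"],
                "RevisionId": revision["Id"],
                "AssetSources": [{"Bucket": bucket, "Key": key} for key in keys],
            }
        },
    )
    client.start_job(JobId=job["Id"])
    return data_set["Id"], revision["Id"], job["Id"]


def finalize_revision(client, data_set_id, revision_id):
    """Mark a revision as finalized once its import job has completed."""
    return client.update_revision(
        DataSetId=data_set_id, RevisionId=revision_id, Finalized=True
    )


# Usage (requires boto3 and AWS credentials; names and keys are placeholders):
#   import boto3
#   client = boto3.client("dataexchange", region_name="us-east-1")
#   ds_id, rev_id, job_id = stage_revision(
#       client, "Prices", "Daily prices", "<bucket>", ["prices.csv"]
#   )
#   # ... wait until client.get_job(JobId=job_id)["State"] == "COMPLETED" ...
#   finalize_revision(client, ds_id, rev_id)
```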
Pricing
AWS Data Exchange charges depend on the actions that users perform. Pricing differs for data subscribers and data providers.
Data Subscribers
Data subscribers pay the following fees and costs:
- Data Product Fees - Depending on the product, subscribers pay under one of two pricing models: subscription-based or pay-as-you-go.
- Data Transfer Fees - Transfer fees apply when importing or exporting file-based assets across AWS Regions, at standard Amazon Simple Storage Service (S3) data transfer rates. Standard transfer rates to the internet apply when we use signed URLs to export file-based assets.
- AWS Service Costs - AWS charges for the services we use to store, process, or analyze the data products. Any other AWS services we use are billed according to their own pricing.
Data Providers
Data providers pay the following fees:
- Tiered Fulfillment Fees - AWS charges a tiered fulfillment fee for all new subscriptions to the data products. If you already have subscribers for your products, you can create Bring Your Own Subscription (BYOS) offers to migrate and fulfill those subscriptions with AWS customers at no additional cost.
- Fees Associated with Data Products
  - Storage Fees for Amazon S3 Objects
    - AWS charges storage fees for products containing Amazon S3 objects. Prices vary from Region to Region. For example, storage costs $0.023 per GB per month in the US East (N. Virginia) Region.
    - Standard Amazon S3 pricing applies when importing or exporting data files across AWS Regions.
  - Fees for Products Containing Amazon Redshift Datashares
    - AWS charges Amazon Redshift fees for products with Amazon Redshift datashares, based on standard Amazon Redshift pricing.
  - API Gateway Fees for Products Containing API Data Sets
    - If the product includes API data sets, AWS charges Amazon API Gateway fees based on Amazon API Gateway pricing.
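As a rough worked example of the storage fee, the sketch below multiplies a product's size by the US East (N. Virginia) rate quoted above. It assumes a single flat rate with no volume tiers, which is a simplification, and `monthly_storage_fee` is a hypothetical helper name. Check current Amazon S3 pricing for real estimates.

```python
def monthly_storage_fee(size_gb, rate_per_gb_month=0.023):
    """Estimate the monthly S3 storage fee in USD for a product of size_gb.

    The default rate is the US East (N. Virginia) figure quoted in this
    article; a single flat rate is assumed.
    """
    if size_gb < 0:
        raise ValueError("size_gb must not be negative")
    return round(size_gb * rate_per_gb_month, 2)


print(monthly_storage_fee(500))  # prints 11.5
```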
AWS Data Exchange Supported Regions
There is a single, globally available AWS Data Exchange product catalog. Subscribers see the same catalog regardless of which Region they use.
Resources that are part of a product, such as data sets, are regional: they are available and managed only in supported Regions. Here is a list of Regions that support AWS Data Exchange product resources.
- US East (N. Virginia)
- US East (Ohio)
- US West (Northern California)
- US West (Oregon)
- Asia Pacific (Seoul)
- Asia Pacific (Singapore)
- Asia Pacific (Sydney)
- Asia Pacific (Tokyo)
- Europe (Frankfurt)
- Europe (Ireland)
- Europe (London)
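When working programmatically, Regions are referred to by their codes rather than their display names. The lookup below pairs each Region listed above with its standard AWS Region code. The set of supported Regions is taken from this article and may have grown since, and `is_supported` is a hypothetical helper.

```python
# Region code -> display name, for the Regions listed in this article.
SUPPORTED_REGIONS = {
    "us-east-1": "US East (N. Virginia)",
    "us-east-2": "US East (Ohio)",
    "us-west-1": "US West (Northern California)",
    "us-west-2": "US West (Oregon)",
    "ap-northeast-2": "Asia Pacific (Seoul)",
    "ap-southeast-1": "Asia Pacific (Singapore)",
    "ap-southeast-2": "Asia Pacific (Sydney)",
    "ap-northeast-1": "Asia Pacific (Tokyo)",
    "eu-central-1": "Europe (Frankfurt)",
    "eu-west-1": "Europe (Ireland)",
    "eu-west-2": "Europe (London)",
}


def is_supported(region_code):
    """Return True if the Region code appears in the list above."""
    return region_code in SUPPORTED_REGIONS


print(is_supported("eu-west-2"))  # prints True
print(is_supported("sa-east-1"))  # prints False
```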
AWS Data Exchange-Related Services
Several other AWS services are related to AWS Data Exchange. They are:
- Amazon S3 - Amazon S3 is an object storage service provided by AWS. Snapshots of S3 objects are one of the supported asset types for data sets. Subscribers can also export data sets to Amazon S3 programmatically.
- Amazon API Gateway - Amazon API Gateway helps to create, publish, maintain, monitor, and secure APIs at any scale. APIs are another supported asset type for data sets. Subscribers can call the API from the AWS Data Exchange console or programmatically.
- Amazon Redshift - Amazon Redshift is a cloud data warehouse service for analyzing data with SQL and business intelligence tools. AWS Data Exchange supports Amazon Redshift data sets. Subscribers get read-only access to query the data.
- AWS Marketplace - AWS Marketplace helps to find, buy, deploy, and manage third-party software and data to build solutions and run businesses. AWS Data Exchange allows data sets to be published as a product on AWS Marketplace. Providers must register as a seller, and they can use the AWS Marketplace Management Portal or the AWS Marketplace Catalog API to publish their products.
Conclusion
In this article, we explored the following points:
- AWS Data Exchange offers subscribers data products across many categories.
- Different providers publish their products on AWS Data Exchange for subscribers.
- AWS Data Exchange delivers data sets securely and scans them for malware.
- The service supports data sets delivered as Amazon S3 snapshots and as APIs.
- It also supports Amazon Redshift data sets.
- The service makes it simple for customers to find, subscribe to, and use third-party data in the AWS Cloud.