Indexes (DynamoDB)

Overview

An index is a data structure that speeds up lookups on particular attributes of a table. Once you create an index, DynamoDB maintains it: whenever data in the table changes, the index is updated automatically to reflect the change. A secondary index lets you query the data using an alternate key, and you can run Query and Scan operations on it just as you would on a table. DynamoDB supports two types of secondary index.

What are the Indexes in DynamoDB?

Indexes in DynamoDB work differently from indexes in relational databases. When you create a secondary index, you must specify its key attributes: a partition key and, depending on the index type, a sort key. Once the index exists, it can be queried or scanned just like a table. Because DynamoDB has no query optimizer, a secondary index is used only when you explicitly Query or Scan it.

DynamoDB supports two types of index:

  • Global Secondary Indexes:
    Any one or two attributes from the table may be used as the index's primary key.
  • Local Secondary Indexes:
    The index's partition key must be the same as the table's partition key. The sort key, however, can be any other attribute.

DynamoDB uses indexes to provide efficient access by key attributes. They improve performance by reducing application latency and speeding up data retrieval.

Because DynamoDB is a NoSQL database, it does not run SQL SELECT queries with arbitrary conditions the way a relational database does.

When such a query runs against a SQL database, a query optimizer checks whether any of the available indexes can satisfy it.

You can use the DynamoDB Scan operation to get the same result. A Scan, however, examines every item in the table, which takes longer than a Query, which reads only the items that match a given key. It is like searching through every volume in a library to find a book instead of knowing exactly which shelf it is on.
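
The difference can be illustrated with a small in-memory model. This is only a sketch: the `items` list, the `by_author` dictionary, and the attribute names are hypothetical stand-ins, not DynamoDB APIs. It counts how many items each approach has to examine.

```python
# Toy model: a "table" is a list of items; an "index" is a dict keyed by author.
# All names here are illustrative and not part of the DynamoDB API.
items = [
    {"book_id": i, "author": f"author-{i % 100}", "title": f"title-{i}"}
    for i in range(1000)
]


def scan_for_author(table, author):
    """Examine every item, the way a Scan with a filter would."""
    examined = 0
    matches = []
    for item in table:
        examined += 1
        if item["author"] == author:
            matches.append(item)
    return matches, examined


# Build the index once: author -> list of items by that author.
by_author = {}
for item in items:
    by_author.setdefault(item["author"], []).append(item)


def query_by_author(index, author):
    """Go straight to the matching key, the way a Query on an index would."""
    matches = index.get(author, [])
    return matches, len(matches)


scan_matches, scan_examined = scan_for_author(items, "author-7")
query_matches, query_examined = query_by_author(by_author, "author-7")
print(scan_examined, query_examined)  # prints: 1000 10
```

Both approaches return the same ten items, but the scan touches all 1000 items to find them.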

To support efficient queries on other attributes, you need a separate data structure that holds a subset of the attributes from the base table, organized under a different primary key. DynamoDB manages this additional structure, which is called a secondary index. When items in the base table are added, changed, or removed, the associated secondary indexes are updated to reflect the changes.
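
As a rough illustration of what "kept up to date automatically" means, the following toy class maintains a second mapping alongside the base data. It is a simplified sketch, not how DynamoDB is implemented; `ToyTable` and its attribute names are hypothetical.

```python
class ToyTable:
    """In-memory stand-in for a table with one secondary index (hypothetical)."""

    def __init__(self, key_attr, index_attr):
        self.key_attr = key_attr      # primary key attribute of the base table
        self.index_attr = index_attr  # attribute used as the index key
        self.items = {}               # primary key value -> item
        self.index = {}               # index key value -> set of primary key values

    def _unindex(self, item):
        value = item.get(self.index_attr)
        if value is not None:
            keys = self.index[value]
            keys.discard(item[self.key_attr])
            if not keys:
                del self.index[value]

    def put(self, item):
        key = item[self.key_attr]
        old = self.items.get(key)
        if old is not None:
            self._unindex(old)        # drop the stale index entry first
        self.items[key] = item
        value = item.get(self.index_attr)
        if value is not None:         # items without the attribute are not indexed
            self.index.setdefault(value, set()).add(key)

    def delete(self, key):
        old = self.items.pop(key, None)
        if old is not None:
            self._unindex(old)

    def query_index(self, value):
        return [self.items[k] for k in sorted(self.index.get(value, ()))]


table = ToyTable(key_attr="order_id", index_attr="customer")
table.put({"order_id": 1, "customer": "alice"})
table.put({"order_id": 2, "customer": "bob"})
table.put({"order_id": 1, "customer": "bob"})  # the update moves order 1 to bob
table.delete(2)
print(table.query_index("alice"), table.query_index("bob"))
# prints: [] [{'order_id': 1, 'customer': 'bob'}]
```

Every write to the base data also touches the index, which is why indexes add cost to writes.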

Types of Indexes supported by DynamoDB

DynamoDB offers two index types: Global Secondary Index (GSI) and Local Secondary Index (LSI).

Local Secondary Index

A local secondary index has the same partition key as the base table but a different sort key. It is called "local" because every partition of the index is scoped to a base table partition with the same partition key value. It lets you query the data in an alternative sort order, using the index's sort key attribute.
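
A minimal sketch of what a CreateTable request with a local secondary index could look like is shown below. The table name `Orders`, the attribute names, and the index name `OrderDateIndex` are assumptions made for illustration. The code only builds the request parameters, so it runs without AWS access.

```python
def create_table_params_with_lsi():
    """Build CreateTable parameters for a hypothetical Orders table with one LSI."""
    return {
        "TableName": "Orders",  # hypothetical table name
        "AttributeDefinitions": [
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderId", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
        ],
        # Base table key: partition key CustomerId, sort key OrderId.
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "OrderId", "KeyType": "RANGE"},
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": "OrderDateIndex",  # hypothetical index name
                # Same partition key as the table, different sort key.
                "KeySchema": [
                    {"AttributeName": "CustomerId", "KeyType": "HASH"},
                    {"AttributeName": "OrderDate", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


params = create_table_params_with_lsi()
# With boto3 installed and AWS credentials configured, this could be sent as:
#   import boto3
#   boto3.client("dynamodb").create_table(**params)
```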

Global Secondary Index

A global secondary index is an index with a partition key and an optional sort key that can differ from the primary key of the base table. It is called "global" because queries on the index can span data from all partitions of the base table. It can be thought of as a separate table whose attributes are derived from the base table.
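
Querying a global secondary index looks like querying a table, with the index named explicitly. The sketch below builds Query parameters for a hypothetical `StatusIndex` on a hypothetical `Orders` table. Both names and the `Status` attribute are assumptions, and the values use the typed format of the low-level client.

```python
def gsi_query_params(status):
    """Build Query parameters that target a hypothetical GSI keyed on Status."""
    return {
        "TableName": "Orders",       # hypothetical table name
        "IndexName": "StatusIndex",  # hypothetical index name
        # "#s" is a placeholder for the attribute name; using a placeholder
        # avoids any clash with DynamoDB reserved words.
        "KeyConditionExpression": "#s = :status",
        "ExpressionAttributeNames": {"#s": "Status"},
        "ExpressionAttributeValues": {":status": {"S": status}},
    }


params = gsi_query_params("SHIPPED")
# With boto3 installed and AWS credentials configured, this could be sent as:
#   import boto3
#   boto3.client("dynamodb").query(**params)
```

Without `IndexName`, the same call would run against the base table, whose key may not include `Status` at all.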

Difference Between Characteristics of Local and Global Secondary Indexes

The primary distinctions between a local secondary index and a global secondary index are shown in the following table.

| Characteristic | Global secondary index | Local secondary index |
| --- | --- | --- |
| Key Schema | The primary key of a global secondary index can be either simple (partition key) or composite (partition key and sort key). | The primary key of a local secondary index must be composite (partition key and sort key). |
| Key Attributes | The index partition key and sort key (if present) can be any base table attributes of type String, Number, or Binary. | The partition key of the index is the same attribute as the partition key of the base table. The sort key can be any base table attribute of type String, Number, or Binary. |
| Size Restrictions Per Partition Key Value | Global secondary indexes have no size restrictions. | For each partition key value, the total size of all indexed items must be 10 GB or less. |
| Online Index Operations | You can create global secondary indexes at the same time as the table. You can also add a new global secondary index to an existing table or delete an existing one. | Local secondary indexes are created at the same time as the table. You cannot add a local secondary index to an existing table, and you cannot delete an existing one. |
| Queries and Partitions | A global secondary index lets you query across all partitions of the table. | A local secondary index lets you query a single partition, identified by the partition key value in the query. |
| Read Consistency | Queries on global secondary indexes support eventual consistency only. | Queries on a local secondary index can use either eventual consistency or strong consistency. |
| Provisioned Throughput Consumption | Each global secondary index has its own provisioned read and write throughput settings. Queries or scans on a global secondary index consume capacity units from the index, not from the base table. | Queries or scans on a local secondary index consume read capacity units from the base table. When you write to a table, its local secondary indexes are also updated, and these updates consume write capacity units from the base table. |
| Projected Attributes | Queries or scans on a global secondary index can request only the attributes that are projected into the index. DynamoDB does not fetch any attributes from the table. | Queries or scans on a local secondary index can request attributes that are not projected into the index. DynamoDB fetches those attributes from the table automatically. |
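
The read consistency rule in the table can be expressed as a small guard around the Query parameters. This is a sketch under stated assumptions: `index_query_params` is a hypothetical helper, and the caller is expected to add the key condition before sending the request.

```python
def index_query_params(table_name, index_name, index_type, consistent_read=False):
    """Build base Query parameters, rejecting strong consistency on a GSI.

    index_type is "GSI" or "LSI". This helper is illustrative only.
    """
    if index_type not in ("GSI", "LSI"):
        raise ValueError(f"unknown index type: {index_type}")
    if index_type == "GSI" and consistent_read:
        raise ValueError(
            "a global secondary index supports only eventually consistent reads"
        )
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "ConsistentRead": consistent_read,
    }


# A strongly consistent read is allowed on a local secondary index.
lsi_params = index_query_params("Orders", "OrderDateIndex", "LSI", consistent_read=True)

# The same request against a global secondary index is rejected up front.
try:
    index_query_params("Orders", "StatusIndex", "GSI", consistent_read=True)
    gsi_rejected = False
except ValueError:
    gsi_rejected = True
```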

Specifications of Secondary Indexes

The following information must be specified for each secondary index in DynamoDB:

  • The type of index to create: a local secondary index or a global secondary index.
  • The index's name. The name must be unique among the indexes of its base table, although the same name can be reused for indexes on different base tables.
  • Any additional attributes to project into the index from the base table. These are in addition to the table's key attributes, which are automatically projected into every index.
  • The index's key schema. Every attribute in the index key schema must be a top-level attribute of type String, Number, or Binary. Other data types, such as documents and sets, are not allowed. Depending on the type of index, the key schema has additional requirements:
    • For a global secondary index, the partition key can be any scalar attribute of the base table. The sort key is optional and can also be any scalar attribute of the base table.
    • For a local secondary index, the partition key must be the same as the base table's partition key, and the sort key must be a non-key attribute of the base table.
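
The key schema rules above can be checked with a few lines of code. The following is a sketch only: `validate_index_keys` is a hypothetical helper that encodes the rules as listed in this section, and the attribute names in the examples are made up.

```python
SCALAR_KEY_TYPES = {"S", "N", "B"}  # String, Number, Binary


def validate_index_keys(index_type, table_keys, index_keys, attribute_types):
    """Return a list of problems with an index key schema (empty if none).

    table_keys and index_keys are (partition_key, sort_key_or_None) tuples.
    attribute_types maps attribute names to DynamoDB type codes.
    """
    problems = []
    table_pk, table_sk = table_keys
    index_pk, index_sk = index_keys

    # Every index key attribute must be a String, Number, or Binary.
    for name in (index_pk, index_sk):
        if name is not None and attribute_types.get(name) not in SCALAR_KEY_TYPES:
            problems.append(f"{name} must be of type S, N, or B")

    if index_type == "LSI":
        if index_pk != table_pk:
            problems.append("LSI partition key must match the table's partition key")
        if index_sk is None or index_sk in (table_pk, table_sk):
            problems.append("LSI sort key must be a non-key attribute of the table")
    elif index_type != "GSI":
        problems.append(f"unknown index type: {index_type}")
    return problems


types = {"CustomerId": "S", "OrderId": "S", "OrderDate": "S", "Total": "N", "Tags": "SS"}
table_keys = ("CustomerId", "OrderId")

ok_lsi = validate_index_keys("LSI", table_keys, ("CustomerId", "OrderDate"), types)
bad_lsi = validate_index_keys("LSI", table_keys, ("OrderDate", "Total"), types)
bad_gsi = validate_index_keys("GSI", table_keys, ("Tags", None), types)
print(ok_lsi, bad_lsi, bad_gsi)
```

Here `ok_lsi` is empty, `bad_lsi` reports the partition key mismatch, and `bad_gsi` reports that a string set cannot be a key attribute.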

Limitations of Indexes

  • A DynamoDB index carries some overhead. The index stores its own copy of the projected attributes, so it takes up additional storage. Having too many indexes can therefore become a problem when storage cost matters.
  • The index must be kept up to date whenever data is added, changed, or deleted. DynamoDB does not lock the table to do this, but every index update consumes additional write capacity, and an under-provisioned global secondary index can cause writes to the base table to be throttled.
  • Data modifications become more expensive if you index many attributes in every table.
  • This is not a problem if the data is static, apart from the extra storage the indexes occupy.
  • In short, every index adds cost to insert, update, and delete operations.

Creating Table With Secondary Indexes

DynamoDB keeps the data in a secondary index eventually consistent with its table. For a table or a local secondary index, you can request strongly consistent Query or Scan operations. Global secondary indexes, however, support only eventual consistency.

A global secondary index can be added to an existing table by using the UpdateTable action with the GlobalSecondaryIndexUpdates parameter.

UpdateTable requires the following inputs:

  • TableName -
    The table that the index will be associated with.
  • AttributeDefinitions -
    The names and data types of the index's key schema attributes.
  • GlobalSecondaryIndexUpdates -
    The details of the index you want to create:
    • IndexName - The index's name.
    • KeySchema - The attributes that make up the index's primary key.
    • Projection - The attributes that are copied from the table into the index. ALL means that every attribute is copied.

As part of this operation, data from the table is backfilled into the new index. The table remains usable during backfilling. The index is not ready, though, until its Backfilling attribute switches from true to false. Use the DescribeTable action to check this attribute.
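
The inputs described above could be assembled as in the sketch below. The table name `Orders`, the index name `StatusIndex`, and the `Status` attribute are assumptions for illustration. The sketch also assumes an on-demand table; a table in provisioned mode would additionally need throughput settings for the new index. The `index_is_ready` helper is hypothetical and only inspects a dictionary shaped like a DescribeTable response.

```python
def add_gsi_params():
    """Build UpdateTable parameters that create a hypothetical GSI on Status."""
    return {
        "TableName": "Orders",  # hypothetical table name
        "AttributeDefinitions": [
            {"AttributeName": "Status", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": "StatusIndex",  # hypothetical index name
                    "KeySchema": [
                        {"AttributeName": "Status", "KeyType": "HASH"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }


def index_is_ready(describe_table_response, index_name):
    """Return True once the named index is ACTIVE and no longer backfilling."""
    table = describe_table_response["Table"]
    for index in table.get("GlobalSecondaryIndexes", []):
        if index["IndexName"] == index_name:
            active = index.get("IndexStatus") == "ACTIVE"
            return active and not index.get("Backfilling", False)
    return False


# Hand-written sample responses, shaped like DescribeTable output.
creating = {"Table": {"GlobalSecondaryIndexes": [
    {"IndexName": "StatusIndex", "IndexStatus": "CREATING", "Backfilling": True}]}}
active = {"Table": {"GlobalSecondaryIndexes": [
    {"IndexName": "StatusIndex", "IndexStatus": "ACTIVE", "Backfilling": False}]}}

params = add_gsi_params()
print(index_is_ready(creating, "StatusIndex"), index_is_ready(active, "StatusIndex"))
# prints: False True
```

With boto3 installed and AWS credentials configured, the parameters could be passed to `boto3.client("dynamodb").update_table(**params)`, and the response of `describe_table(TableName="Orders")` could be passed to `index_is_ready` in a polling loop.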

Conclusion

  • The main distinction is that the primary index must exist, whereas secondary indexes are optional. The primary key uniquely identifies every item, while secondary indexes simply let you query the table's items by different attributes.
  • A DynamoDB index carries some overhead. The index takes up additional storage, so having too many indexes can become a problem when storage cost matters.
  • By default, a table can have up to 20 global secondary indexes and 5 local secondary indexes.
  • There is no charge for defining a DynamoDB index. However, the index consumes storage, and when a write to the table affects the index, the index update consumes write capacity units (WCUs).
  • The GetItem and BatchGetItem operations cannot be used on a global secondary index. You can still run Query and Scan operations on it, though.
  • A Scan can return the same result as a Query, but it examines every item in the table and therefore takes longer than a Query, which reads only the items that match a given key.
  • If you need to query some data quickly but the table's primary key does not support that query pattern, global and local secondary indexes can help.