DynamoDB, Demystified (Chapter 4)

GSI (Global Secondary Index)

Welcome to the fourth chapter of the DynamoDB, Demystified series. This chapter is a direct continuation of the third chapter: we'll be using the same database table and data we created in Chapter 3.

Today, we'll talk about the Global Secondary Index (GSI).

Global Secondary Index

One of the data access patterns in your application might be to query all products in a given category.
Let's say we want to retrieve all products in the women clothing category. With our current key schema, we can't run this as a query (short of scanning the whole table), because category is not part of our table's composite primary key. So we need to create a secondary index to support this additional access pattern. Secondary indexes are a powerful way to add query flexibility to a DynamoDB table.
DynamoDB has two kinds of secondary indexes: global secondary indexes and local secondary indexes. In this section, we'll add a global secondary index on the category attribute, which will let us retrieve all products in a particular category.
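To build some intuition for what an index on category buys us, here is a small in-memory sketch in plain Python. It is only a mental model, not how DynamoDB is implemented, and the sample items and the build_index helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample items; only the attributes needed for the illustration.
items = [
    {"id": 15, "category": "women clothing", "title": "Snowboard Jacket"},
    {"id": 9, "category": "electronics", "title": "External Hard Drive"},
    {"id": 16, "category": "women clothing", "title": "Faux Leather Jacket"},
]


def build_index(items, key_attribute):
    """Group items by the given attribute, like a lookup keyed on that attribute."""
    index = defaultdict(list)
    for item in items:
        # Items that lack the attribute are simply left out of the index.
        if key_attribute in item:
            index[item[key_attribute]].append(item)
    return index


category_index = build_index(items, "category")

# Without the index: look at every item and keep the matches.
scanned = [item for item in items if item.get("category") == "women clothing"]

# With the index: jump straight to the matching group.
indexed = category_index["women clothing"]

print([item["id"] for item in indexed])
```

Both approaches return the same items here; the difference is that the indexed lookup does not have to touch products from other categories.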

Create GSI

The following example script adds a global secondary index to an existing table (the products table).

import boto3

# Boto3 is the AWS SDK library for Python.
# You can use the low-level client to make API calls to DynamoDB.
def create_gsi():

    client = boto3.client('dynamodb', endpoint_url="http://localhost:8000")

    try:
        resp = client.update_table(
            TableName="products",
            # Any attributes used in your new global secondary index must be declared in AttributeDefinitions
            AttributeDefinitions=[
                {
                    "AttributeName": "category",
                    "AttributeType": "S"
                },
            ],
            # This is where you add, update, or delete any global secondary indexes on your table.
            GlobalSecondaryIndexUpdates=[
                {
                    "Create": {
                        # You need to name your index and specifically refer to it when using it for queries.
                        "IndexName": "categoryIndex",
                        # Like the table itself, you need to specify the key schema for an index.
                        # For a global secondary index, you can use a simple or composite key schema.
                        "KeySchema": [
                            {
                                "AttributeName": "category",
                                "KeyType": "HASH"
                            }
                        ],
                        # You can choose to copy only specific attributes from the original item into the index.
                        # You might want to copy only a few attributes to save space.
                        "Projection": {
                            "ProjectionType": "ALL"
                        },
                        # Global secondary indexes have read and write capacity separate from the underlying table.
                        "ProvisionedThroughput": {
                            "ReadCapacityUnits": 1,
                            "WriteCapacityUnits": 1,
                        }
                    }
                }
            ],
        )

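        print("Secondary index added!")
        return resp
    except Exception as e:
        print("Error updating table:")
        print(e)
        return None


if __name__ == '__main__':
    response = create_gsi()
    # Only report success when update_table actually returned a response.
    if response is not None:
        print("Secondary index created")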

Creating a global secondary index has a lot in common with creating a table. You specify a name for the index, the attributes that will be in the index, the key schema of the index, and the provisioned throughput (the maximum capacity an application can consume from a table or index). Provisioned throughput on each index is separate from the provisioned throughput on a table. This allows you to define throughput granularly to meet your application’s needs.
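The comments in the script mention that you can copy only a few attributes into the index to save space. As a rough sketch of what that could look like, the helper below builds one entry for GlobalSecondaryIndexUpdates with an INCLUDE or KEYS_ONLY projection instead of ALL. The build_gsi_create function is a hypothetical helper, not part of boto3, and title and price are just attributes picked from our sample data.

```python
def build_gsi_create(index_name, hash_key, non_key_attributes=None,
                     read_units=1, write_units=1):
    """Return one entry for the GlobalSecondaryIndexUpdates list of update_table."""
    if non_key_attributes:
        # INCLUDE copies the listed attributes in addition to the key attributes.
        projection = {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(non_key_attributes),
        }
    else:
        # KEYS_ONLY copies just the table and index key attributes.
        projection = {"ProjectionType": "KEYS_ONLY"}

    return {
        "Create": {
            "IndexName": index_name,
            "KeySchema": [{"AttributeName": hash_key, "KeyType": "HASH"}],
            "Projection": projection,
            "ProvisionedThroughput": {
                "ReadCapacityUnits": read_units,
                "WriteCapacityUnits": write_units,
            },
        }
    }


# Hypothetical usage: an index on category that only carries title and price.
update = build_gsi_create("categoryIndex", "category", ["title", "price"])
print(update["Create"]["Projection"])
```

You would pass the result as GlobalSecondaryIndexUpdates=[update] to update_table, and you would still need to declare category in AttributeDefinitions as the script does. Keep in mind that a query on such an index can only return the projected attributes.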

Run the following command in your terminal to add your global secondary index.

python3 CreateGSI.py

This script adds a global secondary index called categoryIndex to our products table.

Now, let's query all products whose category is "women clothing".
When you add a global secondary index to an existing table, DynamoDB asynchronously backfills the index with the existing items in the table. The index is available to query after all items have been backfilled. The time to backfill varies based on the size of the table.
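If you prefer to check the status of one specific index by name, a minimal sketch could look like the following. It assumes the response shape returned by the low-level client's describe_table call; index_status is a hypothetical helper, and sample_description is a hand-written stand-in so the snippet runs without a database.

```python
def index_status(table_description, index_name):
    """Return the IndexStatus of the named index, or None if it is not listed."""
    indexes = table_description.get("Table", {}).get("GlobalSecondaryIndexes", [])
    for index in indexes:
        if index["IndexName"] == index_name:
            return index.get("IndexStatus")
    return None


# Hand-written stand-in for a describe_table response, trimmed to the relevant keys.
sample_description = {
    "Table": {
        "TableName": "products",
        "GlobalSecondaryIndexes": [
            {"IndexName": "categoryIndex", "IndexStatus": "CREATING", "Backfilling": True},
        ],
    }
}

print(index_status(sample_description, "categoryIndex"))
```

Against a real table, you would pass in client.describe_table(TableName="products") and wait until the helper returns "ACTIVE".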
Run this script to query all products in the women clothing category.

import time
from pprint import pprint

import boto3
from boto3.dynamodb.conditions import Key

# Boto3 is the AWS SDK library for Python.
# The "resources" interface allows for a higher-level abstraction than the low-level client interface.
# For more details, go to http://boto3.readthedocs.io/en/latest/guide/resources.html
dynamodb = boto3.resource('dynamodb', endpoint_url="http://localhost:8000")
table = dynamodb.Table('products')

# When adding a global secondary index to an existing table, you cannot query the index until it has been backfilled.
# This portion of the script waits until the index is in the “ACTIVE” status, indicating it is ready to be queried.
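while True:
    indexes = table.global_secondary_indexes or []
    # Look the index up by name instead of assuming it is the first one on the table.
    status = next((idx['IndexStatus'] for idx in indexes if idx['IndexName'] == 'categoryIndex'), None)
    if status == 'ACTIVE':
        break
    print('Waiting for index to backfill...')
    time.sleep(5)
    table.reload()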

# When making a Query call, you use the KeyConditionExpression parameter to specify the hash key on which you want to query.
# If you want to use a specific index, you also need to pass the IndexName in your API call.
resp = table.query(
    # Add the name of the index you want to use in your query.
    IndexName="categoryIndex",
    KeyConditionExpression=Key('category').eq('women clothing'),
)

print("The query returned the following items:")
for item in resp['Items']:
    # sort_dicts requires Python 3.8 or later.
    pprint(item, sort_dicts=False)

Run the script as follows:

python3 QueryWithGSI.py

You should get this output in your terminal:

The query returned the following items:
{'image': 'https://fakestoreapi.com/img/81XH0e8fefL._AC_UY879_.jpg',
 'createdDate': '2021-04-14T12:39:34+00:00',
 'price': Decimal('29.95'),
 'description': '100% POLYURETHANE(shell) 100% POLYESTER(lining) 75% POLYESTER '
                '25% COTTON (SWEATER), Faux leather material for style and '
                'comfort / 2 pockets of front, 2-For-One Hooded denim style '
                'faux leather jacket, Button detail on waist / Detail '
                'stitching at sides, HAND WASH ONLY / DO NOT BLEACH / LINE DRY '
                '/ DO NOT IRON',
 'id': Decimal('16'),
 'title': "Lock and Love Women's Removable Hooded Faux Leather Moto Biker "
          'Jacket',
 'category': 'women clothing',
 'yearManufactured': Decimal('2001')}
{'image': 'https://fakestoreapi.com/img/71z3kpMAYsL._AC_UY879_.jpg',
 'createdDate': '2021-04-16T12:22:41+00:00',
 'price': Decimal('9.85'),
 'description': '95% RAYON 5% SPANDEX, Made in USA or Imported, Do Not Bleach, '
                'Lightweight fabric with great stretch for comfort, Ribbed on '
                'sleeves and neckline / Double stitching on bottom hem',
 'id': Decimal('18'),
 'title': "MBJ Women's Solid Short Sleeve Boat Neck V ",
 'category': 'women clothing',
 'yearManufactured': Decimal('2000')}
{'image': 'https://fakestoreapi.com/img/51eg55uWmdL._AC_UX679_.jpg',
 'createdDate': '2021-04-14T12:25:38+00:00',
 'price': Decimal('7.95'),
 'description': '100% Polyester, Machine wash, 100% cationic polyester '
                'interlock, Machine Wash & Pre Shrunk for a Great Fit, '
                'Lightweight, roomy and highly breathable with moisture '
                'wicking fabric which helps to keep moisture away, Soft '
                'Lightweight Fabric with comfortable V-neck collar and a '
                'slimmer fit, delivers a sleek, more feminine silhouette and '
                'Added Comfort',
 'id': Decimal('19'),
 'title': "Opna Women's Short Sleeve Moisture",
 'category': 'women clothing',
 'yearManufactured': Decimal('2000')}
{'image': 'https://fakestoreapi.com/img/51Y5NI-I5jL._AC_UX679_.jpg',
 'createdDate': '2021-04-16T12:22:41+00:00',
 'price': Decimal('56.99'),
 'description': 'Note:The Jackets is US standard size, Please choose size as '
                'your usual wear Material: 100% Polyester; Detachable Liner '
                'Fabric: Warm Fleece. Detachable Functional Liner: Skin '
                'Friendly, Lightweigt and Warm.Stand Collar Liner jacket, keep '
                'you warm in cold weather. Zippered Pockets: 2 Zippered Hand '
                'Pockets, 2 Zippered Pockets on Chest (enough to keep cards or '
                'keys)and 1 Hidden Pocket Inside.Zippered Hand Pockets and '
                'Hidden Pocket keep your things secure. Humanized Design: '
                'Adjustable and Detachable Hood and Adjustable cuff to prevent '
                'the wind and water,for a comfortable fit. 3 in 1 Detachable '
                'Design provide more convenience, you can separate the coat '
                'and inner as needed, or wear it together. It is suitable for '
                'different season and help you adapt to different climates',
 'id': Decimal('15'),
 'title': "BIYLACLESEN Women's 3-in-1 Snowboard Jacket Winter Coats",
 'category': 'women clothing',
 'yearManufactured': Decimal('2001')}
{'image': 'https://fakestoreapi.com/img/61pHAEJ4NML._AC_UX679_.jpg',
 'createdDate': '22021-04-14T12:31:34+00:00',
 'price': Decimal('12.99'),
 'description': '95%Cotton,5%Spandex, Features: Casual, Short Sleeve, Letter '
                'Print,V-Neck,Fashion Tees, The fabric is soft and has some '
                'stretch., Occasion: Casual/Office/Beach/School/Home/Street. '
                'Season: Spring,Summer,Autumn,Winter.',
 'id': Decimal('20'),
 'title': 'DANVOUY Womens T Shirt Casual Cotton Short',
 'category': 'women clothing',
 'yearManufactured': Decimal('2000')}

This is a query pattern that would have been difficult with your table's main key schema but is easy to implement with the power of secondary indexes.
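One thing to keep in mind as the table grows: a single Query call returns at most 1 MB of data, and when there is more, the response carries a LastEvaluatedKey that you pass back as ExclusiveStartKey to fetch the next page. Here is a minimal sketch of that loop. The query_all helper is hypothetical, and FakeTable is a stand-in with simplified keys so the example runs without DynamoDB.

```python
def query_all(table, **query_kwargs):
    """Collect items from every page of a Query by following LastEvaluatedKey."""
    items = []
    while True:
        resp = table.query(**query_kwargs)
        items.extend(resp.get("Items", []))
        last_key = resp.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume the next request where the previous page stopped.
        query_kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Stand-in for a boto3 Table so the sketch runs without DynamoDB."""

    def __init__(self, pages):
        self.pages = pages
        self.calls = 0

    def query(self, **kwargs):
        page = self.pages[self.calls]
        self.calls += 1
        return page


fake = FakeTable([
    {"Items": [{"id": 15}, {"id": 16}], "LastEvaluatedKey": {"id": 16}},
    {"Items": [{"id": 18}]},
])
all_items = query_all(fake, IndexName="categoryIndex")
print([item["id"] for item in all_items])
```

With the real table from the script above, the call would be query_all(table, IndexName="categoryIndex", KeyConditionExpression=Key('category').eq('women clothing')).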

As always, you can find the complete source code on GitHub: github.com/trey-rosius/dynamodb3

Conclusion

In this article, we looked at how to add a GSI to our table and use it to query data easily. As always, I appreciate you checking this out.
I might have made a mistake somewhere in the article. If you catch anything, please let me know and I'll get to it immediately.
Don't forget to show a brother some love by liking and commenting too.

Happy Coding
Peace