Efficient Data Retrieval in DynamoDB with GSIs and DynamoDBQueryExpression


Why I Wrote This
As I was building a Spring Boot microservice to fetch versioned data from DynamoDB, I had a simple goal: fetch relevant records fast, cleanly, and efficiently. The answer? Using DynamoDBQueryExpression with Global Secondary Indexes (GSIs).
In this post, I will walk you through:
What GSIs are and why they matter
How annotations like @DynamoDBIndexHashKey and @DynamoDBIndexRangeKey work
Schema-level decisions
Code samples with explanations
Pitfalls, best practices, and other tools you might use in different scenarios
First, What Is a Global Secondary Index (GSI)?
A GSI in DynamoDB is like creating a new view of your data that lets you query with different keys than your primary partition key (aka hash key).
Example:
Let’s say your table stores user actions:
{
"userId": "user-123",
"actionId": "act-456",
"actionType": "LOGIN",
"timestamp": "2025-07-01T10:30:00"
}
Now if your main table uses userId as the hash key, but you want to query by actionType, you’ll need a GSI with actionType as the hash key and maybe timestamp as the range key.
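One way to picture this is as a second, automatically maintained map over the same items. The sketch below is a plain in-memory model with no AWS SDK involved, and the names GsiViewSketch, Action, and queryNewestFirst are made up for illustration. It only mimics how such an index groups items by actionType and keeps them ordered by timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for a GSI keyed by actionType and sorted by timestamp.
// Hypothetical illustration only; this is not the DynamoDB API.
class GsiViewSketch {
    static final class Action {
        final String userId, actionId, actionType, timestamp;
        Action(String userId, String actionId, String actionType, String timestamp) {
            this.userId = userId;
            this.actionId = actionId;
            this.actionType = actionType;
            this.timestamp = timestamp;
        }
    }

    // The "index": actionType -> (timestamp -> actions with that timestamp).
    private final Map<String, TreeMap<String, List<Action>>> byActionType = new TreeMap<>();

    // Every write to the "table" is also filed under its index keys.
    void put(Action a) {
        byActionType
            .computeIfAbsent(a.actionType, k -> new TreeMap<>())
            .computeIfAbsent(a.timestamp, k -> new ArrayList<>())
            .add(a);
    }

    // Newest-first lookup by actionType, similar in spirit to a GSI query in descending order.
    List<Action> queryNewestFirst(String actionType, int limit) {
        List<Action> out = new ArrayList<>();
        TreeMap<String, List<Action>> partition = byActionType.get(actionType);
        if (partition == null) {
            return out;
        }
        for (List<Action> sameTimestamp : partition.descendingMap().values()) {
            for (Action a : sameTimestamp) {
                if (out.size() == limit) {
                    return out;
                }
                out.add(a);
            }
        }
        return out;
    }
}
```

Asking this model for "LOGIN" returns login actions from every user, newest first, without touching items of any other action type. That is the access pattern the real GSI gives you.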
The Power of Annotations
To make this work in your Java model, the DynamoDBMapper in the AWS SDK for Java (v1) uses annotations.
@DynamoDBIndexHashKey
This marks the attribute used as the partition key in a GSI.
@DynamoDBIndexRangeKey
This defines the sort key in the GSI.
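import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBIndexHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBIndexRangeKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;

@DynamoDBTable(tableName = "user_actions")
public class UserAction {
    private String userId;
    private String actionType;
    private String timestamp;

    @DynamoDBHashKey(attributeName = "userId") // partition key of the table
    public String getUserId() { return userId; }
    public void setUserId(String userId) { this.userId = userId; } // setters let the mapper populate results

    @DynamoDBIndexHashKey(globalSecondaryIndexName = "actionType-timestamp-index") // partition key of the GSI
    public String getActionType() { return actionType; }
    public void setActionType(String actionType) { this.actionType = actionType; }

    @DynamoDBIndexRangeKey(globalSecondaryIndexName = "actionType-timestamp-index") // sort key of the GSI
    public String getTimestamp() { return timestamp; }
    public void setTimestamp(String timestamp) { this.timestamp = timestamp; }
}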
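One detail worth checking in this model: timestamp is a String, so the index orders it as text rather than as a date. That matches chronological order only when every value uses the same fixed-width format. Here is a minimal sketch of that idea, assuming you format the values yourself; TimestampSortKeySketch and toSortKey are hypothetical names, not part of the SDK.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper showing why a fixed-width timestamp format matters for a String sort key.
class TimestampSortKeySketch {
    // Zero-padded fields, largest unit first, so text order equals time order.
    static final DateTimeFormatter SORTABLE = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");

    static String toSortKey(LocalDateTime time) {
        return time.format(SORTABLE);
    }

    public static void main(String[] args) {
        String earlier = toSortKey(LocalDateTime.of(2025, 7, 1, 9, 5, 0));
        String later = toSortKey(LocalDateTime.of(2025, 7, 1, 10, 30, 0));

        // Padded values compare in chronological order.
        System.out.println(earlier.compareTo(later) < 0); // true

        // The same earlier instant written without padding sorts AFTER the later one.
        System.out.println("2025-7-1T9:05:00".compareTo(later) < 0); // false
    }
}
```

If timestamps arrive from several services, normalizing them to one format and one time zone before writing keeps the sort key trustworthy.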
Using DynamoDBQueryExpression in Code
Once your table and GSI are set up, here’s how you query it:
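import java.util.List;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBQueryExpression;
import com.amazonaws.services.dynamodbv2.datamodeling.QueryResultPage;

// dynamoDbMapper is assumed to be a DynamoDBMapper field on the enclosing class.
public List<UserAction> getUserActions(UserAction searchKey, String indexName, int limit) {
    DynamoDBQueryExpression<UserAction> queryExpression =
        new DynamoDBQueryExpression<UserAction>()
            .withHashKeyValues(searchKey)   // object carrying the GSI hash key value
            .withIndexName(indexName)
            .withLimit(limit)               // maximum number of items evaluated for this page
            .withScanIndexForward(false)    // sort descending
            .withConsistentRead(false);     // required for GSIs, which only support eventually consistent reads
    QueryResultPage<UserAction> resultPage = dynamoDbMapper.queryPage(UserAction.class, queryExpression);
    return resultPage.getResults();
}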
Why Use DynamoDBQueryExpression?
DynamoDBQueryExpression allows you to perform flexible and efficient queries against your DynamoDB tables using Java. It gives you the ability to:
Use secondary indexes.
Filter records.
Control pagination.
Choose sort order.
Instead of scanning the entire table, we narrow down results using hashKeyValues, improving both speed and cost.
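To put rough numbers on the cost side, here is a back-of-the-envelope sketch. It assumes the documented rule that a read is billed in 4 KB blocks and that an eventually consistent read costs half of a strongly consistent one. The item size and counts are invented, ReadCostSketch is a hypothetical helper, and the math ignores per-request page boundaries, so treat the output as an estimate only.

```java
// Rough read-capacity estimate; an approximation for illustration, not a billing calculator.
class ReadCostSketch {
    // Read capacity units for reading totalKb of data in a single operation.
    static double readCapacityUnits(double totalKb, boolean stronglyConsistent) {
        double fourKbBlocks = Math.ceil(totalKb / 4.0); // reads are rounded up to 4 KB blocks
        return stronglyConsistent ? fourKbBlocks : fourKbBlocks / 2.0;
    }

    public static void main(String[] args) {
        double itemKb = 0.5; // assumed average item size

        // Query: only the 40 items under one hash key value are read.
        double queryCost = readCapacityUnits(40 * itemKb, false);

        // Scan: every one of 1,000,000 items is read, whether it matches or not.
        double scanCost = readCapacityUnits(1_000_000 * itemKb, false);

        System.out.println("query: " + queryCost + " RCU, scan: " + scanCost + " RCU");
    }
}
```

With these made-up numbers the query needs about 2.5 read capacity units and the scan about 62,500, which is why narrowing by hash key matters so much.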
Understanding GSIs (Global Secondary Indexes)
Why Use GSIs?
When you need alternate access patterns.
When you want to query by an attribute other than the table’s partition key.
When you want to sort using a different attribute.
What Schema Changes Were Needed (DB Changes)?
When using GSIs, you need to:
Define the index during table creation, or add it to an existing table later via the AWS Console or CloudFormation.
Update the Java model to include index annotations.
Ensure your query code references the correct GSI name.
Example GSI:
Name: actionType-timestamp-index
Partition Key: actionType
Sort Key: timestamp
If you're using Infrastructure-as-Code, your CloudFormation snippet might look like:
GlobalSecondaryIndexes:
  - IndexName: actionType-timestamp-index
    KeySchema:
      - AttributeName: actionType
        KeyType: HASH
      - AttributeName: timestamp
        KeyType: RANGE
    Projection:
      ProjectionType: ALL
    ProvisionedThroughput:
      ReadCapacityUnits: 5
      WriteCapacityUnits: 5
When Not to Use This
While DynamoDB is fast and scalable, it comes with certain limitations you should know:
Limited query flexibility: No joins or complex conditions.
Size restrictions: 400 KB max item size.
Throughput constraints: In provisioned mode, read/write capacity units must be managed.
GSIs cost extra: Each index adds its own storage and write throughput, which shows up in both performance and billing.
Pros and Cons
Pros
Fast read operations with indexed queries.
Cost-effective when using GSIs appropriately.
Scalable for millions of records.
Fine-grained control over pagination and sort order.
Cons
GSI maintenance increases storage cost.
Indexes need to be carefully planned upfront.
Eventually consistent reads only on a GSI (strongly consistent reads are not supported on global secondary indexes, so a query may briefly miss a recent write).
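The consistency point is easier to see with a toy model. The sketch below is not how DynamoDB is implemented. It is a hypothetical in-memory class (EventualIndexSketch) where writes reach the table right away and reach the index only after a separate propagate() step, which stands in for the short delay before a GSI catches up.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Toy model of asynchronous index propagation; not DynamoDB's actual mechanism.
class EventualIndexSketch {
    private final Set<String> table = new HashSet<>();
    private final Set<String> index = new HashSet<>();
    private final Queue<String> pending = new ArrayDeque<>();

    // A write lands in the base table immediately and is queued for the index.
    void write(String itemId) {
        table.add(itemId);
        pending.add(itemId);
    }

    // Simulates the index catching up with the table.
    void propagate() {
        while (!pending.isEmpty()) {
            index.add(pending.poll());
        }
    }

    boolean tableHas(String itemId) { return table.contains(itemId); }

    boolean indexHas(String itemId) { return index.contains(itemId); }

    public static void main(String[] args) {
        EventualIndexSketch db = new EventualIndexSketch();
        db.write("act-456");
        System.out.println(db.tableHas("act-456") + " " + db.indexHas("act-456")); // true false
        db.propagate();
        System.out.println(db.indexHas("act-456")); // true
    }
}
```

In practice the gap is usually small, but code that writes an item and immediately queries the GSI for it should be prepared for the item to be missing.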
What You Could Also Use (But I Didn’t)
Filtering: Use .withFilterExpression() (or the legacy .withQueryFilter()) to apply conditions on non-key attributes. Filters run after the items are read, so they do not reduce the read capacity consumed.
Pagination: Use .withExclusiveStartKey() to implement infinite scrolling or batch processing.
Sort Order Control: .withScanIndexForward(false) fetches results in descending order.
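Filtering and pagination interact in a way that is easy to miss: the limit counts items evaluated, not items returned, so a filtered page can come back with fewer items than the limit and you still need to follow the cursor. The sketch below models that with plain collections. PagedQuerySketch, Page, and queryAll are hypothetical names, and the cursor handling is simplified compared to the real service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.function.Predicate;

// In-memory model of limit, filter, and cursor-based paging over one index partition.
class PagedQuerySketch {
    static final class Page {
        final List<String> items;
        final String lastEvaluatedKey; // null when nothing is left to read (a simplification)
        Page(List<String> items, String lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    private final NavigableSet<String> sortKeys = new TreeSet<>();

    void add(String sortKey) { sortKeys.add(sortKey); }

    // Evaluates at most `limit` keys after the cursor, and only then applies the filter.
    Page queryPage(int limit, String exclusiveStartKey, Predicate<String> filter) {
        NavigableSet<String> remaining =
            exclusiveStartKey == null ? sortKeys : sortKeys.tailSet(exclusiveStartKey, false);
        List<String> matched = new ArrayList<>();
        String last = null;
        int evaluated = 0;
        for (String key : remaining) {
            if (evaluated == limit) {
                break;
            }
            evaluated++;
            last = key;
            if (filter.test(key)) {
                matched.add(key);
            }
        }
        boolean more = last != null && sortKeys.higher(last) != null;
        return new Page(matched, more ? last : null);
    }

    // Follows the cursor until the partition is exhausted.
    List<String> queryAll(int limit, Predicate<String> filter) {
        List<String> all = new ArrayList<>();
        String cursor = null;
        do {
            Page page = queryPage(limit, cursor, filter);
            all.addAll(page.items);
            cursor = page.lastEvaluatedKey;
        } while (cursor != null);
        return all;
    }
}
```

With keys a through e, a limit of 2, and a filter that rejects a, the first page holds only b even though two items were evaluated. Looping on the cursor, as queryAll does, is what collects the rest.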
Why I Use This Approach
I needed a way to fetch audit history data efficiently based on alternate attributes. Using DynamoDBQueryExpression with GSIs allowed me to:
Avoid full table scans.
Retrieve only the needed records.
Implement pagination for better performance.
This approach scales well and keeps the logic clean and modular.
Final Thought: Choose Indexing Wisely
Using DynamoDBQueryExpression with GSIs provides a clean and efficient way to query your data without incurring the cost of scans. With thoughtful schema design and index usage, DynamoDB can be a highly performant NoSQL solution.
Would love to hear how you are using GSIs in your projects!
Written by

Saurabh Rathi
I am a Java SpringBoot Developer.