Unmarshalling DynamoDB Items from Step Functions


AWS Step Functions introduced two new features: variables and support for JSONata, a lightweight query and transformation language for JSON data. Anyone who has worked with Step Functions knows that this is a real game-changer! When I heard the news, I immediately saw the potential for many things that previously required a Lambda function but are now achievable “natively” in Step Functions.
One common task in Step Functions workflows is fetching items from DynamoDB. However, the getItem or query tasks return the raw “marshalled” data from DynamoDB.
Example:
{
  "id": {
    "S": "1"
  },
  "name": {
    "S": "John"
  },
  "address": {
    "M": {
      "street": {
        "S": "123, 5th Avenue"
      },
      "postCode": {
        "S": "5555"
      },
      "city": {
        "S": "New York"
      }
    }
  },
  "age": {
    "N": "32"
  }
}
This format is not very practical to work with because you need to know and remember the field types in the paths (e.g., $.item.name.S). Additionally, certain values, such as numbers, are encoded as strings (like age above). This makes simple operations like math and comparisons harder.
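To make the string-encoding problem concrete, here is a small TypeScript sketch. It is only an illustration written in TypeScript rather than in a Step Functions expression, and the second value (an age of "9") is made up for the comparison.

```typescript
// Raw DynamoDB attribute values, shaped like the "age" field above.
// otherRawAge is a made-up value used only for the comparison.
const rawAge = { N: "32" };
const otherRawAge = { N: "9" };

// Comparing the raw strings is lexicographic, so "9" is "greater" than "32".
const stringComparison = otherRawAge.N > rawAge.N;

// Converting to numbers first gives the expected numeric answer.
const numericComparison = Number(otherRawAge.N) > Number(rawAge.N);

// With a string operand, "+" concatenates instead of adding.
const concatenated = rawAge.N + 1;
const added = Number(rawAge.N) + 1;

console.log(stringComparison, numericComparison, concatenated, added);
```

Every consumer of the raw item has to remember the conversion, which is the friction that unmarshalling removes.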
With the arrival of JSONata, I started wondering if we could use its function library to "visit" and decode DynamoDB objects (a.k.a. unmarshall them).
Step One: A Simple Proof of Concept
Before getting into the nitty-gritty of Step Functions, I first wanted to make a quick proof of concept to see if JSONata would give us that possibility. Luckily, JSONata has a practical playground to try it out. After some time, I came to this simple solution:
(
  /* Unmarshall a whole item (object), or every item of an array */
  $unmarshall := function ($object) {(
    $type($object) = 'array'
      ? [$map($object, $unmarshall)]
      : $merge($each($object, function ($val, $key) {
          { $key: $convertValue($val) }
        })
      );
  )};
  /* Convert one typed attribute value, e.g. {"N": "32"}, into a native value */
  $convertValue := function ($object) {(
    /* The single key of the object is the DynamoDB type descriptor */
    $type := $keys($object)[0];
    $value := $lookup($object, $type);
    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ? $value
    : $type in ['N'] ? $number($value)
    : $type in ['M'] ? $unmarshall($value)
    : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
    : $type in ['L'] ? [$map($value, $convertValue)]
    : $type in ['NS', 'Ns'] ? [$value.$number()]
    : $type in ['NULL', 'Null', 'Nul'] ? null
    : $error('Unsupported type: ' & $type);
  )};
  $unmarshall($);
)
What’s going on in there?
$unmarshall is a function that takes an object or an array as input. It visits the value and, for each attribute, converts the nested type object into a native value with $convertValue. It does so recursively for nested arrays and maps.
The final result is very similar to the JavaScript version.
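For readers more at home in TypeScript than in JSONata, the same visiting logic can be sketched as two mutually recursive functions. This is my own mirror of the JSONata expression above, not any SDK's actual implementation; the type aliases and the empty-attribute check are assumptions added for the sketch.

```typescript
// Assumed shapes: a typed attribute such as { S: "John" }, and an item mapping names to attributes.
type AttributeValue = Record<string, unknown>;
type Item = Record<string, AttributeValue>;

// Mirrors $convertValue: read the single type key and convert accordingly.
function convertValue(attr: AttributeValue): unknown {
  const type = Object.keys(attr)[0];
  if (type === undefined) throw new Error("Unsupported type: (empty attribute)");
  const value = attr[type];
  switch (type) {
    case "S": case "SS": case "Ss": case "B": case "BS": case "Bs":
      return value;
    case "N":
      return Number(value);
    case "M":
      return unmarshall(value as Item);
    case "BOOL": case "Bool":
      return value === "true" || value === true;
    case "L":
      return (value as AttributeValue[]).map((entry) => convertValue(entry));
    case "NS": case "Ns":
      return (value as string[]).map((entry) => Number(entry));
    case "NULL": case "Null": case "Nul":
      return null;
    default:
      throw new Error(`Unsupported type: ${type}`);
  }
}

// Mirrors $unmarshall: handle an array of items or a single item.
function unmarshall(input: Item | Item[]): unknown {
  if (Array.isArray(input)) return input.map((entry) => unmarshall(entry));
  const result: Record<string, unknown> = {};
  for (const [key, attr] of Object.entries(input)) {
    result[key] = convertValue(attr);
  }
  return result;
}

// The example item from the beginning of the article.
const item: Item = {
  id: { S: "1" },
  name: { S: "John" },
  address: {
    M: {
      street: { S: "123, 5th Avenue" },
      postCode: { S: "5555" },
      city: { S: "New York" },
    },
  },
  age: { N: "32" },
};

const plain = unmarshall(item);
console.log(JSON.stringify(plain, null, 2));
```

The nested address map becomes a plain object and age becomes the number 32, which is the shape the JSONata expression produces for the same input.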
Step Two: Use it With Step Functions
After proving it’s doable, the second step was to make it work within Step Functions.
My first attempt was to use the new Assign property and store the $unmarshall and $convertValue functions into variables of the same name. Then I tried to call them from the Output property.
{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Assign": {
    "unmarshall": "{% function ($object) { ... } %}",
    "convertValue": "{% function ($object) { ... } %}"
  },
  "Output": {
    "result": "{% $unmarshall($states.input.dynamoDbItem) %}"
  }
}
But this did not work, for two reasons:
- As mentioned in this article and the doc, you can’t use variables in the same state that assigns them.
{
  "error": "States.QueryEvaluationError",
  "cause": "The JSONata expression '$unmarshall($states.input.dynamoDbItem)' specified for the field 'Output/result' threw an error during evaluation. T1006: Attempted to invoke a non-function"
}
This is because the Assign and Output fields are evaluated in parallel.
- Assigning functions to variables is not supported.
This is not mentioned anywhere in the doc, but you can’t assign a function to a variable (it must be a “real” value). I learned this when I tried to move the Assign part into a previous state to work around the first limitation.
{
  "error": "States.QueryEvaluationError",
  "cause": "The JSONata expression 'function ($object) { ... }' specified for the field 'Assign/unmarshall' returned an unsupported result type."
}
After some thought and research, I figured that nothing prevents me from putting everything into a single expression. This expression can define the functions and return the final result (just like in the JSONata playground).
And because that expression evaluates to a value, the result would end up in that variable, ready to be used later.
{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Assign": {
    "unmarshalledItem": "{% (
      $unmarshall := function ($object) {(
        $type($object) = 'array' ?
          [$map($object, $unmarshall)]
          : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
          })
        );
      )};
      $convertValue := function ($object) {(
        $type := $keys($object)[0];
        $value := $lookup($object, $type);
        $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ? $value
        : $type in ['N'] ? $number($value)
        : $type in ['M'] ? $unmarshall($value)
        : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
        : $type in ['L'] ? [$map($value, $convertValue)]
        : $type in ['NS', 'Ns'] ? [$value.$number()]
        : $type in ['NULL', 'Null', 'Nul'] ? null
        : $error('Unsupported type: ' & $type);
      )};
      $unmarshall($states.input.dynamoDbItem);
    ) %}"
  }
}
Note: the expression in Assign is shown on multiple lines for readability, but for it to be valid JSON/ASL, it must all go on a single line when deployed to Step Functions.
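If you write the state machine definition by hand, one way to get from the readable multi-line form to a single line is a small helper like the one below. This is a sketch: toSingleLine and the $double expression are hypothetical names made up for the example, and the helper assumes no JSONata string literal spans several lines or relies on leading or trailing whitespace.

```typescript
// Hypothetical helper: trims every line and joins them with single spaces.
// Assumption: no string literal inside the expression spans multiple lines.
function toSingleLine(expression: string): string {
  return expression
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .join(" ");
}

// A short stand-in expression, written on several lines for readability.
const readableExpression = `{% (
  $double := function ($n) { $n * 2 };
  $double($states.input.value);
) %}`;

const singleLine = toSingleLine(readableExpression);

// The collapsed expression can now sit in a JSON string without raw newlines.
const passState = {
  Type: "Pass",
  QueryLanguage: "JSONata",
  Assign: { doubled: singleLine },
};

console.log(JSON.stringify(passState, null, 2));
```

When the definition is produced from code, JSON.stringify alone also keeps the JSON valid, because it escapes raw newlines inside strings as \n.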
Now I can use the $unmarshalledItem variable, which contains the result, anywhere in a later state.
{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Output": {
    "unmarshalledItem": "{% $unmarshalledItem %}"
  }
}
Alternatively, I could also return the result directly in the Output field:
{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Output": {
    "unmarshalledItem": "{% (
      $unmarshall := function ($object) {(
        $type($object) = 'array' ?
          [$map($object, $unmarshall)]
          : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
          })
        );
      )};
      $convertValue := function ($object) {(
        $type := $keys($object)[0];
        $value := $lookup($object, $type);
        $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ? $value
        : $type in ['N'] ? $number($value)
        : $type in ['M'] ? $unmarshall($value)
        : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
        : $type in ['L'] ? [$map($value, $convertValue)]
        : $type in ['NS', 'Ns'] ? [$value.$number()]
        : $type in ['NULL', 'Null', 'Nul'] ? null
        : $error('Unsupported type: ' & $type);
      )};
      $unmarshall($states.input.dynamoDbItem);
    ) %}"
  }
}
Step Three: Create a CDK Construct
With a working proof of concept in hand, I wanted to put it all into a simple reusable CDK construct:
import { CustomState } from "aws-cdk-lib/aws-stepfunctions";
import { Construct } from "constructs";
interface DynamoUnmarshallProps {
  // JSONata path of the raw DynamoDB item to unmarshall
  path: string;
  // Name of the Step Functions variable that receives the result
  variableName: string;
}
// Builds the single JSONata expression that defines both helper functions
// and evaluates to the unmarshalled item found at the given path.
const generateUnmarshall = (path: string) => `{% (
  $unmarshall := function ($object) {(
    $type($object) = 'array' ?
      [$map($object, $unmarshall)]
      : $merge($each($object, function ($val, $key) {
        { $key: $convertValue($val) }
      })
    );
  )};
  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);
    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ? $value
    : $type in ['N'] ? $number($value)
    : $type in ['M'] ? $unmarshall($value)
    : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
    : $type in ['L'] ? [$map($value, $convertValue)]
    : $type in ['NS', 'Ns'] ? [$value.$number()]
    : $type in ['NULL', 'Null', 'Nul'] ? null
    : $error('Unsupported type: ' & $type);
  )};
  $unmarshall(${path});
) %}`;
// A Pass state that stores the unmarshalled item in the requested variable.
export class DynamoUnmarshall extends CustomState {
  constructor(scope: Construct, id: string, props: DynamoUnmarshallProps) {
    const { path, variableName } = props;
    super(scope, id, {
      stateJson: {
        Type: "Pass",
        QueryLanguage: "JSONata",
        Assign: {
          [variableName]: generateUnmarshall(path),
        },
      },
    });
  }
}
The construct takes two input parameters:
- path: the JSONata path of the raw DynamoDB item to unmarshall
- variableName: the name of the variable in which to store the result
const unmarshall = new DynamoUnmarshall(this, "Unmarshall", {
  path: "$states.input.dynamoDbItem",
  variableName: "unmarshalledItem",
});
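Because the construct interpolates path verbatim into the JSONata expression and uses variableName as a JSON key, it may be worth checking both before synthesizing. The guard below is a hypothetical addition, not part of the original construct: findPropsProblems is a made-up name, and the ASCII-only identifier rule is a deliberately conservative assumption (Step Functions may accept a wider range of variable names).

```typescript
// Hypothetical guard, not part of the original construct.
interface DynamoUnmarshallProps {
  path: string;
  variableName: string;
}

// Assumption: a conservative ASCII-only identifier rule. Step Functions may
// accept a wider range of variable names than this.
const SIMPLE_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function findPropsProblems(props: DynamoUnmarshallProps): string[] {
  const problems: string[] = [];
  if (!SIMPLE_IDENTIFIER.test(props.variableName)) {
    problems.push(`variableName "${props.variableName}" is not a simple identifier`);
  }
  // The path ends up inside the JSONata expression, so it should at least
  // look like a JSONata reference...
  if (!props.path.startsWith("$")) {
    problems.push(`path "${props.path}" does not start with "$"`);
  }
  // ...and must not close the {% %} template early.
  if (props.path.includes("%}")) {
    problems.push(`path "${props.path}" contains "%}"`);
  }
  return problems;
}

const ok = findPropsProblems({
  path: "$states.input.dynamoDbItem",
  variableName: "unmarshalledItem",
});
const bad = findPropsProblems({
  path: "states.input.item",
  variableName: "my-item",
});
console.log(ok, bad);
```

In the construct, a check like this could run at the top of the constructor and throw when the returned list is not empty, so mistakes surface at synth time rather than as a States.QueryEvaluationError at run time.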
You can find a fully working example on GitHub.
Conclusion
Unmarshalling DynamoDB items within Step Functions lets developers access their data more easily. Previously, developers needed to remember to include field types in their paths or use a Lambda function. Embedding the logic into a reusable CDK construct simplifies the process by hiding that complexity.
Written by

Benoît Bouré