Unmarshalling DynamoDB Items from Step Functions

Benoît Bouré

AWS Step Functions introduced two new features: variables and support for JSONata. JSONata is a lightweight query and transformation language for JSON data. Anyone who has worked with Step Functions knows that this is a real game-changer! When I heard the news, I immediately saw the potential for many things that previously required a Lambda function but are now achievable “natively” in Step Functions.

One common task in many Step Functions workflows is fetching items from DynamoDB. However, the getItem and query tasks return the raw “marshalled” data from DynamoDB, where every value is wrapped in an object that describes its type.

Example:

{
  "id": {
    "S": "1"
  },
  "name": {
    "S": "John"
  },
  "address": {
    "M": {
      "street": {
        "S": "123, 5th Avenue"
      },
      "postCode": {
        "S": "5555"
      },
      "city": {
        "S": "New York"
      }
    }
  },
  "age": {
    "N": "32"
  }
}

This format is not very practical to work with because you need to remember and know the field types in the paths (e.g., $.item.name.S). Additionally, there's an issue with certain values, such as numbers, which are encoded as strings (like age above). This makes it harder to perform simple operations like math and comparisons.
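To make the number problem concrete, here is a small TypeScript sketch. The two values are made up for illustration; the point is only that the N type carries its value as a string, so a direct comparison is lexicographic rather than numeric.

```typescript
// Two DynamoDB number attributes as they appear in the raw format:
// the N type always carries its value as a string.
const ageA = { N: "32" };
const ageB = { N: "4" };

// Comparing the raw strings is lexicographic: "32" sorts before "4".
const rawComparison = ageA.N > ageB.N;

// Converting to numbers first gives the expected numeric result.
const numericComparison = Number(ageA.N) > Number(ageB.N);

console.log(rawComparison); // false
console.log(numericComparison); // true
```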

With the arrival of JSONata, I started wondering if we could use its Function Library to "visit" and decode DynamoDB objects (a.k.a. unmarshall them).

Step One: A Simple Proof of Concept

Before getting into the nitty-gritty of Step Functions, I first wanted to make a quick proof of concept to see if JSONata would give us that possibility. Luckily, JSONata has a practical playground to try it out. After some time, I came to this simple solution:

(
  $unmarshall := function ($object) {(
      $type($object) = 'array'
        ? [$map($object, $unmarshall)]
        : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
        })
      );
  )};

  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);

    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ?  $value
        : $type in ['N'] ? $number($value)
        : $type in ['M'] ? $unmarshall($value)
        : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
        : $type in ['L'] ? [$map($value, $convertValue)]
        : $type in ['NS', 'Ns'] ? [$value.$number()]
        : $type in ['NULL', 'Null', 'Nul'] ? null
        : $error('Unsupported type: ' & $type);
  )};

  $unmarshall($);
)

What’s going on in there?

$unmarshall is a function that takes an object or an array as input. It visits the value and for each attribute, it converts the nested type objects into native values with $convertValue. It does so recursively for nested arrays and maps.
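For readers more at home in TypeScript than in JSONata, the same traversal can be sketched as follows. This is only an illustrative mirror of the expression above, not the code that runs in Step Functions: the type names AttributeValue and Item are assumptions made for this sketch, and it only handles the canonical type descriptors (S, SS, B, BS, N, NS, M, L, BOOL, NULL).

```typescript
// A DynamoDB attribute value is an object with a single type-descriptor key.
type AttributeValue = { [typeKey: string]: unknown };
type Item = { [attribute: string]: AttributeValue };

// Mirrors $convertValue: read the type key, then decode the wrapped value.
function convertValue(attr: AttributeValue): unknown {
  const type = Object.keys(attr)[0];
  const value = attr[type];

  switch (type) {
    case "S":
    case "SS":
    case "B":
    case "BS":
      return value; // strings, binaries and their sets are kept as-is
    case "N":
      return Number(value);
    case "NS":
      return (value as string[]).map((n) => Number(n));
    case "M":
      return unmarshall(value as Item);
    case "L":
      return (value as AttributeValue[]).map((v) => convertValue(v));
    case "BOOL":
      return value === true || value === "true";
    case "NULL":
      return null;
    default:
      throw new Error(`Unsupported type: ${type}`);
  }
}

// Mirrors $unmarshall: arrays are mapped item by item, objects key by key.
function unmarshall(input: Item | Item[]): unknown {
  if (Array.isArray(input)) {
    return input.map((entry) => unmarshall(entry));
  }
  const result: { [attribute: string]: unknown } = {};
  for (const key of Object.keys(input)) {
    result[key] = convertValue(input[key]);
  }
  return result;
}

// The example item from the beginning of the article.
const item: Item = {
  id: { S: "1" },
  name: { S: "John" },
  address: {
    M: {
      street: { S: "123, 5th Avenue" },
      postCode: { S: "5555" },
      city: { S: "New York" },
    },
  },
  age: { N: "32" },
};

console.log(JSON.stringify(unmarshall(item)));
// {"id":"1","name":"John","address":{"street":"123, 5th Avenue","postCode":"5555","city":"New York"},"age":32}
```

Note how age comes out as the number 32 rather than the string "32", and how the nested address map is decoded by the recursive call.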

The final result is very similar to the JavaScript version.

Try it yourself!

Step Two: Use it With Step Functions

After proving it’s doable, the second step was to make it work within Step Functions.

My first attempt was to use the new Assign property and store the $unmarshall and $convertValue functions into variables of the same name. Then I tried to call them from the Output property.

{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Assign": {
    "unmarshall": "{% function ($object) { ... } %}",
    "convertValue": "{% function ($object) { ... } %}"
  },
  "Output": {
    "result": "{% $unmarshall($states.input.dynamoDbItem) %}"
  }
}

But this did not work, for two reasons:

  1. As mentioned in this article and the doc, you can’t use variables in the same state that assigns them.
{
    "error": "States.QueryEvaluationError",
    "cause": "The JSONata expression '$unmarshall($states.input.dynamoDbItem)' specified for the field 'Output/result' threw an error during evaluation. T1006: Attempted to invoke a non-function"
}

This is because the Assign and Output steps are evaluated in parallel.

  2. Assigning functions to variables is not supported.

This is not mentioned anywhere in the doc, but you can’t assign a function to a variable (it must be a “real” value). I learned this when I tried to move the Assign part into an earlier state to work around the first limitation.

{
    "error": "States.QueryEvaluationError",
    "cause": "The JSONata expression 'function ($object) { ... }' specified for the field 'Assign/unmarshall' returned an unsupported result type."
}

After some thought and research, I figured that nothing prevents me from putting everything into a single expression. This expression can define the functions and return the final result (just like in the JSONata playground).

And because that expression evaluates to a value, the result would end up in that variable, ready to be used later.

{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Assign": {
    "unmarshalledItem": "{% (
  $unmarshall := function ($object) {(
    $type($object) = 'array' ?
      [$map($object, $unmarshall)]
      : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
        })
    );
  )};

  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);

    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ?  $value
      : $type in ['N'] ? $number($value)
      : $type in ['M'] ? $unmarshall($value)
      : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
      : $type in ['L'] ? [$map($value, $convertValue)]
      : $type in ['NS', 'Ns'] ? [$value.$number()]
      : $type in ['NULL', 'Null', 'Nul'] ? null
      : $error('Unsupported type: ' & $type);
  )};

  $unmarshall($states.input.dynamoDbItem);
) %}"
  }
}
🗒 Note: I kept the line breaks inside Assign for readability, but raw line breaks are not allowed inside a JSON string, so for the definition to be valid JSON/ASL the expression must go on a single line when deployed to Step Functions.
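An alternative to collapsing the expression by hand, sketched here under the assumption that the state definition is generated from TypeScript: keep the expression in a multi-line template literal and let JSON.stringify serialize the state. The line breaks are then emitted as \n escapes, which keeps the document valid JSON. The expression below is a shortened, made-up one ($double and the doubled variable are purely illustrative), and the sketch assumes that Step Functions treats the escaped line breaks as ordinary whitespace inside the expression.

```typescript
// A shortened, illustrative JSONata expression kept on several lines.
const expression = `{% (
  $double := function ($n) { $n * 2 };
  $double($states.input.value);
) %}`;

const state = {
  Type: "Pass",
  QueryLanguage: "JSONata",
  Assign: { doubled: expression },
};

// JSON.stringify escapes each line break as \n inside the string value.
const asl = JSON.stringify(state);

// No raw line break is left in the serialized definition.
console.log(asl.indexOf("\n") === -1); // true

// Parsing the definition back restores the original multi-line expression.
console.log(JSON.parse(asl).Assign.doubled === expression); // true
```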

Testing it out:

Now I can use the $unmarshalledItem variable, which contains the result, anywhere in a later state.

{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Output": {
    "unmarshalledItem": "{% $unmarshalledItem %}"
  }
}

Alternatively, I could return the result directly in the Output:

{
  "Type": "Pass",
  "QueryLanguage": "JSONata",
  "Output": {
    "unmarshalledItem": "{% (
  $unmarshall := function ($object) {(
    $type($object) = 'array' ?
      [$map($object, $unmarshall)]
      : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
        })
    );
  )};

  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);

    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ?  $value
      : $type in ['N'] ? $number($value)
      : $type in ['M'] ? $unmarshall($value)
      : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
      : $type in ['L'] ? [$map($value, $convertValue)]
      : $type in ['NS', 'Ns'] ? [$value.$number()]
      : $type in ['NULL', 'Null', 'Nul'] ? null
      : $error('Unsupported type: ' & $type);
  )};

  $unmarshall($states.input.dynamoDbItem);
) %}"
  }
}

Step Three: Create a CDK Construct

With a working proof of concept in hand, I wanted to put it all into a simple reusable CDK construct:

import { CustomState } from "aws-cdk-lib/aws-stepfunctions";
import { Construct } from "constructs";

interface DynamoUnmarshallProps {
  path: string;
  variableName: string;
}

const generateUnmarshall = (path: string) => `{% (
  $unmarshall := function ($object) {(
    $type($object) = 'array' ?
      [$map($object, $unmarshall)]
      : $merge($each($object, function ($val, $key) {
            { $key: $convertValue($val) }
        })
    );
  )};

  $convertValue := function ($object) {(
    $type := $keys($object)[0];
    $value := $lookup($object, $type);

    $type in ['S', 'SS', 'Ss', 'B', 'BS', 'Bs'] ?  $value
      : $type in ['N'] ? $number($value)
      : $type in ['M'] ? $unmarshall($value)
      : $type in ['BOOL', 'Bool'] ? $value = 'true' or $value = true
      : $type in ['L'] ? [$map($value, $convertValue)]
      : $type in ['NS', 'Ns'] ? [$value.$number()]
      : $type in ['NULL', 'Null', 'Nul'] ? null
      : $error('Unsupported type: ' & $type);
  )};

  $unmarshall(${path});
) %}`;

export class DynamoUnmarshall extends CustomState {
  constructor(scope: Construct, id: string, props: DynamoUnmarshallProps) {
    const { path, variableName } = props;

    super(scope, id, {
      stateJson: {
        Type: "Pass",
        QueryLanguage: "JSONata",
        Assign: {
          [variableName]: generateUnmarshall(path),
        },
      },
    });
  }
}

The construct takes two input parameters:

  • path: the JSONata path of the raw DynamoDB item to unmarshall

  • variableName: the name of the variable in which to store the result

const unmarshall = new DynamoUnmarshall(this, "Unmarshall", {
  path: "$states.input.dynamoDbItem",
  variableName: "unmarshalledItem",
});

You can find a fully working example on GitHub.

Please note that this was for experimental and learning purposes only. I do not guarantee that this code will work in all cases. I did not test this thoroughly and it should not be considered production-ready.

Conclusion

Unmarshalling DynamoDB items within Step Functions gives developers easier access to their data. Previously, they had to remember to include the field types in their paths or fall back on a Lambda function. Embedding the logic in a reusable CDK construct hides that complexity and simplifies the process.
