AWS Step Functions orchestrates workflows by coordinating multiple AWS services. A workflow is made up of states, many of which are tasks that call other services, and each service expects different input parameters. As the workflow moves from one state to the next, Step Functions needs a way to manipulate and filter the JSON that flows between them.
In this blog post, we will look at the fields that can be used to manipulate and filter this JSON.
In a Step Functions state definition, we can use the fields below to filter and control the flow of JSON from state to state.
- InputPath
- Parameters
- ResultSelector
- ResultPath
- OutputPath
Step Functions uses the Amazon States Language to define workflows. In the Amazon States Language:
- A path is a string beginning with $ that we can use to identify components within JSON text.
- A reference path is a path that identifies a single node in a JSON structure using dot (.) and square bracket ([ ]) notation.
For example, if state input data contains the following values:
{
"companyName": "XYZ",
"employeeName": ["John", "Eric", "William"],
"department": {
"software": true
}
}
The reference paths below would then resolve as follows.
$.companyName => "XYZ"
$.employeeName => ["John", "Eric", "William"]
$.department.software => true
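To make the path rules above concrete, here is a minimal Python sketch of how a reference path could be resolved against the example input. The `resolve` function is a hypothetical helper written for this post, not part of any AWS SDK, and it only handles dot and numeric square bracket notation.

```python
import re

def resolve(path, data):
    """Resolve a simple reference path such as $.a.b or $.items[0] against data."""
    if not path.startswith("$"):
        raise ValueError("a path must begin with $")
    node = data
    # Each token after the leading $ is either .name or [index].
    for name, index in re.findall(r"\.(\w+)|\[(\d+)\]", path[1:]):
        node = node[name] if name else node[int(index)]
    return node

state_input = {
    "companyName": "XYZ",
    "employeeName": ["John", "Eric", "William"],
    "department": {"software": True},
}

print(resolve("$.companyName", state_input))          # XYZ
print(resolve("$.employeeName[1]", state_input))      # Eric
print(resolve("$.department.software", state_input))  # True
```

A bare `$` has no tokens after it, so the sketch returns the whole input unchanged.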
InputPath
InputPath is used to select a portion of the state input. For example, suppose the input to your state includes the following.
{
"comment": "InputPath Example",
"employee": {
"firstName": "Rahul",
"lastName": "Lokurte",
"age": 32
},
"address": {
"home": "Home address",
"office": "Office address"
},
"country": "India"
}
If we are interested only in the address of the employee, we can set "InputPath": "$.address"
With this InputPath, only the JSON below is passed on as the input for the state's task.
{
"home": "Home address",
"office": "Office address"
}
We can verify this behavior with the Data flow simulator in the AWS console.
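As a rough local illustration of what InputPath does, the Python sketch below selects the same portion of the input. Both `get_path` and `apply_input_path` are hypothetical helpers, not AWS APIs, and the sketch assumes plain dotted paths without array indexes.

```python
import json

def get_path(data, path):
    """Follow a dotted reference path such as $.address (no array indexes here)."""
    node = data
    for key in path.lstrip("$").split("."):
        if key:  # skip the empty piece produced by the leading "."
            node = node[key]
    return node

def apply_input_path(state_input, input_path="$"):
    # "$" selects the whole input, so leaving InputPath out changes nothing.
    return get_path(state_input, input_path)

state_input = {
    "comment": "InputPath Example",
    "employee": {"firstName": "Rahul", "lastName": "Lokurte", "age": 32},
    "address": {"home": "Home address", "office": "Office address"},
    "country": "India",
}

print(json.dumps(apply_input_path(state_input, "$.address"), indent=2))
```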
Parameters
With the Parameters field, we can create a collection of key-value pairs that is passed as input. Each value can be a static value, or it can be selected from the input or the context object using a path. For key-value pairs where the value is selected using a path, the key name must end in .$.
For example, suppose you provide the following input.
{
"comment": "Parameters Example",
"employee": {
"firstName": "Rahul",
"lastName": "Lokurte",
"age": 32
},
"address": {
"home": "Home address",
"office": "Office address"
},
"country": "India"
}
We can select some of the fields using Parameters as below.
"Parameters": {
"comment": "Parameters Example",
"employeeDetails": {
"firstName.$": "$.employee.firstName",
"lastName.$": "$.employee.lastName",
"continent": "Asia"
}
}
Given the previous input and the Parameters field, this is the JSON that is passed.
{
"comment": "Parameters Example",
"employeeDetails": {
"firstName": "Rahul",
"lastName": "Lokurte",
"continent": "Asia"
}
}
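In the above output, the static field continent is added, while firstName and lastName are derived from the input. Also notice that these two keys end in .$ in the Parameters field. That suffix tells Step Functions to treat the value as a path into the input JSON rather than as a literal string, and the suffix is dropped from the key in the resulting JSON.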
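The substitution rule for Parameters can be sketched in Python as below. This is a simplified illustration, not how Step Functions is implemented: `get_path` and `apply_parameters` are hypothetical helpers, and the sketch ignores context object paths and intrinsic functions.

```python
def get_path(data, path):
    """Follow a dotted reference path such as $.employee.firstName."""
    node = data
    for key in path.lstrip("$").split("."):
        if key:
            node = node[key]
    return node

def apply_parameters(template, state_input):
    """Build the task input from a Parameters template."""
    result = {}
    for key, value in template.items():
        if key.endswith(".$"):
            # Key ends in .$: the value is a path, and the suffix is dropped.
            result[key[:-2]] = get_path(state_input, value)
        elif isinstance(value, dict):
            # Nested objects are processed with the same rule.
            result[key] = apply_parameters(value, state_input)
        else:
            # Anything else is a static value copied as-is.
            result[key] = value
    return result

state_input = {
    "comment": "Parameters Example",
    "employee": {"firstName": "Rahul", "lastName": "Lokurte", "age": 32},
    "address": {"home": "Home address", "office": "Office address"},
    "country": "India",
}
parameters = {
    "comment": "Parameters Example",
    "employeeDetails": {
        "firstName.$": "$.employee.firstName",
        "lastName.$": "$.employee.lastName",
        "continent": "Asia",
    },
}

print(apply_parameters(parameters, state_input))
```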
ResultSelector
When the task of a state completes and we would like to manipulate its result, we use the ResultSelector field. With the ResultSelector field, we can create a collection of key-value pairs, where the values are static or selected from the task's result. The output of ResultSelector replaces the task's result and is passed to ResultPath.
A Step Functions task usually returns metadata along with the response. If we want only a few of the fields from that result, we can use ResultSelector to extract them.
For example, suppose we get the output as given below.
{
"output": {
"RequestId": "ABCDEFGH",
"ExecutedVersion": "$LATEST",
"Payload": {
"statusCode": 200,
"body": ["Hello World"]
},
"SdkHttpMetadata": {
"AllHttpHeaders": {
"X-Amz-Executed-Version": ["$LATEST"],
"x-amzn-Remapped-Content-Length": ["0"],
"Connection": ["keep-alive"],
"x-amzn-RequestId": ["123456"],
"Content-Length": ["28"],
"Content-Type": ["application/json"]
},
"StatusCode": 200
},
"outputDetails": {
"truncated": false
}
},
"resource": "invoke",
"resourceType": "lambda"
}
If we are interested only in resourceType and RequestId, we can use the ResultSelector as below.
"ResultSelector": {
"RequestId.$": "$.output.RequestId",
"ResourceType.$": "$.resourceType"
},
With the given task result, using ResultSelector produces the following.
{
"RequestId": "ABCDEFGH",
"ResourceType": "lambda"
}
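ResultSelector follows the same key convention as Parameters, except that paths are resolved against the task's result instead of the state input. Below is a small Python sketch of that idea using a trimmed version of the result above. The helpers `get_path` and `apply_result_selector` are hypothetical, and the sketch only handles a flat selector.

```python
def get_path(data, path):
    """Follow a dotted reference path such as $.output.RequestId."""
    node = data
    for key in path.lstrip("$").split("."):
        if key:
            node = node[key]
    return node

def apply_result_selector(selector, task_result):
    """Pick fields out of the task result (flat selectors only in this sketch)."""
    selected = {}
    for key, value in selector.items():
        if key.endswith(".$"):
            selected[key[:-2]] = get_path(task_result, value)  # path into the result
        else:
            selected[key] = value                              # static value
    return selected

# Trimmed version of the task result shown above.
task_result = {
    "output": {
        "RequestId": "ABCDEFGH",
        "Payload": {"statusCode": 200, "body": ["Hello World"]},
    },
    "resource": "invoke",
    "resourceType": "lambda",
}
selector = {
    "RequestId.$": "$.output.RequestId",
    "ResourceType.$": "$.resourceType",
}

print(apply_result_selector(selector, task_result))
```

Everything that is not named in the selector, such as the payload and the HTTP metadata, is dropped before ResultPath is applied.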
ResultPath
ResultPath controls how the result of a task is combined with the input that was given to the state. We may want to pass on only the result, or we may want to keep the original input and add the result to it. We can handle all of these scenarios using the ResultPath field.
Suppose we have the input to the task as below.
{
"comment": "ResultPath Example",
"employee": {
"firstName": "Rahul",
"lastName": "Lokurte",
"age": 32
},
"address": {
"home": "Home address",
"office": "Office address"
},
"country": "India"
}
Suppose we pass this input to a Lambda function that generates a unique employee ID and returns the following result.
{
"employeeId": "1234"
}
If we want both the input data that was given to the Lambda function and the result of the function, we can use "ResultPath": "$.TaskResult"
The output then includes the original input, with the result of the Lambda function added under the TaskResult key.
{
"comment": "ResultPath Example",
"employee": {
"firstName": "Rahul",
"lastName": "Lokurte",
"age": 32
},
"address": {
"home": "Home address",
"office": "Office address"
},
"country": "India",
"TaskResult": {
"employeeId": "1234"
}
}
We can also add the result of the Lambda function to a child node of the input by using "ResultPath": "$.employee.outputlambda"
The output of the Lambda function is inserted as a child of the employee node in the input.
{
"comment": "ResultPath Example",
"employee": {
"firstName": "Rahul",
"lastName": "Lokurte",
"age": 32,
"outputlambda": {
"employeeId": "1234"
}
},
"address": {
"home": "Home address",
"office": "Office address"
},
"country": "India"
}
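The two ResultPath scenarios above can be reproduced with the Python sketch below. `apply_result_path` is a hypothetical helper, not an AWS API. It also covers two cases not shown above: the default of `$`, where the result replaces the input entirely, and a null ResultPath, where the result is discarded and the input is passed through.

```python
import copy

def apply_result_path(state_input, result, result_path="$"):
    """Combine the task result with the state input according to ResultPath."""
    if result_path is None:   # null: discard the result, keep the input
        return state_input
    if result_path == "$":    # default: the result replaces the input
        return result
    output = copy.deepcopy(state_input)  # leave the caller's input untouched
    keys = result_path[2:].split(".")    # "$.employee.outputlambda" -> two keys
    node = output
    for key in keys[:-1]:
        node = node.setdefault(key, {})  # create intermediate nodes if missing
    node[keys[-1]] = result
    return output

task_input = {
    "comment": "ResultPath Example",
    "employee": {"firstName": "Rahul", "lastName": "Lokurte", "age": 32},
    "address": {"home": "Home address", "office": "Office address"},
    "country": "India",
}
lambda_result = {"employeeId": "1234"}

print(apply_result_path(task_input, lambda_result, "$.TaskResult"))
print(apply_result_path(task_input, lambda_result, "$.employee.outputlambda"))
```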
OutputPath
We can select a portion of the state's output and pass it to the next state using OutputPath. This helps us filter out irrelevant information and pass on only the portion of the JSON that is relevant.
Suppose we have the output as below.
{
"comment": "ResultPath Example",
"employee": {
"firstName": "Rahul",
"lastName": "Lokurte",
"age": 32
},
"address": {
"home": "Home address",
"office": "Office address"
},
"country": "India",
"TaskResult": {
"employeeId": "1234"
}
}
If we want only the information about the employee, we can use "OutputPath": "$.employee" and we get the below response.
{
"firstName": "Rahul",
"lastName": "Lokurte",
"age": 32
}
If we don't specify an OutputPath, the default value is $. This passes the entire JSON to the next state.
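To see how OutputPath fits with the other four fields, the Python sketch below applies them in the order Step Functions uses: InputPath, Parameters, the task itself, ResultSelector, ResultPath, and finally OutputPath. Everything here is a hypothetical illustration: `run_state`, its helpers, and `fake_task` (a stand-in for a Lambda function) are not AWS APIs, and only simple dotted paths are supported.

```python
import copy

def get_path(data, path):
    """Follow a dotted reference path such as $.employee."""
    node = data
    for key in path.lstrip("$").split("."):
        if key:
            node = node[key]
    return node

def fill_template(template, source):
    """Shared rule for Parameters and ResultSelector: keys ending in .$ hold paths."""
    out = {}
    for key, value in template.items():
        if key.endswith(".$"):
            out[key[:-2]] = get_path(source, value)
        elif isinstance(value, dict):
            out[key] = fill_template(value, source)
        else:
            out[key] = value
    return out

def put_path(data, path, value):
    """Insert value into a copy of data at path; "$" replaces data entirely."""
    if path == "$":
        return value
    out = copy.deepcopy(data)
    keys = path[2:].split(".")
    node = out
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return out

def run_state(state, raw_input, task):
    effective_input = get_path(raw_input, state.get("InputPath", "$"))
    if "Parameters" in state:
        effective_input = fill_template(state["Parameters"], effective_input)
    result = task(effective_input)
    if "ResultSelector" in state:
        result = fill_template(state["ResultSelector"], result)
    # ResultPath combines the result with the raw state input, not the filtered one.
    combined = put_path(raw_input, state.get("ResultPath", "$"), result)
    return get_path(combined, state.get("OutputPath", "$"))

def fake_task(payload):
    # Hypothetical stand-in for a Lambda function that issues an employee ID.
    return {"employeeId": "1234", "receivedName": payload["name"]}

state = {
    "InputPath": "$.employee",
    "Parameters": {"name.$": "$.firstName", "continent": "Asia"},
    "ResultSelector": {"id.$": "$.employeeId"},
    "ResultPath": "$.employee.result",
    "OutputPath": "$.employee",
}
raw_input = {
    "employee": {"firstName": "Rahul", "lastName": "Lokurte", "age": 32},
    "country": "India",
}

print(run_state(state, raw_input, fake_task))
```

With this state definition, only the employee node, now including the selected result, reaches the next state. The country field is filtered out by OutputPath.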
Conclusion
In this blog post, we saw the fields provided by the Amazon States Language that can be used to manipulate and filter JSON in a Step Functions workflow. Depending on the requirements of a specific task, we can pass it only a portion of the input JSON, and we can send only a portion of its output JSON to the next state.