Lambda Execution Models

Since returning to Turner, I’ve been implementing a lot more tools in Lambda, DynamoDB and Step Functions, which has been a somewhat new experience.

In many cases I need to execute a function 100+ times. Step Functions doesn’t offer a method of fanning out N times with a different event for each step. There are two methods I’ve used to solve this issue.

99 Bottles of Beer

The 99 Bottles model leverages Step Functions. A function generates a list of items to work on. The list is passed to the worker function as an array. The worker function “takes one down” and does what needs to be done. The function exits with the list passed in, minus the item completed. When the list is empty, the function sets a completed flag, which the Step Functions state machine uses to move on to the next step.

StateMachine

  • load-account generates the list of items to process
  • more-to-do is the logic that determines whether it’s time to move on to the next step. It checks the completed flag as follows:
"more-to-do": {
    "Type": "Choice",
    "Choices": [{
            "Variable": "$.Complete",
            "BooleanEquals": true,
            "Next": "write-to-s3"
        },
        {
            "Variable": "$.Complete",
            "BooleanEquals": false,
            "Next": "do-account"
        }
    ],
    "Default": "error-handler"
}
  • do-account is the function that does the actual work. When the list of items is empty, it sets event['Complete'] to true.
  • write-to-s3 is the completion function that executes once all the items have been processed
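
The do-account step described above could look roughly like the following. This is a minimal sketch, not the actual function: the "Items" key and the process_item helper are assumptions made for illustration, and only the Complete flag comes from the state machine definition.

```python
def process_item(item):
    """Hypothetical placeholder for the real per-item work."""
    return {"item": item, "status": "done"}


def lambda_handler(event, context):
    """Take one item off the list, process it, and return the updated state."""
    # "Items" is an assumed key name for the list built by load-account.
    items = list(event.get("Items", []))

    if items:
        item = items.pop(0)  # "take one down"
        process_item(item)

    # Hand the shortened list back so the more-to-do choice can loop again.
    # Complete becomes true once nothing is left to process.
    return {**event, "Items": items, "Complete": len(items) == 0}
```

Because the whole list travels through the state on every iteration, this approach is probably best suited to lists that stay well within the Step Functions payload size limit.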

Spray & Pray

While the 99 Bottles model processes items serially, the Spray & Pray model processes items in parallel. The parallelization leverages an SNS Topic to which the worker function is subscribed. The trigger function generates the list of items to be worked and publishes an SNS message for each item. The worker function is triggered by the SNS message. All N items are processed at the same time.
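
A rough sketch of both halves of this pattern follows. The function names and the process_item helper are hypothetical; the SNS client is passed in as a parameter so the publishing logic can be exercised without AWS. In a real trigger function it would presumably be boto3.client("sns"), with the topic ARN read from configuration.

```python
import json


def publish_items(sns_client, topic_arn, items):
    """Publish one SNS message per item and return how many were sent."""
    for item in items:
        sns_client.publish(TopicArn=topic_arn, Message=json.dumps(item))
    return len(items)


def process_item(item):
    """Hypothetical placeholder for the real per-item work."""
    return {"item": item, "status": "done"}


def worker_handler(event, context):
    """Invoked by SNS; each record carries one item in Sns.Message."""
    return [
        process_item(json.loads(record["Sns"]["Message"]))
        for record in event["Records"]
    ]
```

Each published message results in its own worker invocation, which is where the parallelism comes from.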

The disadvantage of Spray & Pray is that there is no (easy) way to wait for completion and capture the results of all N functions. Step Functions can wait for the worker function’s timeout interval, but if the results are needed, they have to be stored somewhere, such as DynamoDB or possibly an SQS Queue.
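
One way the DynamoDB option might look is sketched below. Everything about the table layout is an assumption: a partition key named batch_id, a sort key named item_id, and a table object such as the one returned by boto3.resource("dynamodb").Table(...). Each worker stores its result, and a later step counts the stored results to decide whether the batch has finished.

```python
def store_result(table, batch_id, item_id, result):
    """Write one worker's result, keyed by batch and item (assumed schema)."""
    table.put_item(
        Item={"batch_id": batch_id, "item_id": item_id, "result": result}
    )


def batch_is_complete(table, batch_id, expected_count):
    """Return True once every expected result for the batch has been written."""
    response = table.query(
        KeyConditionExpression="batch_id = :b",
        ExpressionAttributeValues={":b": batch_id},
        Select="COUNT",  # only the count is needed, not the items themselves
    )
    # Note: a single query covers one page of results; a large batch would
    # need to follow LastEvaluatedKey and sum the counts.
    return response["Count"] >= expected_count
```

A state machine could call something like batch_is_complete in a wait-and-check loop instead of relying only on the worker timeout.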