As always, AWS announced loads of new features and services at re:Invent 2022. I’ve covered the data-related announcements in a previous post, and today I’ll be sharing some highlights from the serverless space.
Rather than cover every announcement and write what could very easily become a short novel, I'm going to focus on announcements for two of our favourite services here at Inawisdom: Amazon EventBridge and AWS Step Functions.
For those who haven’t experienced these services before – here’s a quick intro…
Amazon EventBridge is a service that makes event-driven architectures (EDA) in the cloud easier to implement and maintain. A key benefit of EDA is that large, distributed systems can be built from small, loosely coupled components, enabling greater agility and devolving responsibility to the teams that own each service. For example, when a new data asset is made available in an organisation, a DataAssetPublished event could be published. Other services could respond to this event asynchronously to carry out any number of tasks (e.g. sending a notification or starting an ETL pipeline).
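To make that concrete, here's a minimal sketch of what publishing such an event might look like. The bus name, event source and detail fields are illustrative assumptions, not part of any real system:

```python
import json

def build_data_asset_event(asset_id: str, location: str) -> dict:
    """Build an EventBridge PutEvents entry for a hypothetical
    DataAssetPublished event."""
    return {
        "Source": "com.example.data-platform",   # assumed custom event source
        "DetailType": "DataAssetPublished",
        "Detail": json.dumps({"assetId": asset_id, "location": location}),
        "EventBusName": "data-platform-bus",     # assumed custom event bus
    }

entry = build_data_asset_event("sales-2022", "s3://example-bucket/sales/2022/")
# boto3.client("events").put_events(Entries=[entry]) would publish it;
# any number of rules on the bus could then react to it independently.
```

Each consumer subscribes via its own rule, so adding a new reaction to the event needs no change to the publisher.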
AWS Step Functions is a managed way to implement state machines in AWS, letting you orchestrate workflows without writing additional custom application logic. It comes with built-in visualisation and error handling that make monitoring and debugging state machine executions more accessible. At Inawisdom, we use Step Functions extensively – not least in our Data Ingestion Framework accelerator, which helps you get data from its source to your data lake or data warehouse in a short space of time.
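For readers new to the service, a state machine is defined in Amazon States Language (ASL). Here's a minimal two-state sketch as a Python dict; the state names, ARNs and parameters are illustrative:

```python
import json

# A minimal ASL definition: copy an S3 object, then publish a notification.
# Bucket, key and topic names are assumptions for illustration only.
definition = {
    "Comment": "Copy an object, then notify on completion",
    "StartAt": "CopyObject",
    "States": {
        "CopyObject": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:s3:copyObject",
            "Parameters": {
                "Bucket": "target-bucket",
                "Key": "data.csv",
                "CopySource": "source-bucket/data.csv",
            },
            "Next": "Notify",
        },
        "Notify": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {
                "TopicArn": "arn:aws:sns:eu-west-1:123456789012:ingest-done",
                "Message": "Copy complete",
            },
            "End": True,
        },
    },
}

asl_json = json.dumps(definition)  # the JSON you'd pass when creating the state machine
```

Retry and Catch blocks can be attached to each Task to get the built-in error handling mentioned above.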
Without further ado, let’s look at some announcements! Some of these were technically announced during pre:Invent, but they’re new nonetheless.
First up, EventBridge Pipes. Some context first – AWS have been on a definite trend towards tighter integration between their services, reducing or removing the need for 'glue' code. EventBridge Pipes is another big step in this direction: a new feature for connecting a source service directly to a target, with optional processing steps in between.
At launch, there are 6 different options for the source of the pipe and 15 options for the target. An example would be a pipe connecting an SQS queue to a Step Functions state machine. Previously, implementing this would need a Lambda function (or another way of running code) in the middle to consume messages from the queue and start an execution. EventBridge Pipes makes this much tidier and easier to monitor – plus, it's one less piece of application code to maintain!
Pipes aren't limited to just passing raw data from a source to a target; two additional features are worth mentioning: filtering and enrichment. Filtering is straightforward: you define a pattern for the messages that should be allowed through the pipe, and messages that don't match are discarded. Enrichment adds data to each message before it reaches the target – for example, a Lambda function could make an API call to retrieve additional data related to the message, and the enriched message is then passed as input to the target. I think this is a great new addition and I'm looking forward to using it on new projects. I've published a deeper dive on my personal blog.
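A quick sketch of what those two steps might look like, under stated assumptions: the filter pattern follows EventBridge's event-pattern syntax, and the enrichment is a Lambda handler that receives a batch of events and returns the enriched batch. The field names and the owner lookup are hypothetical:

```python
import json

# Pipe filter pattern (EventBridge event-pattern syntax): only SQS messages
# whose JSON body has eventType == "DataAssetPublished" pass through.
filter_pattern = {"body": {"eventType": ["DataAssetPublished"]}}

def look_up_owner(asset_id):
    # Placeholder for a real external API call (illustrative data only).
    return {"sales-2022": "analytics"}.get(asset_id, "unknown")

def handler(events, context=None):
    """Enrichment Lambda: Pipes invokes it with the batch of filtered
    events and forwards whatever it returns to the target."""
    enriched = []
    for event in events:
        body = json.loads(event["body"])  # SQS message bodies arrive as strings
        body["ownerTeam"] = look_up_owner(body.get("assetId"))
        enriched.append(body)
    return enriched
```

The target then receives the enriched payload rather than the raw SQS message.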
EventBridge Scheduler is another addition to the EventBridge suite of services. It’s a specialised service for managing both recurring and one-off scheduled tasks.
Recurring tasks were always possible with a standard EventBridge rule; however, EventBridge Scheduler adds significant value with support for time zones and daylight saving time. A task can be set to run at 8am London time every day, with no need to manually change the schedule when the clocks change. Very handy!
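As a sketch, the arguments for such a schedule might look like the following; the schedule name and ARNs are illustrative assumptions:

```python
# Arguments for EventBridge Scheduler's create_schedule call: 8am
# Europe/London every day, with daylight saving handled by the service.
# Names and ARNs are hypothetical.
schedule = {
    "Name": "daily-report",
    "ScheduleExpression": "cron(0 8 * * ? *)",
    "ScheduleExpressionTimezone": "Europe/London",
    "FlexibleTimeWindow": {"Mode": "OFF"},
    "Target": {
        "Arn": "arn:aws:lambda:eu-west-1:123456789012:function:daily-report",
        "RoleArn": "arn:aws:iam::123456789012:role/scheduler-invoke-role",
    },
}
# boto3.client("scheduler").create_schedule(**schedule) would create it.
```

With a plain EventBridge rule, that cron expression could only be interpreted in UTC.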
Scheduling of one-off tasks is a new feature, something that previously would have required a custom implementation. This means that applications can now set up a task to run at a given time in the future without any additional management. The service supports tens of millions of tasks, ensuring that even the largest applications can take advantage.
Another useful feature within EventBridge Scheduler is flexible time windows, which enables you to schedule a task to be carried out at some point within a set time period, rather than having to define an exact time. This is useful in cases where downstream resources are prone to being overloaded and it is beneficial to spread the load across a longer period of time.
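A one-off schedule combines both of those ideas: an at() expression for a single future run, plus a flexible window so the service can pick the exact start time within it. The name, timestamp and ARNs below are illustrative:

```python
# A one-off EventBridge Scheduler schedule with a 15-minute flexible
# window, letting the service smooth load on the downstream target.
one_off = {
    "Name": "send-renewal-reminder-1234",            # hypothetical name
    "ScheduleExpression": "at(2023-06-01T09:00:00)", # fires once
    "ScheduleExpressionTimezone": "Europe/London",
    "FlexibleTimeWindow": {"Mode": "FLEXIBLE", "MaximumWindowInMinutes": 15},
    "Target": {
        "Arn": "arn:aws:lambda:eu-west-1:123456789012:function:send-reminder",
        "RoleArn": "arn:aws:iam::123456789012:role/scheduler-invoke-role",
    },
}
```

With millions of these, the flexible window stops a popular minute of the day becoming a thundering herd.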
A feature that we saw come to Step Functions earlier in 2022 was integration with the AWS SDK, so that AWS API calls could be made without ‘glue’ code in the middle. EventBridge Scheduler now makes it easy to schedule these API calls without having to invoke a Lambda function.
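These direct API calls are made via Scheduler's "universal target", whose ARN takes the form arn:aws:scheduler:::aws-sdk:&lt;service&gt;:&lt;apiAction&gt;. A sketch, with an illustrative instance ID and role:

```python
import json

# A Scheduler universal target: call an AWS API on a schedule with no
# Lambda in between. This (hypothetical) example stops an EC2 instance;
# Input carries the API call's request parameters as JSON.
target = {
    "Arn": "arn:aws:scheduler:::aws-sdk:ec2:stopInstances",
    "RoleArn": "arn:aws:iam::123456789012:role/scheduler-ec2-role",
    "Input": json.dumps({"InstanceIds": ["i-0123456789abcdef0"]}),
}
```

The role only needs permission for that one API action, which keeps the blast radius small.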
Any system that requires scheduled actions to take place (I’m not sure I’ve designed one at Inawisdom that doesn’t) can make use of EventBridge Scheduler. Based on the current trajectory, EventBridge is going from strength to strength and cementing itself as a cornerstone of AWS architectures.
Step Functions Distributed Map
Time to move from events to state machines. Of course, the go-to for implementing state machines is Step Functions. Step Functions is a great way to orchestrate workflows made up of tasks that interact with a wide variety of AWS services.
One exceptionally useful feature of Step Functions is the Map state – it iterates over a list of items, concurrently if desired, running a set of states for each item. For example, the input to a state machine could be a list of S3 object URIs; a Map state could then invoke a Lambda function for each URI to carry out a particular task. There were limitations to this, however. Step Functions supported up to around 40 concurrent iterations in a Map state unless you got into the world of nested state machines (adding complexity). Step Functions also has hard quotas on how much data can be passed between states (256 KB) and the number of events that can take place in an execution (25,000). These quotas seem high, but larger workloads could quite easily breach them.
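The S3 URI example above might look like this as an inline Map state; function and state names are illustrative:

```python
# An inline Map state iterating over $.objectUris from the state input,
# invoking a (hypothetical) Lambda for each URI, up to 10 at a time.
map_state = {
    "Type": "Map",
    "ItemsPath": "$.objectUris",
    "MaxConcurrency": 10,
    "Iterator": {
        "StartAt": "ProcessObject",
        "States": {
            "ProcessObject": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": "process-object",
                    "Payload": {"uri.$": "$"},  # each iteration's input is one URI
                },
                "End": True,
            }
        },
    },
    "End": True,
}
```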
Enter Step Functions Distributed Map, a variant of the usual Map state that is designed to support massively parallel workloads at a greater scale than was previously possible. There are three key features I want to mention.
The first is that an S3 object can be designated as the input to the state. This means that an earlier task in your state machine could generate a CSV or JSON file with up to 100,000,000 records, store it in S3, and use that file's contents as the input to the Distributed Map state. With this, your state machine could then churn through the contents of that file, executing a set of states for each row. You might be thinking "But Alex, won't 100 million records take ages to process at a concurrency of 40?" Keep on reading…
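In ASL terms this is the ItemReader field on a Distributed Map state. A sketch, with illustrative bucket, key and function names:

```python
# A Distributed Map state reading its items from a CSV file in S3 rather
# than from the state input. Names and ARNs are illustrative.
distributed_map = {
    "Type": "Map",
    "ItemProcessor": {
        "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
        "StartAt": "ProcessRow",
        "States": {
            "ProcessRow": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": "process-row", "Payload.$": "$"},
                "End": True,
            }
        },
    },
    "ItemReader": {
        "Resource": "arn:aws:states:::s3:getObject",
        "ReaderConfig": {"InputType": "CSV", "CSVHeaderLocation": "FIRST_ROW"},
        "Parameters": {"Bucket": "example-bucket", "Key": "records.csv"},
    },
    "MaxConcurrency": 1000,
    "End": True,
}
```

Each CSV row becomes the input of one child execution, so the 256 KB payload quota applies per row rather than to the whole file.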
With Distributed Map states, maximum concurrency has been increased significantly: up to 10,000 parallel executions are now possible. Whilst this is great, keep in mind that you'll ultimately be limited by your downstream resources. For example, writing 10,000 objects to S3 in parallel wouldn't be an issue; however, opening 10,000 connections to Redshift to insert records won't end well.
The final feature to mention is improved control over the output of the Distributed Map state. With the standard Map state, the outputs of all iterations are accumulated and returned to the state machine: if the iterations output 1, 2 and 3, the output of the Map state is [1, 2, 3]. When working at scale, this can breach the 256 KB payload quota. With the Distributed Map state, you can instead write the results to an S3 bucket, without having to write custom application code in a Lambda function (or similar).
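This is configured with a ResultWriter field on the Distributed Map state; the bucket and prefix below are illustrative:

```python
# ResultWriter on a Distributed Map state: iteration results are written
# to S3 instead of being accumulated in the (256 KB-limited) state output.
result_writer = {
    "ResultWriter": {
        "Resource": "arn:aws:states:::s3:putObject",
        "Parameters": {"Bucket": "example-results-bucket", "Prefix": "map-output/"},
    }
}
```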
Our CTO of AI/ML, Phil Basford, will be publishing an article all about Step Functions Distributed Map soon.
Step Functions Cross-Account Tasks
The last announcement I want to talk about is the ability for a state machine to execute a task against a different account, by assuming an IAM role in that account from the state machine's own execution role.
The previous pattern for carrying out most cross-account activities within a state machine would be to use some form of ‘glue’ code – this would often be a Lambda function, or some other form of compute such as an ECS/EKS task. Whilst there was nothing inherently wrong with this approach, it was another piece of code – another piece of infrastructure to monitor and manage.
Just before re:Invent, AWS announced the ability to define an IAM Role ARN for the state machine to assume when it carries out a particular task. This makes it significantly easier to orchestrate cross-account workflows without having to implement any additional logic. There are lots of use cases I can think of where this is going to simplify architectures, making solutions quicker to build and easier to maintain.
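In ASL this is the new Credentials field on a Task state. A sketch, with illustrative account IDs, role and bucket names:

```python
# A Task state that assumes a role in another (hypothetical) account
# before running: here, writing an object to a bucket owned by account
# 222222222222.
cross_account_task = {
    "Type": "Task",
    "Resource": "arn:aws:states:::aws-sdk:s3:putObject",
    "Credentials": {"RoleArn": "arn:aws:iam::222222222222:role/cross-account-writer"},
    "Parameters": {
        "Bucket": "other-account-bucket",
        "Key": "handoff.json",
        "Body": "{}",
    },
    "End": True,
}
```

The target account still controls access via that role's trust policy, so the usual cross-account guardrails apply.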
To round up, these four announcements add real value to EventBridge and Step Functions, and I can see all of them being used on upcoming projects. Watch out for future blog posts where we'll dive deeper into some of these services, demonstrating how they can be used on example workloads.