In our RAMP platform architecture, one challenge we’ve had to address is how to simulate the injection of streaming data. We need this for demonstration purposes, but also to simulate load and to test monitoring and alerting. Our solution uses AWS Lambda to pull sample data from DynamoDB and inject it as if it were being sourced from an IoT device. We use the AWS IoT Python APIs to simulate this, and the data then flows into a Kinesis stream, with Firehose backing up the stream contents for debug and replay purposes.
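The Lambda side of that pipeline could be sketched roughly as follows. This is an illustration under assumptions rather than the actual RAMP code: the table name `sample-readings`, the topic `ramp/demo/sensor`, the event keys and the `replay_samples` helper are all hypothetical. The only AWS calls used are the boto3 DynamoDB `Table.scan` and the `iot-data` client's `publish`. Forwarding from the IoT topic into Kinesis and Firehose is assumed to be configured elsewhere and is not shown.

```python
import json
from decimal import Decimal


def _to_jsonable(value):
    """Convert DynamoDB Decimal values so json.dumps can serialise them."""
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    raise TypeError(f"Cannot serialise {type(value).__name__}")


def replay_samples(table, iot_client, topic, limit=100):
    """Publish up to `limit` sample items from `table` to an MQTT topic.

    `table` needs a boto3-style scan(Limit=...) method and `iot_client`
    a boto3-style publish(topic=..., qos=..., payload=...) method.
    """
    response = table.scan(Limit=limit)
    published = 0
    for item in response.get("Items", []):
        payload = json.dumps(item, default=_to_jsonable)
        iot_client.publish(topic=topic, qos=1, payload=payload)
        published += 1
    return published


def lambda_handler(event, context):
    # boto3 is imported lazily so replay_samples can be exercised offline.
    import boto3

    # Hypothetical defaults; override them through the invocation event.
    table_name = event.get("table", "sample-readings")
    topic = event.get("topic", "ramp/demo/sensor")

    table = boto3.resource("dynamodb").Table(table_name)
    iot_client = boto3.client("iot-data")
    return {"published": replay_samples(table, iot_client, topic)}
```

Keeping the replay logic separate from the boto3 setup means it can be tested with stand-in objects before it goes anywhere near AWS. A single `scan` call returns one page of results only, so a larger sample table would need pagination, which this sketch leaves out.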
So it’s quite a lot of effort really!
Hence we were pleased to see that AWS have just published some open-source code to simplify some of these streaming data generation needs, which they are calling the Amazon Kinesis Data Generator (KDG). It doesn’t cover all of our needs, but it’s a useful addition to the toolkit, especially for high data-rate streams and for streams carrying random data. Take a look here for more details.