There was an even more insane number of AWS announcements than usual in my inbox when I booted up this morning. Normally AWS saves them for the big Las Vegas re:Invent keynotes or one of the big summit events, so it’s telling that this is just “clearing the decks” ahead of the really big announcements next week at re:Invent! Many were related to AI and machine learning, and specifically SageMaker.
Here’s a quick rundown of some of the key ones that caught my eye with some commentary on their significance:
- SageMaker training metrics can now be published to CloudWatch. We’ve been doing this manually up until now, via Python in Lambda or SageMaker notebooks, for example recording training and evaluation metrics at the end of automated ML training cycles. The new feature also gives visibility into metrics during a training run.
- Two new SageMaker ML algorithms – Object2Vec looks particularly interesting for some of our use cases.
- Expose live ML video stream analysis as a SageMaker endpoint. We have a potential use case for this already from a customer project, so this is timely.
- Predictive auto-scaling of EC2 instances. This no doubt uses SageMaker-type functionality under the covers, and reminds me a little of what Anodot and Elasticsearch’s ML offerings do, i.e. it’s the “action” end of some of these anomaly detection mechanisms.
- SageMaker support for Apache Airflow – this is potentially very significant, as it’s an alternative to our direction of travel of using AWS Step Functions to orchestrate ML training etc. I have had 2 customers talk to me about using Airflow. As it’s open source they feel less “locked in”, but it comes at the cost of managing your own Airflow deployment. There is no easy Airflow-aaS that I am aware of (tell me!) and you need to manage complexity like this or this. Our view is to use PaaS (over software deployed on IaaS) as much as we can, as we know this leads to lower long-term TCO. Complexity = cost! So Step Functions are still a preference for us, but having more options is always good. For customers with the engineering depth to deploy and manage Airflow it’s a good option to consider.
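To give a flavour of the manual approach mentioned in the first bullet, here is a minimal sketch of pushing end-of-training metrics to CloudWatch from Python using boto3's `put_metric_data`. The function names, the `Custom/MLTraining` namespace, the `TrainingJobName` dimension, and the metric names are all assumptions made for illustration, not what we actually run.

```python
from datetime import datetime, timezone


def build_metric_data(job_name, metrics, timestamp=None):
    """Convert a {metric_name: value} dict into a CloudWatch MetricData list.

    The "TrainingJobName" dimension is a hypothetical choice for this sketch.
    """
    timestamp = timestamp or datetime.now(timezone.utc)
    return [
        {
            "MetricName": name,
            "Dimensions": [{"Name": "TrainingJobName", "Value": job_name}],
            "Timestamp": timestamp,
            "Value": float(value),
            "Unit": "None",
        }
        for name, value in sorted(metrics.items())
    ]


def publish_metrics(job_name, metrics, namespace="Custom/MLTraining"):
    """Send the metrics to CloudWatch; needs boto3 and AWS credentials."""
    # Imported here so the payload builder above works without boto3 installed.
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace=namespace,  # hypothetical custom namespace
        MetricData=build_metric_data(job_name, metrics),
    )
```

In a Lambda handler or notebook cell this would be called along the lines of `publish_metrics("my-training-job", {"train_loss": 0.25, "validation_accuracy": 0.9})` once a training cycle finishes.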
Interesting developments! I don’t know if this counts as Airflow-aaS but it looks like it takes care of a lot of the setup: https://github.com/villasv/aws-airflow-stack.
For a managed solution, there’s https://www.astronomer.io/. It’s relatively new, but looks very promising. They have lots of open source Airflow hooks as well.