There’s been a clever (if that’s the appropriate collective noun?!) of Amazon SageMaker features announced in the last few days. These are significant enough to warrant a brief comment.
The main one I wanted to discuss was automatic model tuning – see here for details. This lets you search across a defined hyperparameter space for the best-performing model, where “best” is determined by the evaluation metric you define. It’s a must-have feature for SageMaker that we’ve been waiting for, and the equivalent of the GridSearchCV and RandomizedSearchCV that you might use in a scikit-learn world – but with the added benefit that you can throw cheap, elastic scale at it if you need to, while still keeping control over costs and run duration. There’s also an analytics object available, so you can get at a pandas DataFrame of the hyperparameter optimisation performance – see here for more details.
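As a rough illustration, here’s what that looks like with the SageMaker Python SDK. This is a minimal sketch rather than production code: it assumes `estimator` is an already-configured `sagemaker.estimator.Estimator`, and the metric name, parameter ranges, channel names and S3 paths are all placeholders.

```python
# Minimal sketch -- assumes `estimator` is an already-configured
# sagemaker.estimator.Estimator whose training container emits a
# "validation:auc" metric. Ranges and paths are illustrative only.
from sagemaker.tuner import (
    HyperparameterTuner,
    ContinuousParameter,
    IntegerParameter,
)

# The hyperparameter space to search over.
hyperparameter_ranges = {
    "eta": ContinuousParameter(0.01, 0.3),
    "max_depth": IntegerParameter(3, 10),
}

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:auc",  # the metric that defines "best"
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=20,           # total training jobs -- caps overall cost
    max_parallel_jobs=2,   # concurrent jobs -- caps instances in use at once
)

# Channel names and S3 locations are placeholders.
tuner.fit({
    "train": "s3://<bucket>/train",
    "validation": "s3://<bucket>/validation",
})

# The analytics object mentioned above: tuning results as a pandas DataFrame.
df = tuner.analytics().dataframe()
print(df.sort_values("FinalObjectiveValue", ascending=False).head())
```

The `max_jobs`/`max_parallel_jobs` pair is where the cost and duration control comes in: the first caps total spend across the search, the second caps how many instances are running at any one time.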
Other goodies
A second new feature – you can now clone pre-existing training jobs in the console – see here. Virtually all of our training jobs are controlled and initiated from Python/Jupyter notebooks, so this is less important to us. But when you first pick up SageMaker on a training course, it’s a feature you just expect to be there. So it’s good to have, and useful for a quick console-based retrain with a few mods.
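That said, if you want the same clone-and-tweak behaviour from a notebook, it’s only a few lines of boto3. Another hedged sketch – the job names and the modified hyperparameter here are made up:

```python
import boto3

sm = boto3.client("sagemaker")

# Pull the full configuration of the job we want to clone
# ("my-training-job" is a placeholder name).
job = sm.describe_training_job(TrainingJobName="my-training-job")

# Re-submit it under a new name, carrying the config across and
# tweaking whatever we like (hyperparameter values must be strings).
sm.create_training_job(
    TrainingJobName="my-training-job-clone",
    AlgorithmSpecification=job["AlgorithmSpecification"],
    RoleArn=job["RoleArn"],
    InputDataConfig=job["InputDataConfig"],
    OutputDataConfig=job["OutputDataConfig"],
    ResourceConfig=job["ResourceConfig"],
    StoppingCondition=job["StoppingCondition"],
    HyperParameters={**job.get("HyperParameters", {}), "max_depth": "8"},
)
```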
Finally – SageMaker is now supported in CloudFormation. Again, this is something we’ve been needing and waiting for – dead handy!
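To give a flavour of what that enables, here’s an illustrative (and untested) template fragment standing up a model and an endpoint – the container image, S3 path and the `SageMakerExecutionRole` resource are all placeholders assumed to be defined elsewhere in the template:

```yaml
# Illustrative fragment only -- image, bucket and role are placeholders.
Resources:
  MyModel:
    Type: AWS::SageMaker::Model
    Properties:
      ExecutionRoleArn: !GetAtt SageMakerExecutionRole.Arn
      PrimaryContainer:
        Image: <account>.dkr.ecr.<region>.amazonaws.com/<image>:latest
        ModelDataUrl: s3://<bucket>/model.tar.gz

  MyEndpointConfig:
    Type: AWS::SageMaker::EndpointConfig
    Properties:
      ProductionVariants:
        - ModelName: !GetAtt MyModel.ModelName
          VariantName: AllTraffic
          InitialInstanceCount: 1
          InstanceType: ml.t2.medium
          InitialVariantWeight: 1.0

  MyEndpoint:
    Type: AWS::SageMaker::Endpoint
    Properties:
      EndpointConfigName: !GetAtt MyEndpointConfig.EndpointConfigName
```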