There’s been a “clever” (if that’s the appropriate collective noun?!) of Amazon SageMaker features announced in the last few days. They’re significant enough to warrant a brief comment.

The main one I wanted to discuss was automatic model tuning – see here for details.  This lets you search across a defined hyperparameter space for the best-performing model, where “best” is determined by the evaluation metric you define.  It’s a must-have feature for SageMaker that we’ve been waiting for, and the equivalent of the GridSearchCV and RandomizedSearchCV you might use in a scikit-learn world – but with the added benefit that you can throw cost-effective scale at it when you need to, while still keeping control over costs and run duration.  There’s also an analytics object available, so you can get at a pandas dataframe of the hyperparameter optimisation performance – see here for more details.
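To make the idea concrete, here’s a toy sketch of what this kind of search does under the hood – plain Python, not the SageMaker API. The objective function, parameter names and ranges are all hypothetical stand-ins; SageMaker runs each trial as a managed training job at scale, whereas this just loops locally:

```python
import random

def evaluate(learning_rate, num_layers):
    """Stand-in for a training job: returns a validation score for one
    hyperparameter combination (purely illustrative objective)."""
    return -(learning_rate - 0.1) ** 2 - 0.01 * abs(num_layers - 4)

def random_search(n_trials, seed=0):
    """Sample the hyperparameter space at random and keep the best-scoring
    combination - the same shape of loop RandomizedSearchCV (or SageMaker's
    tuner, at much larger scale) runs for you."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    history = []  # analogue of the analytics object: one record per trial
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.001, 0.3),
            "num_layers": rng.randint(1, 8),
        }
        score = evaluate(**params)
        history.append({**params, "score": score})
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, history

best, score, history = random_search(50)
print(best, round(score, 4))
```

The `history` list plays the role of the analytics dataframe: one row per trial, with the sampled hyperparameters and the resulting metric, ready to sort or plot.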

Other goodies

A second new feature – you can now clone pre-existing training jobs in the console – see here.  Virtually all our training jobs are controlled and initiated from Python/Jupyter notebooks, so this is less important to us.  But when you first pick up SageMaker on a training course, it’s a feature you just expect to be there.  So it’s good to have, and useful for a quick console-based retrain with a few mods.

Finally – SageMaker is now supported in CloudFormation.  Again, this is something we’ve been needing and waiting for – dead handy!
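As a flavour of what that looks like, here’s a minimal template sketch declaring a notebook instance. The instance type is one choice among many, and the role ARN is a hypothetical placeholder you’d swap for your own execution role:

```yaml
Resources:
  TrainingNotebook:
    Type: AWS::SageMaker::NotebookInstance
    Properties:
      InstanceType: ml.t2.medium
      # Hypothetical ARN - substitute your own SageMaker execution role
      RoleArn: arn:aws:iam::123456789012:role/SageMakerExecutionRole
```

Being able to stand SageMaker resources up and tear them down alongside the rest of a stack is exactly why we wanted this.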