Tonight I’m presenting at the Big Data and Machine Learning – London meetup. I appreciate it’s a bit late notice, but there are still a few spaces left, thanks to the organisers’ over-occupancy planning model, which assumes that most meetups suffer a 60% no-show rate.
The topic…
Rather than the usual slideware, I thought I’d get brave. Using a public dataset, we’ll look at how you might use a Jupyter notebook to characterise and explore the dataset. In a “here’s one I made earlier” kind of way, I’ll show the kind of data quality issues that come up in imperfect real-world scenarios. Then I’ll show some not-so-immediately-obvious conclusions derived from this analysis, and go on to show how you’d build an XGBoost predictive model from it for a specific business use case.
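To give a flavour of the "characterise and explore" step, here is a minimal sketch of a first-pass data quality check. It is not the notebook or the dataset from the talk: the sample rows and the names `RAW`, `MISSING` and `profile` are made up for illustration, and it sticks to the standard library so it runs anywhere (in a real notebook you would more likely reach for pandas).

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a public dataset. Note the blank value,
# the "N/A" placeholder and the inconsistent casing of "London".
RAW = """customer_id,region,monthly_spend
1,London,42.50
2,london,
3,Leeds,N/A
4,London,19.99
"""

# Assumed spellings of "no value"; a real dataset may well use others.
MISSING = {"", "n/a", "na", "null"}


def profile(rows):
    """Count missing values and distinct case-normalised values per column."""
    report = {}
    for row in rows:
        for column, value in row.items():
            stats = report.setdefault(column, {"missing": 0, "values": Counter()})
            cleaned = value.strip().lower()
            if cleaned in MISSING:
                stats["missing"] += 1
            else:
                stats["values"][cleaned] += 1
    return report


rows = list(csv.DictReader(io.StringIO(RAW)))
report = profile(rows)
for column, stats in report.items():
    print(column, "missing:", stats["missing"], "distinct:", len(stats["values"]))
```

Even on four rows this surfaces two typical problems: half of `monthly_spend` is effectively missing, and `region` only collapses to two distinct values once the casing is normalised.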
So yes, Python without a safety net. I’ll leave out all the fun and games I had getting XGBoost compiled on my machine in the first place 🙂