I’ve been a fan of David Mertz since I devoured (and practically lived out of) his book Text Processing in Python. So I was thrilled at the chance to be a technical reviewer for his new book Cleaning Data for Effective Data Science: Doing the other 80% of the work with Python, R, and command-line tools.
Cleaning data can be the most important, the hardest, and the most ill-defined aspect of a project. This book goes deep into the topic and is full of practical suggestions and code. It is going to become an essential read for data scientists.