I normally avoid mentioning whatever day job I’m holding down in the context of my writing, for privacy, safety, and the ability to speak freely, but I’m going to break that rule today.
In 2021, I was writing intensive data science courses at work. My then-manager asked me to throw in a small course on data quality. I didn’t want to do it, preferring the complexity and depth of the five-to-six-hour courses I was then developing. She insisted. I researched and wrote it. In retrospect, it is the single most important thing I produced while I was there. I spent the next three years pestering people to make it public.
You can now find it here. It’s about an hour long. The public version has most of what was in the 2021 version, but to get it approved for release I had to revise it extensively to focus on machine learning rather than data science. (I was required to cut everything about cartographic misinformation, but you can remedy that by reading Mark Monmonier’s classic How to Lie with Maps.)
Since it will take time to read through the course, I’ll take the subject up again in two weeks or so.
Typos now fixed.