Notes from Industry

Framing business goals as modeling tasks is the true measure of a data scientist

Think of problem formulation as a bipartite graph, where the goal is to find the best method in the right column for the business task in the left column. Some tasks can be solved in multiple ways, and some methods may not be a good fit for any task. Image by author.

Several prominent AI thought leaders tweeted earlier this year on the theme that data is more important in applied machine learning than model architecture and optimization. François Chollet wrote:

ML researchers work with fixed benchmark datasets, and spend all of their time searching over the knobs they do control: architecture & optimization. In applied ML, you’re likely to spend most of your time on data collection and annotation — where your investment will pay off.

Andrew Ng chimed in that he agrees with Chollet and that more work needs to be done to…


A persistent thread of commentary says data science is overhyped. Don’t believe it.

Data science has been hot for many years now, attracting attention and talent. There is a persistent thread of commentary, though, that says data science’s core skill of statistical modeling is overhyped and that managers and aspiring data scientists should focus on engineering instead. Vicki Boykis’ 2019 blog post was the first article I remember along these lines. She wrote:

…data science is asymptotically moving closer to engineering, and the skills that data scientists need moving forward are less visualization and statistics-based, and more in line with traditional computer science curricula…

With that premise, her reasonable advice was:

Don’t do…

Brian Kent

Data scientist in the wild. Follow me for more on how to solve real-world problems with data science.
