Hello Data PMs,
This is a post you might want to save, because I know you won’t get through it in one read. This post is about which topics from the world of Data Science you need to cover as a Data PM! Now, I could have written a series (I know, another one!), but some folks have already written good, concise, and relevant material, so why re-create it? Instead, I gave the most important concepts a structure so that you don’t have to spend hours googling or skimming data science resources.
Also, the “Links of the Week” section is a gem this week. Read them all!
Data Science and Analytics for Data PMs
I was wondering which three main buckets of Data Science a Data PM should learn, and I concluded the following:
(a) what Data Scientists do and how that differs from what an analyst or an ML engineer does;
(b) some fundamental topics around Data Science and what matters to a PM specifically; and
(c) the major use cases and techniques that data scientists use.
So, I have curated a set of resources that you can work through as a week-long course to catch up on the core Data Science concepts and draft requirements well enough to keep your Data Scientist and Data Analyst colleagues happy!
1. Data Scientists vs. Data Analysts
This article in Forage provides a good distinction between the skills, career paths, as well as roles & responsibilities of Data Analysts and Data Scientists. The bottom line is if you want to get Exploratory Data Analysis done, you will work with a Data Analyst, and if you are working to predict something or create machine learning models, you will interact with a Data Scientist.
Further, there are other roles too. So, here’s a breakdown of the roles and responsibilities of four commonly confused roles: data analyst, data engineer, ML engineer, and data scientist.
2. Fundamentals of Data Science
Alright, so if you want to understand the very basics of Data Science, start here. This article talks about the data science lifecycle as well as the methods and tools used for Data Science work.
If you want to know how data science differs from business intelligence and artificial intelligence, look at this article from Heavy.ai.
Once you understand the generic data science terms, get to specific terms/concepts that are very crucial for PMs in these three MUST-READ articles:
Analytics and Data Science Fundamentals for Product Managers
Statistics and Mathematics: What Product People Need To Know
And, if you want to learn about Data Visualization specifically, this is it. All the resources you will ever need.
Finally, if you want to go deeper, this is the most comprehensive resource to learn it all and can be your ready reference.
3. Data Science Techniques and Use Cases PMs Should Know
Before we get to use cases, understand the landscape of Data Science and how various techniques fit together. Dr. Steven Karenfort talks about the Data Science Landscape and some major techniques in his article here. He then maps out major sub-disciplines which could be all separate crash courses, but you can get to that later.
Next, this piece has a pretty good list of high-level concepts around data science projects, illustrated with some nasty visuals. The author talks about how people use data science for 4 major use cases: optimization, targeting, abnormalities, and prediction.
Then, once you know what data science can do and the type of work it covers, look at this list of different applications across industries by Megha Sinha to learn practical use cases.
Want to go even deeper? Read this article about real-life examples by Built-in.
And, lastly, to ensure you haven’t missed anything, double-check against the list of all major data science techniques here. Definitely understand Transfer Learning, Neural Networks, Classification, Dimensionality Reduction, etc.
So, to sum it up, data scientists use data and process that data to serve the following use cases (source):
Prediction (predict a value based on inputs)
Classification (e.g., spam or not spam)
Recommendations (e.g., Amazon and Netflix recommendations)
Pattern detection and grouping (e.g., classification without known classes)
Anomaly detection (e.g., fraud detection)
Recognition (image, text, audio, video, facial, …)
Actionable insights (via dashboards, reports, visualizations, …)
Automated processes and decision-making (e.g., credit card approval)
Scoring and ranking (e.g., FICO score)
Segmentation (e.g., demographic-based marketing)
Optimization (e.g., risk management)
Forecasts (e.g., sales and revenue)
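If it helps to see what two of these use cases look like in practice, here is a minimal sketch in plain Python. It is only an illustration and is not taken from any of the linked articles: the numbers are made up, and the helper names `fit_line` and `find_anomalies` are hypothetical. The first function does a simple straight-line forecast (the "Forecasts" use case), and the second flags values far from the average (a very basic take on "Anomaly detection").

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (a basic forecast model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar

def find_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]

# Forecast: made-up monthly sales for months 1 to 5, projected to month 6.
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]
slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept

# Anomaly detection: made-up transaction amounts with one obvious outlier.
amounts = [20, 22, 19, 21, 23, 20, 500, 18, 22, 21]
suspicious = find_anomalies(amounts)

print(forecast_month_6)
print(suspicious)
```

Real data science work uses far richer models and data, but the shape of the requirement is the same: say what the input is, what the output should be, and how you would judge whether the output is good enough.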
🔗 Links of the week
If you have one market insight piece to read this week, read Sequoia Capital’s Generative AI Act 2. My interesting takeaway was Sequoia’s argument that applications won’t produce enough data and, therefore, data won’t be the moat; instead, how companies integrate and build workflows and user networks seems to be what creates a competitive advantage.
The second wonderful link is about how you can create great Data UX. The author explains what Data UX means and what it implies for storing data, querying data, and representing data and insights across applications.
Alright, this edition can’t be complete without talking about GPT-4’s Vision release. Here’s a detailed analysis of various tests that two folks at Roboflow conducted. Read it to find out whether GPT-4 hallucinates on vision or not.
A resource to share on Slack: Multi-modal AI is a UX problem. (Remember, I told you to build thought leadership internally? I’ll help out! Technically, you can share all four of these!)
Signing off,
Richa
Chief Data Obsessor, The Data PM Gazette.