Hi Data PMs,
First off, apologies for the long break. As I said in the first issue, I intended to be regular, but then November happened. And I couldn’t be happier to have spent some quality time with friends and family back home 🇮🇳. So much has happened in Data / AI land: Google’s Gemini, Mistral AI, and ML/LLM Ops are getting sexier by the week. Uff, too much to keep up with!
Now that I am back, I promise to be regular again. This week, we are going to deal with a topic that matters a lot to me personally right now, and is super important for Data PMs in particular: Rollouts!
What are rollouts? Why should you care as a Data PM?
A PM usually thinks the goal is to get the right features on the roadmap. However, a PM’s role doesn’t end with getting a feature on the roadmap; it ends with getting the feature into the hands of the customer. You would be surprised to know how many features are sitting behind “feature flags” in your organization, unavailable to all customers even though they could be. The act of releasing a feature to a customer is called a “rollout.”
As a Data PM, you are either working on an application feature, or you might be rolling out a new capability on the platform. In both cases, the rollout changes how customers interpret or work with their data. Therefore, a rollout is much more than turning a feature flag on or off. Your rollout can make or break the core experience of your product without showing much visual change, and therefore Data PMs have to be more cautious in planning and executing a rollout.
How bad can rollouts get? Is it really that important?
If you don’t plan a rollout well, you can get stuck in a rollout limbo where the feature you built hasn’t been released to the customers it was intended for. Let me explain this with an example.
Let’s say you are a Data PM for a Data Platform, you have built a much-requested new integration system, and you are now rolling it out. You are an amazing PM, so you defined multiple cohorts/“waves” (at least that’s what the cohorts are called at Clari). You first rolled it out to only 5% of customers, waited to see if the integration system worked as expected, and then planned several more rollout waves, which are going out as planned. But it’s been 2 months since the first 5% wave, and the rollout is still ongoing. What can go wrong? A LOT.
As you were rolling out to one of the customers in wave 3, their Customer Success Manager (CSM) came back and said, “This customer has recently had some implementation issues, and if the rollout is not successful, there is a churn risk. Please don’t roll out now.” So you had to defer them to a later date, but the CSM didn’t give you a date by which you could roll out.
As you were rolling out to wave 4, a bug introduced by another team caused your new integration rollout to fail. You had to roll back the customers in wave 4 and manage this “incident” (a much more shameful experience). By then, you had already told those customers that they were moving to the new integration system, as per your communication plan. However, no communication went out about the rollback, and customers complained that they hadn’t seen the change.
For one customer, the feature flags required for the rollout weren’t configured correctly, so even though the rollout was “successful”, the customer didn’t move to the new system.
For another customer, the new integration system stopped syncing data because of a configuration error on the customer’s end, which raised a P0 support ticket for your team (highest priority; the fix-in-4-hours kind of support ticket). Your engineering team, Customer Support team, etc. were all on high alert for a day.
A few customers in different waves were wrongly qualified to be moved to the new integration system, so they saw errors in the apps, and the application side’s on-call engineers started complaining that this new integration system was disrupting their work.
Overall, the company feels that the new integration system is unreliable given the number of rollout failures, even though for the customers where the rollout was successful, it’s working as required.
Could this situation have been avoided in the first place?
How to do a rollout effectively?
To make a rollout effective, you will have to think through a variety of factors, not just internal ones but also those on the customer side. Here’s a checklist that you can use to make a feature rollout go with the fewest hiccups.
What are the qualification criteria?
Which feature flags need to be on before you consider a customer? {Technical qualification}
Do they need to have a specific configuration on your side? {Implementation qualification; size of data, integrations, access to certain features, etc.}
Do they need to be at a certain stage of the customer lifecycle? {GTM qualification; e.g., renewal date, churn risk, etc.}
Are there any data checks/parity checks to be run before a customer qualifies? {Current state of system}
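To make the four kinds of qualification above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Customer` fields, the flag name `new_integration_backend`, and the thresholds are hypothetical, not taken from any real system. The idea is simply that each check returns a reason when it fails, so you can tell a CSM or an engineer exactly why a customer is not ready.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Customer:
    name: str
    enabled_flags: set[str]        # feature flags currently on (technical)
    connected_integrations: int    # configuration on your side (implementation)
    renewal_date: date             # GTM signal
    churn_risk: bool               # GTM signal, e.g. raised by the CSM
    parity_check_passed: bool      # result of a data/parity check (current state)


# Hypothetical values; replace them with your own criteria.
REQUIRED_FLAGS = {"new_integration_backend"}
MAX_INTEGRATIONS = 5
RENEWAL_BUFFER = timedelta(days=60)


def qualification_blockers(customer: Customer, today: date) -> list[str]:
    """Return the reasons a customer is NOT ready; an empty list means qualified."""
    blockers = []
    missing = REQUIRED_FLAGS - customer.enabled_flags
    if missing:
        blockers.append(f"technical: missing flags {sorted(missing)}")
    if customer.connected_integrations > MAX_INTEGRATIONS:
        blockers.append("implementation: too many integrations for this wave")
    if customer.churn_risk:
        blockers.append("gtm: churn risk")
    if customer.renewal_date - today < RENEWAL_BUFFER:
        blockers.append("gtm: renewal too close")
    if not customer.parity_check_passed:
        blockers.append("current state: parity check failed")
    return blockers
```

Returning a list of blockers rather than a plain yes/no is a deliberate choice here: a customer who is skipped for "gtm: churn risk" needs a very different follow-up from one skipped for a missing feature flag.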
Define your cohort plan. This involves “how many waves you need” and “what is the qualification of each wave”.
Maybe you are rolling out based on how much time it takes to roll out the feature. Suppose it takes 2-3 days for the new migration to complete, and you have only one engineer working on a rollout, so you will roll out to maybe 2 customers in a week.
Maybe you are rolling out based on different configurations, so your waves will be based on configurations.
Maybe you are rolling out based on geographies, so your waves will be a result of that.
The important thing is that you have a “test wave” that essentially tests whether your feature is working as expected, and only then do you expose it to multiple customers.
And, even within your waves, rather than blindly pulling the trigger, someone should see if the rollout is going as intended by doing some checks post-rollout.
Finally, track those waves and cohorts and the rollout for each customer using a Jira board.
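The capacity arithmetic and the “test wave first” idea above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the function names are made up, a 5-day working week is assumed, and each engineer is assumed to handle one migration at a time.

```python
import math


def weekly_capacity(days_per_migration: float, engineers: int, working_days: int = 5) -> int:
    """Customers you can migrate per week if each engineer handles one at a time."""
    return engineers * math.floor(working_days / days_per_migration)


def plan_waves(customers: list[str], test_wave_size: int, capacity: int) -> list[list[str]]:
    """The first wave is a small test wave; later waves are capped at weekly capacity."""
    if capacity < 1 or test_wave_size < 1:
        raise ValueError("capacity and test_wave_size must be at least 1")
    waves = [customers[:test_wave_size]]
    remaining = customers[test_wave_size:]
    for start in range(0, len(remaining), capacity):
        waves.append(remaining[start:start + capacity])
    return waves


def next_wave_allowed(post_rollout_checks: dict[str, bool]) -> bool:
    """Only move on when every customer in the current wave passed its checks."""
    return bool(post_rollout_checks) and all(post_rollout_checks.values())
```

With a migration that takes about 2.5 days and one engineer, `weekly_capacity(2.5, 1)` returns 2, which matches the “maybe 2 customers in a week” estimate. Calling `plan_waves(["a", "b", "c", "d", "e", "f"], test_wave_size=1, capacity=2)` returns `[["a"], ["b", "c"], ["d", "e"], ["f"]]`, and `next_wave_allowed` is the “don’t blindly pull the trigger” gate between waves.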
Is it a feature that you intend to make Generally Available, or is it a limited feature available on request?
Does this change the current behavior for a customer? If yes, who are the “approvers” for the rollout to a customer?
The customer themselves?
GTM representative?
What’s the process of rollout? Define each step almost in a flow diagram.
The roles and responsibilities of each team, and the timeline should be clear for all actors involved in the rollout process.
Lastly, even once you have figured all of this out, define a communication plan and evangelize it with the right stakeholders, so that your rollout can go smoothly.
But, what if things still go wrong?
Communicate to the customer, on priority. Never slack on customer communication.
Do a post-mortem of the situation, and see what needs to change.
Qualification criteria?
Cohort planning?
Rollout process?
Fix that.
Update the stakeholders on what happened and how you are going to fix it in the future. Also, make sure everyone understands the magnitude of the situation: was it just 1% of customers who were impacted, or was it a handful? What could the customer have noticed? What are we going to do to avoid it in the future?
Links of the week 🔗
Are you hearing the term “ML Ops” a lot? Here’s a dictionary for ML Ops.
A recent issue of Artificial Ignorance keeps you updated on Google’s Gemini, updates from Meta AI land, as well as the new EU AI Act.
Lastly, as a Data PM, one of your biggest levers is cost, and
helps you understand how you can bring the cost of your platform down! A MUST READ!
Side note: I am hiring! If you are interested in a Product Manager role at Clari, please reach out!
Hope you enjoyed reading this edition, as much as I liked compiling these learnings. See you next week!
Cheers,
Richa
Your Chief Data Obsessor, The Data PM Gazette.