How do we measure the cost or ROI of an AI investment if we do not first implement a data quality process?
Short answer:
If you skip data quality work up front, measuring ROI becomes far more uncertain and potentially misleading, because you won’t know whether disappointing results came from the AI itself or from flawed data.
Let’s be clear about how this plays out in practice so you can weigh the trade-offs:
1. What Happens If We Skip Data Quality?
When you plug inconsistent, incomplete, or poorly defined data into AI models:
- Model Performance Drops: Predictions (like churn risk or next-best offer) will be less accurate, making them harder to trust and act on.
- User Confidence Erodes: Sales, marketing, or operations teams see unreliable recommendations and disengage. Adoption drops off.
- Outcomes Vary Wildly: One segment may show lift, another may show no effect, but you won’t know if that’s due to business reality or dirty data.
- ROI Attribution Is Murky: When results are mixed, it’s impossible to draw a clean line between investment and outcome.
2. How ROI Measurement Typically Works
In a structured AI initiative, ROI is calculated by comparing:
- Baseline Performance: Historical conversion rates, revenue per customer, cycle times, etc.
- Post-AI Performance: Measured over a defined pilot period.
- Attribution Adjustments: Controls or benchmarks to isolate the effect of AI from other factors (seasonality, promotions, etc.).
If your underlying data is poor, these comparisons lose validity, because:
- Baselines may be inaccurate (e.g., sales histories missing transactions).
- Post-AI measurements could reflect model errors due to bad inputs.
- Attribution adjustments can’t account for all the noise.
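To make the comparison concrete, here is a minimal sketch of the underlying arithmetic in Python. Everything in it is an illustrative assumption: the function name, the customer counts, the revenue figures, and the use of a matched control group as the attribution mechanism. It shows how the numbers combine, not which method you should use.

```python
# Minimal ROI sketch: pilot vs. matched control, with illustrative numbers.
# All figures and names are hypothetical placeholders, not benchmarks.

def simple_roi(pilot_revenue: float, control_revenue: float,
               pilot_size: int, control_size: int,
               program_cost: float) -> dict:
    """Estimate per-customer uplift, incremental value, and ROI vs. program cost."""
    pilot_per_customer = pilot_revenue / pilot_size
    control_per_customer = control_revenue / control_size
    uplift_per_customer = pilot_per_customer - control_per_customer
    incremental_value = uplift_per_customer * pilot_size
    roi = (incremental_value - program_cost) / program_cost
    return {
        "uplift_per_customer": round(uplift_per_customer, 2),
        "incremental_value": round(incremental_value, 2),
        "roi": round(roi, 3),
    }

if __name__ == "__main__":
    # Hypothetical pilot of 2,000 customers against a matched control of 2,000.
    print(simple_roi(pilot_revenue=1_150_000, control_revenue=1_100_000,
                     pilot_size=2_000, control_size=2_000,
                     program_cost=40_000))
```

With these made-up inputs the sketch reports a $25-per-customer uplift and a 25% ROI. If the revenue feeds behind either group are missing transactions, the same formula produces a very different answer without the model changing at all, which is exactly the attribution problem described above.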
3. What You Can Do If You Proceed Without Cleanup
You can still move forward, but you need to manage expectations:
- Set Tighter Scope: Focus on AI use cases where data quality is relatively better (like call summarization or basic email personalization).
- Pilot in Controlled Environments: Limit rollout to a subset of products, regions, or customers to reduce variability.
- Define “Exploratory ROI”: Instead of expecting precise financial impact, treat initial results as directional insights:
  - Are model outputs logically sound?
  - Do early adopters find them actionable?
  - Where are the biggest data quality gaps?
- Prepare to Iterate: Expect that your first results will reveal where cleanup is most needed.
4. Measuring ROI Without Strong Data Foundations
If you still want to quantify ROI, here’s what that looks like:
- Qualitative Feedback: Surveys of users to assess perceived usefulness and trust.
- Directional Uplift: Comparison of pilot vs. control group performance, acknowledging uncertainty (see the sketch after this list).
- Process Metrics: Reduction in manual effort (e.g., faster proposal drafting, less time qualifying leads).
- Data Improvement Signals: Cataloging data errors found by the AI, which can be viewed as value-in-kind (accelerated data discovery).
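As a rough illustration of what “acknowledging uncertainty” can look like, the sketch below bootstraps an interval around a pilot-vs.-control conversion uplift. The group sizes, conversion counts, bootstrap settings, and 90% interval are all hypothetical assumptions, not a recommended experiment design.

```python
# Directional uplift with an uncertainty band: a rough sketch, not a full
# experiment design. All inputs below are illustrative assumptions.
import random

def bootstrap_uplift_ci(pilot_conversions: int, pilot_n: int,
                        control_conversions: int, control_n: int,
                        draws: int = 2_000, seed: int = 7) -> tuple:
    """Return a 90% bootstrap interval for the conversion-rate uplift."""
    rng = random.Random(seed)
    pilot = [1] * pilot_conversions + [0] * (pilot_n - pilot_conversions)
    control = [1] * control_conversions + [0] * (control_n - control_conversions)
    diffs = []
    for _ in range(draws):
        p = sum(rng.choices(pilot, k=pilot_n)) / pilot_n
        c = sum(rng.choices(control, k=control_n)) / control_n
        diffs.append(p - c)
    diffs.sort()
    return diffs[int(0.05 * draws)], diffs[int(0.95 * draws)]

if __name__ == "__main__":
    # Hypothetical pilot: 130/1,000 conversions vs. 110/1,000 in the control group.
    low, high = bootstrap_uplift_ci(pilot_conversions=130, pilot_n=1_000,
                                    control_conversions=110, control_n=1_000)
    print(f"Observed uplift: +2.0 pts; 90% interval: {low:+.3f} to {high:+.3f}")
```

If the resulting interval straddles zero, report the uplift as directional rather than proven; noisy or incomplete data only widens that interval further.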
Be prepared to say:
“In this phase, ROI is measured less by precise financial lift and more by readiness gains and process learnings.”
5. Bottom Line
If you skip data quality first, expect ROI measurement to be directional, not definitive.
You will still get valuable insights, but you should:
- Treat initial phases as pilots with learning objectives.
- Avoid hard ROI promises tied to specific dollar figures.
- View early outcomes as a test bed to prove value and build the business case for cleanup.
Recommendation:
If you want a more confident, defensible ROI story (and a higher likelihood of adoption), investing in at least targeted data cleanup alongside the pilot is strongly advisable.
If you like, I can outline what a lightweight data quality sprint might look like so we get the best of both worlds: quick starts and credible measurement.