Why leadership is right to ignore data quality (until they're not)
Three Citibank employees clicked the wrong boxes and sent $900 million to Revlon's lenders instead of $8 million. When Citibank went to court to get the money back, a judge let the lenders keep most of it. Stories like this are rare, but they remind us why sectors like banking, healthcare, and energy/utilities treat data quality as life or death. Yet for every Citibank disaster, there are thousands of companies wasting millions preventing problems that don't actually matter.
Scroll through any data engineering forum and you'll find engineers begging for data quality investments while executives allocate budget elsewhere. Most of the time, the executives are making the right call: not every use case deserves the same level of data quality.
The 95% World
Most companies live here. Dashboards break, reports run late, analytics show weird numbers, and it's not a huge deal, because a human is using the report as just one input to a low-impact decision. Humans are also great at spotting anomalies. Bad data becomes an annoying Monday morning instead of an existential crisis.
Netflix’s internal dashboards live here. Executives routinely look at reports and dashboards about the business. If their “Shows watched by region” dashboard is stale for a week, some executive makes a slightly worse content decision. And that's perfectly fine. The cost of prevention exceeds the cost of the problem.
The 99.99% World
This is where heavily regulated sectors live, and where data makes decisions automatically, millions of times, without human review. Netflix's recommendation engine lives here. When something fails, it fails at scale. Bad recommendations mean fewer engaged users, which corrupts the training data, which produces increasingly worse suggestions: a doom loop spinning faster than humans can catch it.
In the 95% world, bad data means someone stays late to fix a spreadsheet. In the 99.99% world, bad data means you're Citibank, explaining to a judge why you accidentally gave away $500 million.
The AI Acceleration
AI is pushing more and more use cases into the 99.99% world. Reports are written by ChatGPT. Pricing decisions are algorithms. Customer service is automated. Every AI system turns data into instant, scaled, unsupervised decisions.
Uber started with human dispatchers checking spreadsheets. Now algorithms set prices every millisecond, moving billions of dollars with no human in the loop. They've shifted from the 95% world to the 99.99% world, and so has everyone else.
Speed vs Reliability
The math is brutal. Google learned this with uptime, and the same exponential cost curve applies to data quality. Chasing perfect reliability made products worse: teams spent months building backup systems while actual features got delayed, and users couldn't tell the difference between 99.9% and 99.99% uptime (43 versus 4 minutes of downtime per month).
So Google created error budgets. If your service promises 99.9% uptime, you get roughly 43 minutes of downtime per month to spend on shipping risky features. The cost curve is exponential: each additional "nine" of reliability costs roughly 10x more than the last.
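To make the arithmetic concrete, here is a minimal Python sketch of the error-budget idea. It assumes a 30-day month, and the SLO values are illustrative examples, not Google's actual targets.

```python
# Minimal error-budget arithmetic: how much downtime each SLO allows per month.
# Assumes a 30-day month; the SLO values below are illustrative examples.

MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes

def error_budget_minutes(slo: float) -> float:
    """Downtime allowed per month for a given availability SLO (e.g. 0.999)."""
    return MINUTES_PER_MONTH * (1 - slo)

for slo in (0.95, 0.99, 0.999, 0.9999):
    print(f"{slo:.2%} uptime -> {error_budget_minutes(slo):,.1f} minutes of downtime per month")

# 95.00% uptime -> 2,160.0 minutes of downtime per month
# 99.00% uptime -> 432.0 minutes of downtime per month
# 99.90% uptime -> 43.2 minutes of downtime per month
# 99.99% uptime -> 4.3 minutes of downtime per month
```

The budget shrinks tenfold with every extra nine, which is why the cost of defending it climbs just as steeply.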
Your executives aren't wrong to ignore data quality today. But AI is moving every company from the 95% world to the 99.99% world. The question isn't whether you need data quality. It's whether you'll figure it out before or after your own $900 million mistake.
TL;DR
Most companies have a simple data anomaly detection stack, and that's just fine: humans catch the weird stuff before it matters. But when software makes decisions at scale, the stakes are high, or data is your product, bad data becomes catastrophic. As businesses hand more tasks to AI, the need for robust anomaly detection grows with it.
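For reference, a "simple stack" is often little more than a check like the hedged sketch below, which flags a daily row count that drifts too far from its recent baseline. The table, counts, and threshold are hypothetical, not taken from any specific tool.

```python
import statistics

def is_anomalous(history: list[int], today: int, z_threshold: float = 3.0) -> bool:
    """Flag today's row count if it is more than z_threshold standard
    deviations away from the recent mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return today != mean
    return abs(today - mean) / stdev > z_threshold

# Hypothetical daily row counts for an `orders` table over the past week
recent_counts = [10_120, 9_980, 10_250, 10_040, 9_910, 10_180, 10_060]

print(is_anomalous(recent_counts, today=4_300))   # True  -> someone should look
print(is_anomalous(recent_counts, today=10_110))  # False -> business as usual
```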
Catch issues before your stakeholders do
Learn how Sentinel automates data observability in minutes
