The False Precision Problem: Why Digitised Data Can Be Worse Than Paper
A farmer writes "about 10 kg" in a field notebook. A data entry clerk types 10.00 kg into a database. A compliance algorithm compares 10.00 to a regulatory limit of 9.95 and flags a violation. An inspector issues a warning. The farmer is confused, because the actual amount was closer to 9 kg, and "about 10" was a rough estimate rounded up for simplicity.
This is false precision: the phenomenon where digitisation adds apparent accuracy that the original data never possessed. It is one of the most insidious data quality problems in modern organisations, precisely because it looks like an improvement. The data is clean, structured, and machine-readable. It just happens to be wrong in a way that is invisible to anyone who was not present when it was created.
The Spectrum of False Precision
False precision is not a single error. It is a systematic distortion that occurs across multiple dimensions whenever analogue information is converted to digital form.
Quantities: "A handful" becomes 50 grams. "A couple of litres" becomes 2.00 L. "Roughly half" becomes 50.0%. The original estimate communicated uncertainty. The digital value communicates exactness. Every downstream calculation inherits this false confidence.
Timestamps: "In the spring" becomes 2024-04-15. "After lunch" becomes 13:00:00. "A few days later" becomes exactly 72 hours. The original expression conveyed approximate timing. The database demands a precise date, so someone picks one. From that point forward, the system treats it as fact.
Identifiers and codes: A farmer says "winter wheat" meaning the general crop growing in a particular field. The system requires an EPPO code, so someone selects TRZAW (Triticum aestivum). But the field contains a mix of varieties, and the farmer's intent was descriptive, not taxonomic. The code implies a precision of identification that was never intended.
Spatial data: "The field near the road" becomes a GPS polygon. "About 200 metres from the water" becomes a buffer zone calculation to two decimal places. The original spatial reference was contextual and approximate. The digital representation suggests surveyor-grade accuracy.
Each of these transformations individually might seem harmless. But when they accumulate across a dataset, the result is a system that appears rigorous while being fundamentally disconnected from the reality it claims to represent.
Why This Matters: Downstream Consequences
The damage from false precision is not in the data itself. It is in what happens next. Modern data systems are designed to trust their inputs. Analytics tools calculate averages, trends, and thresholds assuming the numbers they receive are meaningful to the precision displayed. Compliance systems compare values to limits assuming both are measured to the same standard.
Consider a regulatory compliance scenario in plant protection. A product is approved for use at a maximum dose of 1.50 L/ha. A farmer applies "about a litre and a half per hectare," a reasonable and compliant application. But the spray record, reconstructed from memory and entered into a digital system weeks later, shows 1.55 L/ha. The compliance system flags this as an exceedance. The farmer faces questions about a violation that never occurred.
The error is not in the farming. It is not in the regulation. It is in the digitisation process, which converted an approximate, compliant value into a precise, non-compliant one. And because the digital record looks authoritative, it is the digital record that gets believed.
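The failure mode is easy to reproduce. Here is a minimal sketch of the naive check, using the limit and recorded value from this scenario; the function name and structure are ours, not any particular compliance engine's:

```python
# Naive compliance check: treats the recorded value as exact,
# regardless of how it was obtained.
MAX_DOSE_L_PER_HA = 1.50  # regulatory limit from the product approval

def is_violation(recorded_dose: float) -> bool:
    """Flag any recorded dose above the limit."""
    return recorded_dose > MAX_DOSE_L_PER_HA

# "About a litre and a half", reconstructed from memory weeks later:
print(is_violation(1.55))  # True -- an "exceedance" that may never have occurred
```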
This pattern repeats across domains. In manufacturing, approximate measurements entered with false precision trigger false quality alarms. In business metrics, estimated figures reported with decimal places create the illusion of performance changes that are actually measurement noise. In healthcare, approximate patient-reported data entered as precise values leads to clinical decisions based on phantom precision.
The Statistical Thinking Gap
Statistical thinking requires knowing the uncertainty of your measurements. Every measurement has an associated uncertainty, a range within which the true value is likely to fall. A kitchen scale might measure to the nearest 5 grams. A laboratory balance might measure to the nearest 0.01 grams. The number of decimal places in a measurement should reflect the instrument's precision, not the database field's capacity.
When data is digitised without preserving measurement uncertainty, the information about data quality is permanently lost. A value of 10.00 in a database could mean "exactly 10.00 as measured by a calibrated instrument" or "somewhere between 8 and 12 as estimated by a person from memory." The number looks the same. The meaning is entirely different.
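To make that concrete, here is a minimal sketch of a record type that keeps the measurement's story next to the number; the field names are illustrative, not taken from any particular system:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Measurement:
    value: float
    uncertainty: float  # half-width of the plausible range, in the value's unit
    method: str         # how the number was obtained

# Two records that a bare numeric column would store identically as 10.00:
lab = Measurement(10.00, uncertainty=0.005, method="calibrated balance")
memo = Measurement(10.00, uncertainty=2.0, method="estimate from memory")

# lab means "between 9.995 and 10.005"; memo means "somewhere between 8 and 12".
```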
Walter Shewhart, the father of statistical quality control, emphasised that data without context is meaningless. A measurement is not just a number. It is a number from a specific measurement process, with a specific uncertainty, taken at a specific time, under specific conditions. Strip away that context, and you strip away the ability to interpret the data correctly.
Most digital systems strip away all of it. They store the number. They discard the story.
The Organisational Blind Spot
Organisations invest heavily in data quality, but almost always focus on the wrong end. They clean data after collection: removing duplicates, correcting formats, filling gaps, standardising codes. This is necessary work, but it cannot fix false precision. A value that was imprecise at collection and then digitised with false precision will pass every data quality check. It has the right format, the right type, and a value within a plausible range. It is just not what it claims to be.
The fix must happen at the point of collection, or more precisely, at the point of digitisation. Systems need to capture and preserve uncertainty. This can be as simple as a confidence indicator (estimated vs. measured), a precision qualifier (approximate, exact, calculated), or an explicit uncertainty range. The technology for this exists. What is often missing is the organisational awareness that it matters.
Practical Approaches
Addressing false precision requires changes at multiple levels:
Data model design: Include fields for measurement method, precision level, and confidence. A dose of "1.5 L/ha (estimated, +/- 0.3)" carries fundamentally different information from "1.50 L/ha" (see the first sketch after this list).
User interface design: Allow users to express uncertainty naturally. Dropdown options like "approximate," "measured," and "calculated" take seconds to select and add enormous interpretive value.
Analytical practices: When performing calculations on data with mixed precision, propagate uncertainty. An average of ten approximate values is not precise to two decimal places just because the arithmetic produces that many digits (see the second sketch after this list).
Compliance logic: Build tolerance into comparisons. If measurement uncertainty means a value could reasonably be anywhere in a range, the compliance check should account for that range rather than treating the point estimate as exact (see the third sketch after this list).
Training and culture: Help data users understand that precision in a database does not equal precision in reality. This is perhaps the hardest change, because it requires people to question numbers that look authoritative.
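To illustrate the data model point, here is a minimal sketch of what such fields could look like; DoseRecord, Qualifier, and the field names are our invention, not a reference to any existing schema:

```python
from dataclasses import dataclass
from enum import Enum

class Qualifier(Enum):
    MEASURED = "measured"
    ESTIMATED = "estimated"
    CALCULATED = "calculated"

@dataclass(frozen=True)
class DoseRecord:
    value: float          # L/ha
    qualifier: Qualifier  # how the value was obtained
    uncertainty: float    # +/- half-width, L/ha

# "About a litre and a half per hectare":
estimate = DoseRecord(value=1.5, qualifier=Qualifier.ESTIMATED, uncertainty=0.3)
# A flow-meter reading of the same application:
metered = DoseRecord(value=1.50, qualifier=Qualifier.MEASURED, uncertainty=0.02)
```

In a bare numeric column the two records above would be indistinguishable; here the difference survives digitisation.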
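For the analytical point, here is a sketch of an average that carries its uncertainty with it, assuming the per-value uncertainties are independent and can be combined in the GUM style (JCGM, 2008):

```python
import math

def mean_with_uncertainty(values: list[float],
                          uncertainties: list[float]) -> tuple[float, float]:
    """Mean of the values, with the propagated uncertainty of that mean.

    For independent uncertainties, u_mean = sqrt(sum(u_i^2)) / n.
    """
    n = len(values)
    mean = sum(values) / n
    u_mean = math.sqrt(sum(u * u for u in uncertainties)) / n
    return mean, u_mean

# Ten doses, each estimated to +/- 0.3 L/ha:
mean, u = mean_with_uncertainty([1.5] * 10, [0.3] * 10)
print(f"{mean:.2f} +/- {u:.2f} L/ha")  # 1.50 +/- 0.09, not "exactly 1.50"
```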
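And for the compliance logic, a sketch that raises a violation only when the entire plausible range sits above the limit. Whether a range that straddles the limit should instead be routed to manual review is a policy decision, not a technical one:

```python
def exceeds_limit(value: float, uncertainty: float, limit: float) -> bool:
    """True only if even the lowest plausible value is above the limit."""
    return value - uncertainty > limit

# The spray record from the earlier scenario: 1.55 L/ha, estimated +/- 0.3:
print(exceeds_limit(1.55, 0.3, 1.50))  # False -- 1.25 to 1.85 straddles the limit
print(exceeds_limit(2.00, 0.3, 1.50))  # True  -- even 1.70 is above the limit
```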
Why This Matters to Us
We have encountered false precision firsthand in the plant protection domain, where "about 1.5 litres per hectare" becomes 1.50 L/ha in a database and triggers compliance flags for violations that never happened. At TaiGHT, our quality and statistics background means we think about measurement uncertainty as a first-class concern, not an afterthought. When we build data systems, we design them to preserve context alongside values.
This is the intersection of industrial measurement discipline and software engineering. If your organisation digitises data that originated as approximations and you are concerned about what gets lost in translation, we have practical experience with exactly that problem.
References
- Shewhart, W. A. (1931). Economic Control of Quality of Manufactured Product. Van Nostrand.
- Deming, W. E. (1986). Out of the Crisis. MIT Press.
- Hand, D. J. (2020). Dark Data: Why What You Don't Know Matters. Princeton University Press.
- Wheeler, D. J. (2000). Understanding Variation: The Key to Managing Chaos. SPC Press.
- Redman, T. C. (2008). Data Driven: Profiting from Your Most Important Business Asset. Harvard Business Press.
- JCGM (2008). Evaluation of Measurement Data: Guide to the Expression of Uncertainty in Measurement (GUM). Joint Committee for Guides in Metrology.