Factors of Data Quality Decline
In this post, I look at how a theoretical perspective on what causes data quality decline can help with very specific, practical steps to improving data integrity, controls, processes and knowledge.
Start with the basics
Data quality errors exist because something broke down at a data "moment of truth", which is any opportunity to capture, correct or update data. An error is what is left behind when one of those opportunities goes awry.
Let's say some event happens that could affect a data entity, such as needing to create an address in a CRM system for a new customer. Whatever the event, the required outcome is logically either 'do nothing' or 'do something'. In our example, the desired outcome is clear: do something - i.e. create the address.
When the event response happens, however, there are three options: we can still 'do nothing' as before, but doing 'something' can mean doing it correctly or incorrectly, such as with the wrong house number. We can summarise that in grid form (two possible requirements against three possible outcomes) and see how well the event requirement and event outcome align.
We can see that two of the six combinations are fine: either we do nothing when we should, or something needs to be done and it's done correctly. Another scenario (no action needed, yet an action performed 'correctly') is not applicable as it doesn't really exist - or else is a duplicate of correctly doing nothing.
This leaves us with the following error combinations:
No action was needed but an action was taken, resulting in data becoming incorrect.
An action was needed but none was taken, resulting in obsolete and incorrect data.
Action was needed but the wrong action was taken, resulting in incorrect data.
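To make the grid concrete, here is a minimal sketch that enumerates the six requirement/outcome combinations and labels each one as described above. The names (Requirement, Outcome, classify) are my own illustrative choices rather than part of any established model.

```python
from enum import Enum


class Requirement(Enum):
    DO_NOTHING = "do nothing"
    DO_SOMETHING = "do something"


class Outcome(Enum):
    NOTHING_DONE = "nothing done"
    DONE_CORRECTLY = "done correctly"
    DONE_INCORRECTLY = "done incorrectly"


def classify(requirement: Requirement, outcome: Outcome) -> str:
    """Label one cell of the requirement/outcome grid."""
    if requirement is Requirement.DO_NOTHING:
        if outcome is Outcome.NOTHING_DONE:
            return "ok"
        if outcome is Outcome.DONE_CORRECTLY:
            # Treated as not applicable: there was nothing to do correctly.
            return "not applicable"
        return "error: no action needed but action taken"
    # From here on, something needed to be done.
    if outcome is Outcome.NOTHING_DONE:
        return "error: action needed but none taken"
    if outcome is Outcome.DONE_CORRECTLY:
        return "ok"
    return "error: action needed but wrong action taken"


# Build the full grid: 2 requirements x 3 outcomes = 6 combinations.
grid = {(r, o): classify(r, o) for r in Requirement for o in Outcome}

for (r, o), label in grid.items():
    print(f"{r.value:13} | {o.value:16} | {label}")
```

Two cells come out as fine, one as not applicable and three as errors, matching the three error combinations listed above.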
The factors, summarised
Ok, process performance fluctuates but in principle, why do those things happen?
The table below summarises the causes of these incorrect outcomes (where 'x' identifies a resultant error being created by a given causal factor):
Passive Decline tends to be down to individual users, whether consciously or sub/unconsciously:
Apathetic decline is where a user knows they ought to take an action but chooses not to. This is often because of the time or complexity involved, but it can also be a case of not caring enough or failing to understand the impact of poor quality.
Adversarial decline is where a deliberate error is caused for whatever reason. While rare, these can affect any of the requirement/outcome combinations. These will tend to be short-term in nature though can cause quite significant reputational and financial damage.
Misjudged decline is where the user takes a well-intended action or non-action but has interpreted the scenario incorrectly and therefore followed a wrong course. These may be isolated but can be common if the misjudgment is based on an underlying misunderstanding (and which can also affect groups of users). That said, misjudgment is more likely for infrequent processes so trends may be harder to establish.
Misinformed decline is where a whole team's processes are based on a misjudgement of how to handle a specific scenario, or where the training given is erroneous in some way. These cases tend to affect lower-frequency processes but, because they affect entire teams, they can have a significant effect on data before the misunderstanding is rectified.
Active Decline tends to be more systemic. It can be constructive - where the action is a deliberate choice - or it can be constrained, which is where there are enforced limitations on performing the outcome correctly:
Inactive decline (although seemingly an oxymoron in the 'active' category) is where an organisation chooses not to act on certain events or aspects of them. Typically, this will extend to particular fields within an overall update rather than any update at all, often because the organisation does not value or maintain the affected data items.
Compulsive decline is where over-action happens, most often as a consequence of data fix activities, where a high-volume update sweeps up cases that should not have been included or otherwise applies incorrect values. This tends to be episodic but can have a significant impact when it occurs. This category also includes system defects that enforce incorrect data updates within a process.
Blocked decline is where the correct update is not possible because of constraints, such as limited system capability or users lacking the access they need to perform the actions they wish to take. It is similar to inactive decline and often arises from prioritisation calls on system scope.
Dependent decline is where the update is performed by an external party (whether or not that party was also the performer of the original event that triggered the requirement). This can include significant delays in getting updates performed or having errors rectified.
Putting the table to use
So what does this tell us?
You may see the table as somewhere between a thought experiment and metaphysics. Basically, something that may have theoretical value but not so much practical application. Unsurprisingly, I don't take that view.
Yes, for any given error (and we'll come to detection later) it may be difficult to isolate why something happened. But through your issue management processes, you may see some trends that help with that, especially with the higher impacting causes.
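As a rough sketch of how such trends might be surfaced, suppose each issue logged in your issue management process is tagged with a suspected causal factor. The issue records, the "factor" field name and the sample data below are all hypothetical. Only the eight factor names and their passive/active grouping come from the descriptions above.

```python
from collections import Counter

# Causal factors from this post, mapped to their broad category.
CAUSAL_FACTORS = {
    "apathetic": "passive",
    "adversarial": "passive",
    "misjudged": "passive",
    "misinformed": "passive",
    "inactive": "active",
    "compulsive": "active",
    "blocked": "active",
    "dependent": "active",
}


def summarise_issues(issues):
    """Count issues by suspected causal factor and by passive/active category.

    Issues with a missing or unrecognised factor are counted as 'unknown',
    since the cause of a given error is often hard to isolate.
    """
    by_factor = Counter()
    by_category = Counter()
    for issue in issues:
        factor = issue.get("factor")
        if factor not in CAUSAL_FACTORS:
            factor = "unknown"
        by_factor[factor] += 1
        by_category[CAUSAL_FACTORS.get(factor, "unknown")] += 1
    return by_factor, by_category


# Invented sample data, purely for illustration.
sample_issues = [
    {"id": 1, "factor": "blocked"},
    {"id": 2, "factor": "blocked"},
    {"id": 3, "factor": "apathetic"},
    {"id": 4, "factor": "compulsive"},
    {"id": 5, "factor": None},  # cause not yet established
]

by_factor, by_category = summarise_issues(sample_issues)
print(by_factor.most_common())
print(by_category.most_common())
```

Even a crude tally like this can show whether your errors cluster around systemic causes or individual ones, which is usually enough to decide where to look first.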
Similarly, when you are undertaking process improvement initiatives you can use these threats as cues for how to assess, define and prioritise improvements. In fact, you don't need real examples for that. You could use the table as a series of prompts to challenge how effective your operational controls are (or your procedures, your training and so on).
If you have deployed data governance, there is a good chance that you already understand your data priorities as part of developing robust metadata (your 'data about data'). Considering these causal factors might inform your risk definition and mitigation approaches, while also helping you develop your 'as is' and 'to be' technical landscapes.
Good data governance can help you prioritise your data, understand it better, maintain the integrity of it, improve the efficiency of your processes and inform your projects' scope. Indeed, good governance can be crucial to changing your culture to one that is data driven, which in turn reduces the likelihood of a number of these causal factors significantly.
Error detection and correction
Once an error has occurred, you could argue that a new event requirement exists automatically: to correct the error. While that state persists, the next 'moment of truth' occurs upon any subsequent interaction with the same data. And it's when that moment of truth happens that we have our opportunity to 'do the right thing'.
Whether that is possible or not often depends on whether the error is clear to the process/user and on the context. If the user cannot see the error - or what should go in its place - then the problem persists. By context, I'm referring to the original causal factor: if blocked decline was in evidence before, so it may also be now; if the complexity and time needed to fix it put the previous user off making an update, the same may be true now.
We know poor quality data can be very hard to spot. Sure, we may get invoices returned from having an incorrect address, that kind of thing, but it may not always be apparent what the correction should be. Costs of poor quality can be very high and prevention is the most effective remedy, where possible - and understanding these causal factors helps with that. That is not to say that all errors can or should be eliminated. Clearly, how much sophistication to build into systems is a judgement. But if you are relying on people to take the correct action, understand what support they need to help them know what to do, and when.
There is still a useful design or analytical question: if this data is wrong, how would we know? It may be that there is no practical way of knowing that. It may also be that there is no natural or reliable feedback loop to identify it but posing the question helps to understand the inherent risks. It also helps in the overall paradigm shift towards managing quality and understanding the underlying integrity of your data - another topic in its own right.
If you could identify errors and what their corrective value should be, you could take a proactive approach to fixing them. Taking a reactive approach would be to rely on feedback processes, which may be from customers and/or external parties. But it is also possible to take a hybrid approach, where you can highlight the possibility of error (or rather the probability).
With large datasets, analysing changes and the trends within them can help you determine where errors are most likely, especially in the more endemic causal cases. This works best if you are able to isolate samples where you know whether the data was found to be incorrect. Learning algorithms can look across a range of data items and interpret patterns to flag cases that may prove problematic.
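The hybrid approach can be sketched very simply. The example below is not a learning algorithm. It is a plain frequency baseline that estimates an error rate per segment from a reviewed sample and then flags unreviewed records in high-risk segments. The field names ("source", "was_incorrect"), the threshold and the sample data are all assumptions for illustration, and a real model would look across many more data items.

```python
from collections import defaultdict


def error_rates_by_segment(labelled, segment_key):
    """Estimate the observed error rate per segment from reviewed records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in labelled:
        segment = record[segment_key]
        totals[segment] += 1
        if record["was_incorrect"]:
            errors[segment] += 1
    return {segment: errors[segment] / totals[segment] for segment in totals}


def flag_probable_errors(records, rates, segment_key, threshold=0.5):
    """Return records whose segment has an error rate at or above the threshold.

    Segments never seen in the reviewed sample are treated as rate 0.0,
    which is a simplifying assumption: in practice you may prefer to
    route unseen segments for review instead.
    """
    return [r for r in records if rates.get(r[segment_key], 0.0) >= threshold]


# Invented reviewed sample: records from a bulk fix were wrong more often.
reviewed = [
    {"source": "bulk_fix", "was_incorrect": True},
    {"source": "bulk_fix", "was_incorrect": True},
    {"source": "bulk_fix", "was_incorrect": True},
    {"source": "bulk_fix", "was_incorrect": False},
    {"source": "web_form", "was_incorrect": True},
    {"source": "web_form", "was_incorrect": False},
    {"source": "web_form", "was_incorrect": False},
    {"source": "web_form", "was_incorrect": False},
]

unreviewed = [
    {"id": 101, "source": "bulk_fix"},
    {"id": 102, "source": "web_form"},
    {"id": 103, "source": "phone"},
]

rates = error_rates_by_segment(reviewed, "source")
flagged = flag_probable_errors(unreviewed, rates, "source")
print(rates)
print([r["id"] for r in flagged])
```

The point is not the arithmetic but the shape of the process: use known outcomes to estimate where errors are likely, then direct attention to the records most at risk rather than waiting for feedback to arrive.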