The case for Data Governance

Data quality is a real challenge in the UK water industry, especially in the newly competitive non-household sector, where national customer contracts are bringing together information from water companies across the country, with diverse data sets and different ways of working.


Yes, the standards introduced with market opening provide a basis for uniformity but it will take time to raise the overall confidence in data, much of which is many years old. Historical data are being put to uses that would hardly have been considered when the information was first captured.  

Ownership of the data has been rethought, with the separation of billing and customer services introducing a new stakeholder to represent customers’ interests. That creates a clear dependency between wholesaler and retailer, each with an imperative to address issues quickly and efficiently.

Right now, the market is experiencing growing pains as it matures from the legacy 'silo' world to one that is better connected and more scrutinised. There is a lot of cleansing activity still happening but, in the long run, investment in data cleansing can be wasteful unless the causes of inaccuracy are addressed in the first place.

The case for good quality data is clear: more accurate bills, better cash flow, more efficient processes, stronger reputations...  More robust data sets open up more possibilities for machine learning and improved analytics.  That means better informed decisions, made with greater confidence. 

It’s all good - but achieving it can be hard, especially with so much of the critical data being under someone else’s control. It takes discipline and focus; it takes effective data governance.

For the purpose of this introduction, let’s assume that companies do not pick and choose when to resolve issues based on whether the change is in their favour or not. That’s a Game Theory post for another time but an effective data governance framework should help expose whether or not that happens in any case.

So what is Data Governance and what does it look like?

Data Governance is often described as being the policies, processes and responsibilities for data within an organisation. Success is heavily dependent on senior buy-in to what is going to be achieved and how.

Implementing Data Governance usually requires a senior level of accountability for data governance itself.

This person can be CXO level or near - how near often depends on the size of the company. The main thing is that it needs to be a person who is in a position of authority and who is committed to the outcomes. For now, let's call that person the Data Governance Officer.  Let's also assume it's a bolt-on role, so we can prefix that with 'Chief' if she is already at C-level.  It ought not to matter which part of the business she represents in her 'day job' once things are established - it needs a company-wide outlook, after all.

While she leads the overall initiative, other senior managers take responsibility for 'Data Domains' - which are broad classifications of data into groups. They act, again on a part-time basis, as stakeholders for change initiatives and operational improvements that may affect their domain(s). We'll call these Data Owners; they do not need to be directly responsible for the data in question. Indeed, it is rare for critical data to be created, read, updated and deleted (or archived) - CRUD - within just one area, so that would be hard to achieve anyway. Given the choice, it can be more effective for a Data Owner to represent (through their usual role) those that use the data rather than those that create or update it.

How many data domains should you have and what should they be?

In short, as many as feel right without being so granular that it makes things cumbersome. All that really matters is that you make it fairly intuitive for people to know what falls within each domain. That said, deciding the domains has been a source of much debate in the governance programmes I have worked on and people can be concerned about the significance of getting it 'wrong'. Yes, it can be hard to change something once it has been communicated but it can be undone if really needed. There is no 'right' answer so do not agonise unduly. Have a good rationale and follow it through.

In fact, the idea for our Market Tube Map came out of thinking about how to best answer this question.

As much as the Tube Map helps to look up specific transactions, it can also suggest how to establish domains. Sure, organisations may need extra ones if they want to include more than just the market-facing data, such as employees, assets and so on. But there must be commonality in the market space and a common framework will make it easier to develop effective practices while learning from the experience of others. The consistency can also be better for benchmarking and working with other industry parties.

Water Competition Tubemap

Ok, so where next?

It's all a bit theoretical still: what we need is to apply it to real life.  We do that by extending the metadata to the major processes in which we need to be successful (e.g. for a retailer one of those would be registering a customer).  We call these the 'apex' processes because you can drill down further into more specific processes and subsequently into the tasks and activities that impact on the data.

The main thing we are looking to understand here is who is impacting upon the data in each domain. Remember CRUD from above?  We capture that at apex level first, then drill down into specific processes, which can usually be assigned to specific parts of the organisation. Now we start to get an objective sense of the stakeholders to our data domains and understand who is affected downstream by what is done in a given process - and who is going to be affected by the awesome business changes we have planned.
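As a minimal sketch of how that CRUD capture might be recorded and queried, the Python below holds one row per process, team and domain. The process, team and domain names are hypothetical placeholders rather than anything taken from the market codes or the Tube Map, and a spreadsheet would serve just as well.

```python
from collections import defaultdict

# Hypothetical apex processes, teams and data domains, for illustration only.
# Each row records which CRUD operations a process performs on a domain.
CRUD_MATRIX = [
    # (process, team, domain, operations)
    ("Register customer", "Onboarding", "Customer", "CU"),
    ("Register customer", "Onboarding", "Premises", "R"),
    ("Submit meter read", "Metering", "Meter", "CRU"),
    ("Produce bill", "Billing", "Meter", "R"),
    ("Produce bill", "Billing", "Customer", "R"),
]


def stakeholders(domain):
    """Return {team: set of CRUD letters} for every team touching the domain."""
    result = defaultdict(set)
    for _process, team, dom, ops in CRUD_MATRIX:
        if dom == domain:
            result[team].update(ops)
    return dict(result)


def downstream_of(process):
    """Return other processes that read a domain this process creates, updates or deletes."""
    written = {
        dom
        for proc, _team, dom, ops in CRUD_MATRIX
        if proc == process and set(ops) & set("CUD")
    }
    return sorted(
        {
            proc
            for proc, _team, dom, ops in CRUD_MATRIX
            if proc != process and dom in written and "R" in ops
        }
    )
```

Calling `stakeholders("Meter")` lists the teams with an interest in that domain, while `downstream_of("Register customer")` shows which processes consume what registration writes. Drilling down from apex level simply means adding more specific rows.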

We have already assumed a little 'body of knowledge' at this point: we knew enough about our data to classify it and assign ownership to it. In data management, this is called 'metadata' - data about data.

We were able to do this because of two things: the market data has been specified in the Codes (mostly in CSD0301) and the Tube Map gives domain-wise context to that data.  We'd want to extend that to reflect criticality of different data but, again, that can be informed by a deep understanding of the market processes: Brodick has a starter for that also.


At this stage, we have something that looks like this:

What do we fix first?

Does anything even need fixing? Probably. And we know which areas need focus because we have suitable measures in place to track how things are going. Plus we keep tabs on the issues that we face and the steps we are taking to address them. At least, if we didn't do that before, we need to start doing it now. That creates a feedback cycle for us to monitor improvement against our priorities and tweak our approach as we go.
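To make that feedback cycle concrete, here is a small sketch of an issue log with two simple measures: open issues per domain and average time to resolve. The structure, field names and sample entries are all assumptions made up for illustration; real measures would depend on your own priorities.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Issue:
    domain: str                      # data domain the issue belongs to
    opened: date                     # when the issue was logged
    resolved: Optional[date] = None  # None while the issue is still open


# Made-up sample entries, purely to exercise the measures below.
ISSUES = [
    Issue("Meter", date(2018, 1, 1), date(2018, 1, 11)),
    Issue("Customer", date(2018, 2, 1), date(2018, 2, 21)),
    Issue("Meter", date(2018, 3, 1)),
    Issue("Meter", date(2018, 3, 5)),
]


def open_count_by_domain(issues):
    """Count unresolved issues per domain, to show where focus is needed."""
    counts = {}
    for issue in issues:
        if issue.resolved is None:
            counts[issue.domain] = counts.get(issue.domain, 0) + 1
    return counts


def mean_days_to_resolve(issues):
    """Average days from opened to resolved, or None if nothing is resolved yet."""
    days = [(i.resolved - i.opened).days for i in issues if i.resolved is not None]
    return sum(days) / len(days) if days else None
```

Re-running the same measures each month gives a simple trend to review against priorities.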

Some things will take longer to address - maybe we have dependencies on other parties, on system change or perhaps on the market rules being updated.  The monitoring framework helps us manage expectations about how things are and how they will improve. It is also a useful reminder that not all of the apex processes relate to the core business: we need to consider business planning, managing change and the technical aspects of data. So we include those areas in our framework as well.

Now that we know how a broad community gathers, stores and uses data we need to ensure that people are working to a common understanding of what is expected of them.  A good option for this is to have data champions in most areas of the business to help with these messages.  It's another part-time role and should only require periodic input but often they will be someone who understands their processes well and is regarded highly by their peers.

A data governance framework

With all the key elements in place, we have a framework that looks like this:

Ok, we have simplified the process somewhat in this summary. Achieving all of that would take a lot of hard work and a very strong commitment at the outset. That commitment would be based on benefits that are very difficult to quantify. Yes, we can measure costs of failure fairly well, at least in terms of issue resolution and penalties. But quantifying the value of better informed decisions or enhanced reputation requires far more conjecture, so these are usually treated as non-financial benefits.


In short, the commitment is a leap of faith. That is not to say it is wrong, but it is certainly not right for all companies. For some large companies, taking a methodical, waterfall-style approach may be what's needed, not least so that it can attract the investment and support it requires by following the same process as for other change initiatives. A stage-by-stage approach can help bring stakeholders along and gain traction.

And there are tactical options: you could wrap governance around one or two domains, for example. 

That way, the value can be demonstrated before extending the investment. That can work but often is predicated on an 'all or nothing' basis: projects either continue to completion or get stopped in their tracks somewhere along the way.

We believe that there is a more efficient, more effective approach that can mean calibrating the framework to offer exactly what is needed and no more. We think of this as 'Lean Data Governance'.

Lean data governance

Conceptually, lean data governance inverts that triangle above: don't start at the top and work down but instead build upwards and outwards while solving real issues, implementing incrementally and organically. Try certain aspects: if something works, adopt it. If not, why not? Do you drop it or change it? In time, you will land on an approach that is exactly right for your organisation and culture.

Some of the framing tools and techniques are still needed - metadata in particular. But that can be targeted to focus on the key challenges being experienced rather than casting the net broadly.  Thinking about the overall framework helps to explain the road ahead but doesn't have to constrain you to a particular pace or prevent you from resting up and consolidating once you feel you have gone as far as you need.


How Brodick can help

We have designed a structured programme - GOVERN - that will help organisations implement a market-focused Lean Data Governance framework.

The programme is expected to last 12 months, with our light-touch involvement reducing gradually over that time. Most of the work is done by your own teams with support, guidance and tools provided by us to help you on your journey. You can shorten or extend the programme as required or 'top up' how much support you require.

We provide:

  • Kick-off workshops and sessions to help make a strong start, including 'deep dives' into specific issues

  • Pre-packaged, spreadsheet-based metadata derived from market documentation and our insights

  • Tools and processes for how to capture who does what to which data

  • On-site support days to help with ongoing implementation advice and assurance

  • Off-site support time to help with ongoing implementation advice and assurance

We also offer an additional service for performing analysis of Market Data Sets to complement your issues and quality tracking work and can support you in developing measures for your internal data sources.

We take a proactive approach to developing the framework, tools and materials further.