What is metadata?
In my introduction video to data governance, I talk about the need for metadata - usually described as "data about data". But what does that mean exactly?
For me, there are two types of metadata: what I call “contextual” and “architectural” metadata. I find these can often get confused, especially in asset management environments.
Contextual metadata is information which tells you more about a specific instance of data. For example, a polygon in a GIS might have metadata that tells you when it was last updated, who by, its version number, that kind of thing. If you watch crime shows on TV, they’ll talk about wanting metadata for text messages: Who sent it? To whom? When? Where from? These all tell you certain things about a particular object or event: the specific polygon or text message. This tends to be the kind of metadata most people have heard about.
Architectural metadata, on the other hand, takes a step back. It is information about all instances of a certain data item or information type. What format is the data in? Are there valid values that the data can have? If it’s a number, how many decimal places? That kind of thing.
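To make that a little more concrete, here is a minimal Python sketch of architectural metadata as a set of rules that every instance of a data item must follow. The `FieldSpec` class, the `land_use` and `area_ha` attributes and their allowed values are all made up for illustration; a real system would define its own.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """Architectural metadata: rules that apply to every instance of a data item."""
    name: str
    data_type: type
    allowed_values: Optional[frozenset] = None  # None means any value of the type
    decimal_places: Optional[int] = None        # only checked for Decimal values

    def is_valid(self, value) -> bool:
        # Format: is the value of the expected type?
        if not isinstance(value, self.data_type):
            return False
        # Valid values: is it one of the permitted values, if any are listed?
        if self.allowed_values is not None and value not in self.allowed_values:
            return False
        # Precision: does a Decimal carry more places than allowed?
        if self.decimal_places is not None and isinstance(value, Decimal):
            exponent = value.as_tuple().exponent
            # NaN and infinity have a non-integer exponent, so reject them
            if not isinstance(exponent, int) or -exponent > self.decimal_places:
                return False
        return True


# Hypothetical specs for two attributes of a GIS polygon
land_use = FieldSpec(
    "land_use", str,
    allowed_values=frozenset({"residential", "commercial", "park"}),
)
area_ha = FieldSpec("area_ha", Decimal, decimal_places=2)

checks = [
    land_use.is_valid("park"),            # a permitted value
    land_use.is_valid("swamp"),           # not in the allowed set
    area_ha.is_valid(Decimal("12.50")),   # two decimal places
    area_ha.is_valid(Decimal("12.505")),  # three decimal places
]
```

Notice that nothing here describes any particular polygon. The specs say what every polygon's land use and area must look like, which is the step back that separates architectural metadata from contextual metadata.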
In data governance, we’re interested in the latter of these types: architectural metadata. In fact, for governance purposes, contextual metadata is not so much metadata as just data. Just as an 'address' entity has data attributes such as house number, street name and post code, so do GIS polygons and text messages.
For governance, a GIS polygon is a data entity with some data items (attributes) that include when it was last updated, who by, what version, etc. A text message is a data entity that has certain attributes: the message, the sender, the receiver, the time it was sent, the location it was sent from, etc. These are just more data items in a related entity.
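As a rough sketch of that view, the text message could be modelled as a single entity in Python. The `TextMessage` class, its field names and the sample values are illustrative assumptions, not taken from any real system.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timezone


@dataclass
class TextMessage:
    """A data entity: the 'metadata' from the crime show is just more attributes."""
    body: str          # the message itself
    sender: str        # who sent it
    receiver: str      # to whom
    sent_at: datetime  # when
    sent_from: str     # where from


# A made-up instance, using fictional phone numbers
msg = TextMessage(
    body="Running late",
    sender="+44 7700 900001",
    receiver="+44 7700 900002",
    sent_at=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    sent_from="London",
)

# From a governance point of view, all five are attributes of one entity
attribute_names = [f.name for f in fields(TextMessage)]
```

The message body gets no special treatment here. The sender, time and location sit alongside it as ordinary attributes, each of which could have its own architectural metadata.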
Unfortunately, what most people think of when they hear 'metadata' is not necessarily what the data governance team has in mind, so watch out for that confusion.
Going beyond architectural metadata
While architectural metadata may look technical in nature, there is far more that can and should be known about the data that a business has. We'll end up establishing who uses what data, how they use it, and which information is the most critical. We'll start to look at what controls are in place now, what controls are needed in the long run, and what 'good' looks like. Metadata is basically a body of knowledge covering all aspects of your data. It underpins both the understanding of the current state of play and the move towards achieving business goals in the future.