Data Quality and Ownership Are the Foundation of All BI Initiatives


How many times have you looked at a report or dashboard and immediately questioned the accuracy of what you were seeing?  If your first reaction is to blame the software, you’re looking in the wrong place.  Most Business Intelligence (BI) initiatives fail not because the software is wrong for the job, but because people don’t pay enough attention to data quality.  To make information actionable for business improvement, which is the real goal of any BI initiative, you need a good data governance program in place.  Only when you have a good handle on the raw data can you turn your attention to how the information is presented.  Going a step further, you can’t even begin to think about advanced analytics and machine learning without top-quality data.

Most companies end up with a data quality problem before they even realize it.  If you’re in a consumer-facing business, how many applications do you have that contain the same information about your customer?  Sales will have one set and your account record system another; add in your customer service application and your billing systems, and you can see how this can get out of control.  Unless these are all in one application, you’ve got a data quality issue.

In the commercial and corporate real estate world, where I spend my time these days, organizations have multiple systems that contain building, tenant, square footage, rent, lease and headcount information, just to name a few.  If you’re a rapidly growing company, your first priority is finding space for your workers and getting your product out the door.  Customer service and time to market are your focus, not, for example, mastering the data in your real estate systems.

No matter your business, you need to set up a good data governance program for all of your systems before embarking on a BI or data analysis initiative.  In particular, you need to:

  1. Identify the key data points that are most important to you.  Don’t try to boil the ocean by starting with every piece of data; determine the key pieces of information you care about most.  You can usually find these in the reports you look at today, or in the reports and dashboards you would like to see.
  2. Most likely, some of that information will live in more than one application.  Determine which system will be your master.  This is typically the system that captures the ongoing changes to your key data points.  For a customer, it’s likely your account management system.  For a building, it’s usually your space planning application, where CAD drawings and other up-to-the-minute changes reside.
  3. Determine who owns the data.  The owner is the group that has business ownership of the data.  In many cases this is not the group that maintains the data, since maintenance is often done as a service by other departments or organizations.  This is one of the most important steps in improving your data.  Without a recognized and agreed-upon owner, no one steps up to ensure the data is accurate.
  4. Once you’ve got an owner, you can determine who is best placed to maintain the data.  This “data steward” is responsible for keeping the information as accurate and timely as possible.  The steward works under the direction of the owner when conflicts or questions arise.
  5. Define your process for ensuring the data stays clean and is updated in a timely manner, and for resolving conflicts when questions arise.  Data governance is not a “create it and leave it” program.  It needs constant nurturing as new data points are added, systems are added or replaced, and new business needs emerge.
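
To make the steps above concrete, here is a minimal Python sketch of a governance catalog that records, for each key data point, its master system, owner and steward, and flags any data point that still lacks one of them. The class name, field names, system names and departments are all hypothetical, chosen only for illustration; a real program would more likely keep this in a governance tool or shared register.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPoint:
    """One key data point and who is accountable for it (illustrative fields)."""
    name: str                  # e.g. "square_footage"
    master_system: str         # the system of record chosen in step 2
    owner: str                 # business owner from step 3
    steward: str               # maintainer from step 4
    other_systems: tuple = ()  # other applications holding a copy


def ungoverned(points):
    """Return the names of data points missing a master system, owner or steward."""
    return [
        p.name
        for p in points
        if not (p.master_system and p.owner and p.steward)
    ]


# Hypothetical catalog: one fully governed data point, one still unassigned.
catalog = [
    DataPoint("customer_address", "account_management",
              "Sales Operations", "Customer Service", ("billing", "crm")),
    DataPoint("square_footage", "space_planning", "", ""),
]

print(ungoverned(catalog))  # ['square_footage']
```

Reviewing a list like this on a regular schedule is one simple way to keep the program from becoming “create it and leave it”.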

With respect to Machine Learning, data quality is a fundamental necessity.  If you feed an algorithm bad data, you’ll get bad results and you won’t even know it.  Don’t even go there unless you’ve got a good foundation with data governance at the core.

More tools are available today to support a good data governance program.  These tools can help highlight inconsistencies, duplicates and anomalies that require attention, and they can be a valuable aid to your data stewards or analysts.  Still, they should not be deployed unless you’ve got a good program set up, with top-down organizational buy-in.
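
As a rough illustration of the kind of check such tools perform, the Python sketch below compares records held in other applications against the designated master system and reports mismatches for a data steward to review. It is not modeled on any particular product; the function name, system names, building IDs and values are invented for the example.

```python
def find_conflicts(master, others, fields):
    """Compare each non-master system against the master, record by record.

    master: {record_id: {field: value}} from the system of record
    others: {system_name: {record_id: {field: value}}}
    Returns a list of (system, record_id, field, detail) tuples.
    """
    conflicts = []
    for system, records in others.items():
        for key, record in records.items():
            if key not in master:
                # The record exists elsewhere but not in the master system.
                conflicts.append((system, key, None, "missing in master"))
                continue
            for field in fields:
                if field in record and record[field] != master[key].get(field):
                    detail = f"{record[field]!r} != {master[key].get(field)!r}"
                    conflicts.append((system, key, field, detail))
    return conflicts


# Hypothetical data: the space planning system is the master for buildings.
master = {"B-100": {"square_feet": 52000, "tenant": "Acme"}}
others = {
    "lease_admin": {
        "B-100": {"square_feet": 51500, "tenant": "Acme"},
        "B-200": {"square_feet": 10000, "tenant": "Globex"},
    }
}

conflicts = find_conflicts(master, others, ["square_feet", "tenant"])
for row in conflicts:
    print(row)
# ('lease_admin', 'B-100', 'square_feet', '51500 != 52000')
# ('lease_admin', 'B-200', None, 'missing in master')
```

A report like this only tells you that two systems disagree; deciding which value is correct is still the job of the owner and steward, which is why the program has to come before the tooling.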

Data governance is not a sexy concept, and many organizations don’t take the extra time and effort to set it up.  If you’re serious about your Business Intelligence initiative, don’t be that lazy organization.  Take the time and effort up front, and you’ll end up with a more successful BI program.