The Big Problem of Big Data in an Online World
It seems one cannot have a discussion about Big Data without bringing up the proverbial needle in a haystack analogy.
You know the comparison.
In a stack of Big Data, how do you possibly go about finding the needle, the piece of key insight you are looking for that will allow you to anticipate your customers’ needs and delight them?
In too many organizations, however, this take on the haystack/needle hunt is putting the cart a bit before the horse. After all, how do you know the needle--should you be able to even find it--is what you really want in the first place?
Stick (pun intended) with me for a minute and consider the following real-world scenario I watched play out with a company not long ago.
The Needle May be the Problem
This company rolled out a completely new version of their web app. Teams were busy testing and monitoring the impact of the redesign. Product Management claimed the rollout was a great success, citing customer feedback and user experience. They confidently began to push to sunset the old version. In an effort to be data-driven, analysts were asked to produce a few reports around the total visitor sessions for the new version vs. the old version of the app over the past months.
“I was shocked by the numbers,” the lead analyst told me. “It seemed that the sessions for the old app version were not as low as they ought to be in comparison to the new version; they were only slightly lower. More interestingly, other key metrics we were interested in were higher! The numbers didn’t support the PMs’ recommendation, and I had no choice but to recommend against sunsetting the old version.”
The needle in the haystack suggested that user adoption for the new app version needed work.
But the needle had problems. Big ones.
An audit of the sites in question found that the old app pages had 300-500% data inflation caused by duplicate web analytics tags. If a web analytics tag is coded in multiple places on a page, transposed from other pages or duplicated in some other way, this sort of thing can happen regardless of whether a tag management solution is being used or not.
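To make the arithmetic concrete, here is a minimal Python sketch of what such an audit might check. It is an illustration only, not the audit process the company used: the page markup, the example.com URLs and the function names are all hypothetical, and it assumes that every copy of a tag sends one hit per page load, so a tag present N times overstates a metric by (N - 1) x 100 percent. Tags injected dynamically at runtime would not be caught by a static scan like this.

```python
from collections import Counter
from html.parser import HTMLParser


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag on a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_duplicate_tags(html):
    """Return {src: count} for script sources that appear more than once."""
    parser = ScriptSrcCollector()
    parser.feed(html)
    counts = Counter(parser.sources)
    return {src: n for src, n in counts.items() if n > 1}


def inflation_percent(fires_per_pageview):
    """Percent overstatement, assuming each tag copy sends one hit per load."""
    return (fires_per_pageview - 1) * 100


# Hypothetical page: the same analytics tag was pasted in four times.
page = """
<html><head>
<script src="https://analytics.example.com/tag.js"></script>
<script src="https://analytics.example.com/tag.js"></script>
</head><body>
<script src="https://cdn.example.com/app.js"></script>
<script src="https://analytics.example.com/tag.js"></script>
<script src="https://analytics.example.com/tag.js"></script>
</body></html>
"""

duplicates = find_duplicate_tags(page)
for src, n in duplicates.items():
    print(f"{src} appears {n} times; ~{inflation_percent(n)}% inflation")
```

Under that assumption, a tag that appears four to six times on a page would produce the 300-500% range the audit reported.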
Unfortunately, this knowledge wasn’t available when the analyst made his recommendation, or when executives, relying on the flawed needle from the haystack, made a revenue-impacting decision based on bad data.
When it comes to Big Data, the needles in the haystack we should be most concerned about aren’t the gems of insight we all so eagerly want. Those will come later. For now, we need to consider the needles as a negative: the errors that slip in through Big Data systems and processes that most companies are not yet effectively governing.
Too Many Haystacks, Too Many Needles
In order to trust your data, you need to know it is accurate, which means you need to catch the errors in the analytics implementations across all your digital properties before your analysts pass bad information into your board meetings.
But it is hard to find a needle (an error) in a haystack (a high-volume flow of data), and it is much harder when you have many haystacks (data flowing through multiple technologies and platforms).
Errors Creep In
How do all those needles infiltrate all those haystacks in the first place?
Errors creep in because change is constantly happening somewhere in your digital properties. And, left unchecked, errors can snowball and pose a serious hidden financial risk, resulting in the wrong decisions being made or the right opportunities being missed.
You are left with two options: either double-check and weed out the needles, or brace yourself for the risks you know are lurking in your Big Data and the costs you will incur because of them.
Questionable Data is Expensive
The costs of making poor decisions based on poor digital data have wide-reaching ripple effects across an organization. Poor data quality fosters competitive disadvantage, bad strategy, lost productivity, damaged customer relationships and financial loss. Some reports, from organizations like TDWI, estimate that poor customer data costs U.S. firms $611 billion each year.
Clearly, with so much at stake, web analysts dedicate significant portions of their time to checking their data collection technologies and implementations.
However, manually inspecting analytics code, spot checking data collection and vetting reports before they can be used still drains time, resources and, ultimately, revenue.
Companies spend too much time preparing data, when more time should be spent analyzing it.
According to Forrester Research, 42 percent of surveyed analysts spend more than 40 percent of their time vetting and validating data. For executives, 70 percent of those surveyed spend 40 percent of their time vetting and validating data.
The Biggest Challenge
So how do we eliminate the needles in our haystacks and reclaim the time spent torturing and vetting data by hand? More comprehensive data governance technology and processes seem to hold promising answers, but organizations have to first overcome the biggest challenge to data quality that they face: themselves.
“The biggest problem organizations face around data management today actually comes from within,” says Thomas Schutz, SVP, General Manager of Experian Data Quality. “Businesses get in their own way by refusing to create a culture around data and not prioritizing the proper funding and staffing for data management.”
The Way Forward
Organizations, then, can no longer afford to make a half-hearted attempt at data governance. They can no longer live with the needles in their haystacks.
Looking ahead, companies will continue to face more complex Big Data challenges, and ensuring they are wrangling accurate data in the first place will be key to meeting each challenge and maintaining a competitive edge.
There are five key data governance trends savvy digital marketers, IT leaders and enterprise executives should consider embracing if they haven’t already.
1.Data Governance Leadership
Top-performing companies are 64% more likely to appoint a Chief Data Officer and enjoy 10%+ revenue growth for doing so, compared to low performers that net less than 4% revenue growth, according to 2015 research from Forrester.
2.Data Quality Technology
MarTech continues to boom, and there is no shortage of proven technology offerings available. But integrating multiple data collection and tag management tools under one umbrella is still easier said than done. Data quality solutions that provide comprehensive data quality audits and control keep a finger on the pulse of your technologies’ performance, helping to ensure the ROI of your MarTech investment.
3.Making Big Data into Transactional Data
Clearly, it is not enough to simply have Big Data in your organization’s back pocket. It’s about making the data usable, accurate and addressable for all stakeholders in your company. This requires putting business process governance policies in place that specify how data is collected, synthesized and organized--and these processes must be evaluated and updated on an ongoing basis.
4.Software Automation
Savvy enterprises are quick to embrace software automation to configure data governance processes and requirements, freeing the talent on their teams for more strategic initiatives.
5.Increased Security Emphasis
More data collection vendors and more technologies on your digital properties also mean more attention must be paid to keeping control of your data and preventing data loss. Companies that do not make protecting Big Data from theft, misuse and abuse via third-party technologies a priority stand to lose customer relationships.