3 Barriers To Data Quality And How To Solve For Them
Today, 80% of marketers say audience data is critical to their digital advertising efforts. The same research noted that 53% have increased their annual spend on data-driven ads. As audience data’s importance continues to grow for media buyers and sellers alike, concerns over quality have dramatically increased. In fact, 84% of CEOs worry about the quality of data within their organizations.
As a result, the industry is demanding more transparency and accuracy from the audience data it uses to transact and target media. According to Jason Downie, vice president and general manager of data solutions for Lotame, here are three of the key problems holding back data quality and how CMOs can solve for them.
1. Confusion: Lack of a Definition of What “Quality” Means
Do marketers want high-quality data? Without hesitation, CMOs will say yes. However, they quickly change the subject to the kind of audiences they want to reach and what they want to pay. What they may not realize is that their ability to use data to reach those audiences hinges on how we define quality.
So what’s really meant by “data quality”? As it turns out, quality is an incredibly subjective word. That means everyone uses it differently. In the absence of a clear definition, marketers are plagued by several questions:
- Labeling: What are the ingredients? What’s “in” this data set?
- Provenance: Where is the data from?
- Efficacy: Does it work?
- Accuracy: Who is it?
Now, accuracy is the most common way CMOs use the word “quality.” Simply put, marketers are asking whether the information contained within the data files is correct. If the data points indicate the target audience is males, ages 18 to 34, are you actually reaching men between those ages? Not surprisingly, this speaks to one of the longest-running issues in data-driven marketing.
According to Downie,
“Quality inputs return quality results. Therefore, it’s imperative that the data marketers use is accurate. Yet, accuracy needs to be considered relative to both targeting and measurement. It’s not one or the other.”
2. Data Collection: Logistical and Operational Challenges
Audience data accuracy among data providers is an industrywide challenge, and several barriers stand in the way. First, data onboarding is the preparation of raw data from multiple, disparate sources that are both internal and external. It’s a complex process that involves huge volumes of data. As a result, onboarding is prone to errors. These errors include duplicate data and inconsistencies across formats and standards.
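To make the onboarding problem concrete, here is a minimal sketch of how duplicate records with inconsistent formatting might be caught. The field names (user_id, email, source) and the sample records are hypothetical, and a real onboarding pipeline would handle far more formats and volume than this.

```python
def normalize_record(record):
    """Return a copy of the record with the ID and email trimmed and lowercased."""
    return {
        "user_id": record["user_id"].strip().lower(),
        "email": record["email"].strip().lower(),
        "source": record["source"],
    }


def deduplicate(records):
    """Keep the first record seen for each normalized (user_id, email) pair."""
    seen = set()
    unique = []
    for record in records:
        clean = normalize_record(record)
        key = (clean["user_id"], clean["email"])
        if key not in seen:
            seen.add(key)
            unique.append(clean)
    return unique


# Hypothetical raw data: the first two rows are the same person,
# recorded with different casing and stray whitespace.
raw_records = [
    {"user_id": "A123", "email": "Pat@Example.com ", "source": "crm"},
    {"user_id": "a123 ", "email": "pat@example.com", "source": "web"},
    {"user_id": "B456", "email": "sam@example.com", "source": "crm"},
]

clean_records = deduplicate(raw_records)
print(len(raw_records), "raw records,", len(clean_records), "after deduplication")
```

Normalizing before comparing matters here: without it, the first two rows would look like two different people and inflate the audience count.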
Second, the impulse to scale has created a situation where data providers are adding attributes to as many cookies and mobile IDs as possible. While none of this may be occurring with dishonest intentions in mind, some providers may choose to expand what types of data “fit” into a certain segment. After all, they want to sell more data. However, the end result is less accurate data. Cookie stuffing might help sell audiences, but it doesn’t credibly support advertisers.
Third, non-human traffic (NHT) is a problem that costs nearly $10 billion annually. It creates negative outcomes across the digital ecosystem. For publishers, NHT inflates first-party audience data. The results are wasted bandwidth, content piracy, fake traffic, and skewed site analytics. Also, NHT damages marketers and agencies by frequently clicking and engaging with digital advertising, ultimately decreasing campaign effectiveness.
Additional Challenges Related to Data Collection
Fourth, wrongly labeling audience segments is a problem for the digital data ecosystem. For example, when a person reads an article about Jay-Z that mentions an Audi, it doesn’t necessarily mean that the reader is interested in buying an Audi. Nor does it mean he or she is interested in automobiles as a category. Interpretations can vary. Therefore, these mischaracterizations are possible from any data provider. According to Downie, the best way to combat them is “for the advertiser to be well-informed on the categorization of data segments they use. This way, they can remove segments that aren’t in line with the campaign targets.”
Finally, there is the challenge of shared PCs. A shared PC in a household can make a 100% match rate (the percentage of users in an audience segment that a DSP will recognize) impossible. Therefore, when an advertiser buys 10 million impressions in a specific demo, he or she should expect to match at a rate that is slightly less than 100%. Yet it will still be significantly more than what they’re likely to find on television.
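The match rate arithmetic can be sketched in a few lines. The figures below are purely illustrative assumptions (a segment of 10 million users, of which a DSP recognizes 9.2 million); they are not numbers reported by Downie or any provider.

```python
def match_rate(recognized_users, segment_size):
    """Share of users in an audience segment that the DSP recognizes."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return recognized_users / segment_size


# Hypothetical figures, for illustration only.
segment_size = 10_000_000
recognized_users = 9_200_000

rate = match_rate(recognized_users, segment_size)
print(f"Match rate: {rate:.1%}")
```

Tracking this ratio per segment gives a buyer a simple number to raise in conversations with a data provider.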
3. Benchmarks: Limited Accuracy Standards
Measuring and benchmarking data quality and accuracy in digital continues to be a hurdle. This challenge is driven by fragmentation across vendors and a lack of standards by key third-party auditors like Nielsen and comScore. What’s more, Nielsen and comScore only benchmark demographic data. There are currently no benchmarks or industry standards when it comes to evaluating interest and intent data.
The solution, according to Downie, is transparency.
“To address data quality and accuracy, CMOs need to have candid conversations with data providers. This includes asking about the preciseness of their audiences in terms of the degree of curation. Also, it’s critical to determine the trustworthiness of the sources.”
Additionally, this greater detail from partners should be about what they’re selling relative to the source, offline data versus online, offline-to-online matching, and qualifying segmentation. This should be the standard information available from any worthwhile provider.
Other Data Questions CMOs Should Be Asking
However, CMOs and their teams can dig deeper. Downie lists additional questions every marketer should ask when considering data accuracy with existing or potential partners, as well as the optimum responses:
- How human is your data? The best providers have identified and eliminated bots.
- How are you testing for cookie stuffing? Providers need to constantly evaluate whether an individual seller is adding too many behavioral attributes to profiles.
- How predictive is the data? If the data helps predict multiple different profile attributes, it’s higher quality.
- How on-target is the audience? Providers must detail the specifics about their audience segments.
- How are audiences curated? Curation is key. Providers should be curating by time/recency, user confirmation, and predictiveness.
- What about performance? Buyers need to get information on viewability results, demographic results, and more.
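As one illustration of the cookie stuffing question above, a provider could compare how many behavioral attributes each seller attaches to a profile on average. This is only a sketch under stated assumptions: the profile structure, the seller names, and the threshold of five attributes are all hypothetical, and real evaluations would likely use more than a single average.

```python
from statistics import mean


def flag_overstuffed_sellers(profiles, max_avg_attributes):
    """Return sellers whose average attribute count per profile exceeds the limit."""
    counts_by_seller = {}
    for profile in profiles:
        counts = counts_by_seller.setdefault(profile["seller"], [])
        counts.append(len(profile["attributes"]))
    return sorted(
        seller
        for seller, counts in counts_by_seller.items()
        if mean(counts) > max_avg_attributes
    )


# Hypothetical profiles: seller_b attaches far more attributes per profile.
profiles = [
    {"seller": "seller_a", "attributes": ["sports", "travel"]},
    {"seller": "seller_a", "attributes": ["autos", "news", "music"]},
    {"seller": "seller_b", "attributes": ["a", "b", "c", "d", "e", "f", "g", "h"]},
    {"seller": "seller_b", "attributes": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]},
]

# The threshold of 5 is an arbitrary assumption for this example.
flagged = flag_overstuffed_sellers(profiles, max_avg_attributes=5)
print("Sellers to review:", flagged)
```

A flag here is a prompt for a closer look, not proof of stuffing; some sellers may legitimately hold richer profiles.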
Delivering on data quality is — and will always be — an ongoing process. From bad onboarding to shared PCs and NHT, there’s no simple solution. However, when CMOs command more transparency around the data they’re buying, they can be more confident about data accuracy.
This article originally appeared on Forbes.