All Identity Resolution Is Probabilistic – And That’s Okay If It’s Done Right

By Charlie Swift – EVP, Head of Marketing & Account Management, Adstra

If Meta’s recent spate of bad news hasn’t given you pause about the future of identity and marketing, it should. Mark Zuckerberg put a fine point on it: “With Apple’s iOS changes and new regulation in Europe, there’s a clear trend where less data is available to deliver personalized ads.”

Boom! What we’re witnessing is nothing short of an existential crisis for a digital industry that lives and dies by understanding who’s who online.

The future, it turns out, now belongs to new adtech solutions that trade in identity resolution, which helps marketers understand who they are interacting with from first-party data, without cookies and universal identifiers, using deterministic and probabilistic data models to fill in the gaps.

Let’s break that down. Deterministic data is identity information that consumers give a company and is believed to be true — information like a person’s email address, first and last name or a phone number. Probabilistic data is based on probabilities — information about a person’s operating system, IP address, the topic they are browsing, or lookalike audience information that can be assembled to create an approximate profile.

But here’s the thing: There is no such thing as deterministic data.

All identity-based targeting (including what we think of as deterministic) is really just probabilistic. That’s because most identity resolution today happens through a process in which a brand sends its first-party data to an identity provider. That data includes names, emails or phone numbers, as well as data about actions taken on a website, purchase transactions, pages visited, and other behavioral data, so first-party data is already a mix of deterministic and probabilistic. It then goes into the provider’s black box, where proprietary algorithms generate a match.

But those algorithms are also fundamentally probabilistic. And let’s face it, people are fallible. Lots of first-party data is full of bad “auto” form fill info, burner email addresses, and fat-finger mistakes. If used literally, deterministic(ish) algorithms will generally return a match rate of a mere 10%.

Inside the identity provider’s algorithmic black box, however, it’s possible to toggle the controls to generate the desired outcome (like a higher match rate). And that’s exactly what happens. Identity data providers dial up the probabilistic model so that 70% to 80% of the records return a match. And the problem is you don’t know how your provider is making those probabilistic choices for you.
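To make the "toggle the controls" idea concrete, here is a minimal Python sketch of how a single similarity threshold drives the match rate. It is an illustration only, not any provider's actual algorithm: the similarity score comes from Python's standard `difflib.SequenceMatcher`, and the function names and sample records are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two lightly normalized strings."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_rate(pairs, threshold: float) -> float:
    """Share of candidate pairs whose similarity meets or exceeds the threshold."""
    if not pairs:
        return 0.0
    matched = sum(1 for a, b in pairs if similarity(a, b) >= threshold)
    return matched / len(pairs)


# Hypothetical candidate pairs: (brand record, provider record).
candidate_pairs = [
    ("jane.doe@example.com", "jane.doe@example.com"),  # exact match
    ("jane.doe@example.com", "jane.do@example.com"),   # likely typo
    ("j.doe@example.com", "john.doe@example.org"),     # ambiguous
    ("sam@example.com", "pat@example.net"),            # different people
]

# Lowering the threshold can only keep or raise the match rate;
# it says nothing about whether the extra matches are correct.
for threshold in (1.0, 0.9, 0.7, 0.5):
    rate = match_rate(candidate_pairs, threshold)
    print(f"threshold={threshold:.1f}  match rate={rate:.0%}")
```

A threshold of 1.0 corresponds to the literal, deterministic(ish) match. Every step below that trades certainty for coverage, which is the choice a black box makes on your behalf.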

If you happen to be sophisticated about this stuff, you can ask to see the confidence interval applied to matches where fewer than three characters differ between two input records. They won’t show that to you and they won’t give you a straight answer either. That’s because in the quest for high match rates, they have loosened the guitar strings to a point where you get a very high match rate and very low accuracy.
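For readers who want to see what "fewer than three characters differ" can look like in practice, the sketch below counts character differences with the classic Levenshtein edit distance and lists the near-miss pairs. This is one reasonable way to measure the difference, not necessarily how a given provider does it; `edit_distance`, `near_matches`, and the `max_diff` parameter are names chosen for this illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep if equal)
            ))
        prev = curr
    return prev[-1]


def near_matches(pairs, max_diff: int = 3):
    """Return (a, b, distance) for pairs that differ, but by fewer than max_diff characters."""
    flagged = []
    for a, b in pairs:
        d = edit_distance(a, b)
        if 0 < d < max_diff:
            flagged.append((a, b, d))
    return flagged
```

These near-miss pairs are the ones worth auditing by hand. They may be one person who mistyped an address, or two different people whose records happen to look alike.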

It’s the dirty little secret of the identity resolution business.

Marketers don’t have to settle for this. You should be able to see how data matching happens, and what happens when you change the parameters. Even if you don’t want to change them, you should at least know what the choices are. You might learn that for an 80% match rate, they’re associating people with data that have nothing to do with each other. And that can be a problem when you need a high degree of accuracy. The probabilistic match style for a credit card offer, for example, is different from the one for selling a direct-to-consumer subscription service for socks.
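One way to see what changing the parameters does is to score a small, hand-verified sample and report match rate and accuracy side by side at each threshold. The sketch below assumes you already have such a sample, where each pair carries a match score and a known true/false label. The `evaluate` function and the sample numbers are hypothetical, and "accuracy" here means precision: the share of accepted matches that are actually correct.

```python
def evaluate(scored_pairs, threshold: float):
    """Return (match_rate, precision) for a labeled sample at a given threshold.

    scored_pairs: iterable of (score, is_true_match) tuples.
    precision is None when nothing is accepted at this threshold.
    """
    scored_pairs = list(scored_pairs)
    accepted = [is_true for score, is_true in scored_pairs if score >= threshold]
    match_rate = len(accepted) / len(scored_pairs) if scored_pairs else 0.0
    precision = sum(accepted) / len(accepted) if accepted else None
    return match_rate, precision


# Hypothetical hand-verified sample: (match score, is it really the same person?)
labeled_sample = [
    (0.95, True),
    (0.90, True),
    (0.60, False),
    (0.30, False),
]

for threshold in (0.8, 0.5, 0.2):
    rate, precision = evaluate(labeled_sample, threshold)
    print(f"threshold={threshold}: match rate={rate:.0%}, precision={precision}")
```

A credit card offer might justify a strict threshold and a lower match rate, while a sock subscription might tolerate a looser one. The point is that the trade-off is visible and the choice is yours.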

The truth is that ID resolution is possible without black boxes if the capabilities are available behind the brand’s own firewall. And once you have that, you have complete control over the algorithm and how loose you want your confidence intervals to be.

Just know that all identity-based targeting is probabilistic. And the sooner you control it from top to bottom, the better your targeting will be.
