The Myth Behind Big Data and Privacy

We know what personalization means and the compromises it imposes on our individual privacy.

Or do we?

This is perhaps the most insidious myth among the technorati: In order for people to benefit from advanced and personalized technologies, they need to compromise their individual privacy.

This idea is remarkably pervasive and damaging, driving both consumers and businesses away from the opportunities of personalization and next-generation information services.

In this post, I’m going to introduce you to the myth and its underlying villain, Big Data. I’m also going to argue that innovation is a much better path forward than doing evil or doing nothing at all.

The Myth: Privacy is the cost of personalization

Although this idea is being propagated by countless people, Tom Cochran, CTO at Atlantic Media, summed it up succinctly:

“There is a zero-sum relationship between personalization and privacy. To get the personalized digital experience you want and have grown accustomed to, you have to accept the loss of your privacy.”

(That was last month. You have to wonder if Tom has reconsidered his position given the recent revelations about the U.S. government’s PRISM data collection system and the disclaimed involvement of technology giants like Google, Facebook, Apple, Microsoft, and Yahoo. Not that this problem is confined to the U.S. [1] [2].)

The myth is the direct association between personalization, as a category, and privacy invasion. In reality, the technical approach to user modelling largely determines how much personal data a personalized experience requires, and different approaches vary greatly in how much they use.

The villain isn’t personalization. The villain is how personalization is conventionally approached.

The Villain: Big Data

In order to personalize media or the user experience, you need data that describes individuals and their interests. We call this representation a user model. This is the data that powers personalization.

At scale, user models are much too expensive to build directly, so most companies derive them indirectly using large-scale statistical analyses of existing content or user activity (big data).

Big data is the conventional approach to personalization. This approach literally feeds on your personal information, and its pervasive use is the reason why personalization has acquired such a stink.

(For more information on the costs and limits of these big data approaches to personalization, see Where Big Data Fails…and Why.)

The Abdication: Do evil or do nothing?

Recently, I met with a company that has built one of the largest online networks in the world. We were discussing the opportunities in personalization: creating much more tailored and valuable information services for their consumers, and generating new business opportunities to grow their network.

To paraphrase the response: “We don’t do that data stuff. We delete the consumer data as quickly as we get it. We’re super sensitive about privacy.”

You have to admire that stance, and I suspect most of you hold that position as well. You value your customers and you’ve earned their trust. The last thing you want to do is anything creepy, like surveilling their activities or analyzing them like lab rats.

Faced with the unsavory choice of doing nothing or doing evil, most would choose the former. And this perception that we only have two choices is tragic.

Given the glut of online information and its exponential growth, delivering more personalized, automated user experiences is redemptive. But to move forward, we need to put these privacy-destroying practices behind us.

The Future: Innovation in Personalization

The future of personalization isn’t a stubborn maintenance of the status quo (do evil) or abdication (do nothing). The future is in innovation: legitimately new approaches to the opportunity of personalization.

Primal is an innovator in personalization and user modelling. Our technology uses a synthetic, computational approach that avoids the monumental costs and creepiness of big data. It isn’t magic, but it is very cool, and we’ve described the approach in over 100 patent filings.

Instead of analyzing large amounts of representative content to derive the user model, we synthesize it directly using only sparse data inputs from consumers. It’s a permissions-based model, it’s transparent, and it puts consumers in control.

If you’re interested in the future of personalization, you only need to be open to innovation and new ways of thinking about the possibilities.

Primal is available to companies of all sizes through our developer services.