As companies expand the amount of data hoovered up from their subscribers, a common refrain meant to ease public concern is that consumers shouldn’t worry because this data is “anonymized.” However, time and time again, studies have highlighted that it’s not particularly difficult to tie these data sets to consumer identities, usually with only a few additional contextual clues. Whether we’re talking about cellular location data, GPS data, taxi data or NSA metadata, the basic fact is that these anonymous data sets aren’t really anonymous.
The latest in a long stream of such studies comes from MIT, where researchers explored (the actual study is paywalled) whether they could glean unique identities from “anonymous” user data using a handful of contextual clues. Studying the purportedly anonymous credit card transactions of 1.1 million users at 10,000 retail locations over a period of three months, the researchers found they could identify 90% of the users with just four additional data points, such as the dates and locations of four purchases. Using three clues, including more specific points like the exact price of a purchase, allowed them to identify 94% of the consumers. Intentionally making the data points less precise didn’t do much to protect consumer privacy