Data Brokers

Data brokers: data vampires that act silently, speculate with our information and manipulate our lives.

Do you know companies like Acxiom, Criteo, Equifax, Experian, Oracle, Quantcast or Tapad? I'm sure many of you don't, but they know us very well. They are among the main reasons big organizations can use profiles to make decisions that seriously affect our lives and those of our families.

What happens when we ask for our profile information from the biggest vampires in the data landscape that spy on us day and night: the data brokers?

Privacy International filed complaints in November 2018 against several companies: Acxiom, Criteo, Equifax, Experian, Oracle, Quantcast and Tapad.

A hint: if you go through the list of data vampires shown in a cookie banner before accepting the cookies, you will find these data brokers there.

The companies on which Privacy International focused its formal complaints are data brokers, credit reference agencies (they track your buying power) and advertising companies.

The complaints were based on three things:

  1. The 50 profile access requests submitted by Privacy International,
  2. An analysis of these companies' privacy policies, and
  3. What the companies themselves claim about their marketing activity.

A person from Privacy International received a response to her access request to Quantcast. They gave her data from her browsing history linked to a unique cookie identifier (ID). She also obtained certain inferred data. Which data?

Her gender, number of children, education and income. The response also included a great deal of data from other data brokers, such as MasterCard or Experian.

All this information was organized into segments which, in turn, were linked to a unique identifier, a cookie ID, which was itself linked to a browser.

Who requested access to this information? A young, single woman with no children. Her inferred profile was that of a 50-year-old married man with children and high purchasing power. Nothing to do with her.
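To make the structure concrete, here is a minimal sketch of segments attached to a cookie ID rather than to a person. The class name, field names and every value are hypothetical and only illustrate the case described above; this is not Quantcast's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class BrowserProfile:
    """Everything is keyed by a cookie ID, i.e. by a browser, not by a person."""
    cookie_id: str
    browsing_history: list[str] = field(default_factory=list)
    segments: dict[str, str] = field(default_factory=dict)  # inferred, never declared


# Hypothetical record shaped like the case above; all values are made up.
profile = BrowserProfile(
    cookie_id="cookie-123",
    browsing_history=["news.example/politics", "shop.example/watches"],
    segments={
        "gender": "male",
        "age": "50",
        "marital_status": "married",
        "children": "yes",
        "purchasing_power": "high",
    },
)

# What the person behind the browser would say about herself.
declared = {"gender": "female", "marital_status": "single", "children": "no"}


def mismatched_segments(profile, declared):
    """Return inferred segments that contradict what the person declares."""
    return {
        name: (inferred, declared[name])
        for name, inferred in profile.segments.items()
        if name in declared and declared[name] != inferred
    }


print(mismatched_segments(profile, declared))
```

Because the key is the browser, anyone who shares that browser, or whose browsing resembles someone else's, can end up with a profile that has nothing to do with them.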

Privacy International based its complaint on two main grounds:

  1. A recurrent abuse of legitimate interest as a legal basis for data processing, and
  2. Profiling. In many cases, these companies have not complied with basic data protection principles: transparency, fairness, purpose limitation, data minimization, accuracy and the need to have a legal basis.

The reason they focus on data brokers is that WP29, in its guidelines on automated individual decision-making and profiling for the purposes of Regulation 2016/679 (wp251rev.01), mentions data brokers and defines them as follows:

[Image: WP29 definition of data brokers]

From a legal point of view, what could be the next steps?

FIRST. An enforcement action that sends a very clear signal to companies, because the law is being applied very unevenly. There is a large compliance deficit with the GDPR.

SECOND. We need precedents that clarify ambiguities in the law. We hope that this is in the public interest and that this happens in a way that protects people. In short, we need more case law, more legal cases.

Privacy International has only scratched the surface, as this complaint is based on access requests and publicly available information. This is the least we can find. People should be encouraged to conduct similar investigations and also to file complaints.

THIRD. The scope and applicability of Article 22 GDPR (under which we can refuse to be subject to decisions based solely on automated processing, including profiling). For example, Facebook has said that its processing of our data does not fall under Article 22 GDPR.

Most information from data brokers is in turn sold to other companies for online advertising. The picture is not simple. It is a VERY COMPLEX ecosystem.

There is nothing better than a well-structured infographic to help us understand what we are talking about.

The Polish foundation Panoptykon made this infographic, in which you can see what happens to every user of any online service. It is overwhelming:

Source: “Who really targets you”.

What is the first thing we can say about this reality? That every user of any internet service has a profile which, in turn, is a layered structure, as we see in this infographic.

The first layer is made up of the information we share. Examples are user name, location, friends on social networks, uploaded photos, photo metadata, tests submitted, fingerprints, card number (e-commerce) or hashtags used. These are only examples.

The second layer is what you tell them through your behavior: the geographic area in which you move, ads seen, interactions you have ignored, how often you connect to the internet, purchase patterns (routines), mouse-movement tracking (yes, you read that right), typing speed, operating system, IP addresses, unique identifiers, SMS history, phone call history, and much more.

The third layer is what the algorithm assumes or infers about you (automated decisions and profiles): ethnic affinity, where you live, a new job, IQ level, whether you are sick or have overcome an illness, whether you want to date, whether you are a compulsive shopper, whether you have moved, whether you are a working mother, whether you like junk food or pornography, whether you like to get a manicure or pedicure… and so on.

Again, these are algorithmic ASSUMPTIONS.

We must be clear about one thing: this structure is made up of data that we do not control. That is, we cannot even decide that we DO NOT WANT to share the data of the first layer.

By the time we accept the privacy and cookie policies, it is too late.

What data can we obtain if we demand access to our data? Only the first layer of this infographic. And forget, for now, about getting data from the second layer, which is already an analysis of our behavior, and even less from the third layer, which is the inferred data.

Inferred data is data that we have NEVER PROVIDED; it is created by algorithms from the first two layers.

And we may totally disagree with it. These inferences can be discriminatory or very sensitive: after reading online content about some disease, we can be labeled in our profile as a person suffering from that disease.
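The layering described above can be sketched in a few lines. Everything here is hypothetical: the field names, the toy inference rule and the `access_request` function are assumptions made for illustration, not how any real broker works.

```python
# Hypothetical three-layer profile; all values are made up.
profile = {
    "shared": {"user_name": "ana", "location": "Madrid"},
    "behavioral": {
        "pages_read": ["health.example/diabetes-symptoms"],
        "typing_speed_wpm": 62,
    },
    "inferred": {},
}


def infer(profile):
    """Toy rule: reading about an illness labels the reader as a sufferer."""
    inferred = {}
    pages = profile["behavioral"]["pages_read"]
    if any("diabetes" in page for page in pages):
        inferred["health_condition"] = "diabetes"
    return inferred


def access_request(profile):
    """What the user gets back in this sketch: the first layer only."""
    return {"shared": dict(profile["shared"])}


# The third layer is computed from the other two; the user never typed it in.
profile["inferred"] = infer(profile)

print(profile["inferred"])
print(access_request(profile))
```

The label in the third layer exists only because of one page visit, and it never shows up in what the access request returns.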

What is the response of these data brokers when we want to exercise our right to access our information? They usually say: "We don't know who you are." As a user, I don't know what my ID is in their database, specifically the "Advertising ID".

If I don't know this number (which they will never give us), I cannot access my information. Access is very difficult because of the way the system is designed.

Let’s talk about another very good source of information, something that fascinates me and that I think many of you are going to love too: WIKIDATA. It is made with a collaborative model. Come in, take a look and if you can add some valuable information, don’t hesitate.

What is Wikidata for? It provides information about technology companies and how they use data: who is responsible for processing the data, the email address of the DPO, and the data the companies are using.

Let's try it out with UBER. This is their Wikidata page; let's see what data they use about their users:


Mapping this type of company is very useful for other people, because they can build on it and reduce the cost of developing privacy-preserving services, to give one example of use.
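For readers who prefer to look a company up programmatically, here is a minimal sketch using the public Wikidata search endpoint (`action=wbsearchentities`). The helper names and the User-Agent string are my own; the sketch only finds the entity for a company and does not assume which properties describe its data practices.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"


def search_url(name, language="en"):
    """Build the URL of a Wikidata entity search for a company name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def search_entities(name):
    """Run the search (needs network access) and return (id, description) pairs."""
    # Hypothetical User-Agent; identify your own script when querying Wikimedia.
    request = Request(search_url(name), headers={"User-Agent": "data-broker-notes/0.1"})
    with urlopen(request) as response:
        payload = json.load(response)
    return [(hit["id"], hit.get("description", "")) for hit in payload.get("search", [])]


print(search_url("Uber"))
```

Calling `search_entities("Uber")` returns candidate entity IDs, each of which can then be opened on wikidata.org to read or extend what is recorded about the company.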

Now I want to show another infographic, made by Scott Brinker, that maps the complexity of digital marketing.

The infographic classifies 6,242 unique marketing technology providers into 49 categories, which are in turn grouped into 6 large groups:

[Image: marketing technology landscape]

The 6 big groups are:

  1. Advertising and Promotion
  2. Content and Experience. The technology that enables the customer experience, interacting directly with the end user.
  3. Social and Relationships
  4. Commerce and Sales
  5. Data
  6. Management

And the five most important categories:

  1. Sales automation and intelligence (220)
  2. Marketing and monitoring of social networks (186)
  3. Display and programmatic advertising (180)
  4. Marketing automation and campaign/lead management (161)
  5. Content Marketing (160)

What does this impressive infographic tell us? The number of providers that capture and manage our information and that revolve around online advertising.

I am impressed by the number: 6,242 UNIQUE MARKETING TECHNOLOGY PROVIDERS.

But how does the process of collecting data to build a user profile from behavior on the Internet (the first two layers of information) actually work?

Every time we visit a website or use an application, every "empty ad unit", that is, every advertising slot, sends information about what we are viewing or searching for to another organization called an "Ad Exchange".

They also send your location, as your zip code or your GPS coordinates. This happens every time you visit a website or use an application. How many websites do you visit per day? How many times do you use an application per day? That is how many times your information is sent.
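As a rough illustration of what such a message can contain, here is a minimal sketch of a request sent to an ad exchange. The field names are loosely modeled on the IAB OpenRTB format (`site.page`, `device.geo`, `device.ifa`, `user.id`), but the structure is heavily simplified and every value is made up; it is not a complete or valid request.

```python
import json


def build_bid_request(page_url, zip_code=None, lat=None, lon=None,
                      user_id=None, ad_id=None):
    """Assemble a simplified, OpenRTB-like description of one ad slot view."""
    geo = {}
    if zip_code is not None:
        geo["zip"] = zip_code
    if lat is not None and lon is not None:
        geo["lat"], geo["lon"] = lat, lon
    return {
        "id": "request-1",             # made-up request identifier
        "imp": [{"id": "1"}],          # the empty ad unit being offered
        "site": {"page": page_url},    # what the user is looking at
        "device": {"geo": geo, "ifa": ad_id},  # location and advertising ID
        "user": {"id": user_id},       # the exchange's identifier for the user
    }


request = build_bid_request(
    "https://news.example/health/article",
    zip_code="28001",
    user_id="made-up-user-id",
    ad_id="made-up-advertising-id",
)
print(json.dumps(request, indent=2))
```

One such message is produced per ad slot, which is why the count grows so quickly over a day of ordinary browsing.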

The whole industry is governed by two documents: one from the IAB and the other from Google.

Through these applications, very specific information is obtained to re-identify users and their every move on the Internet.

The New Economics Foundation estimates that in the UK the data of an average user is sent 164 times a day to any number of companies.

The interconnection of the actions we perform on the Internet (checking email, search history, use of apps, visits to websites, use of an online service) links the data derived from them and thus creates a complete picture of the user over time.

Here, metadata and Unique Identifiers play a very important role.

Metadata is detailed information about the data that helps to accurately describe its properties and the relationships of some data with others.

The California Consumer Privacy Act of 2018 contains a very comprehensive definition of unique identifiers. In essence, they are persistent identifiers that can be used to recognize a consumer, a family, or a device linked to a consumer or family, over time and across different services, including a device identifier; an Internet Protocol address; cookies, pixel tags, mobile ad identifiers; customer number, unique pseudonym; telephone numbers, or other forms of persistent or probabilistic identifiers that can be used to identify a particular consumer or device.
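The role of unique identifiers in linking activity can be shown with a small sketch. It groups events from different services whenever they share any identifier, using a simple union-find structure. The events, identifier formats and function name are hypothetical, and real identity matching is far more elaborate (and often probabilistic).

```python
from collections import defaultdict

# Made-up events; each carries at least one identifier seen by a service.
events = [
    {"service": "email", "ids": {"cookie:abc", "ip:203.0.113.7"}},
    {"service": "search", "ids": {"cookie:abc"}},
    {"service": "app", "ids": {"ad_id:xyz", "ip:203.0.113.7"}},
    {"service": "shop", "ids": {"cookie:other"}},
]


def link_events(events):
    """Group services whose events share, directly or indirectly, an identifier."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Identifiers that appear in the same event belong to the same profile.
    for event in events:
        ids = sorted(event["ids"])
        for other in ids[1:]:
            union(ids[0], other)

    # Collect the services behind each linked set of identifiers.
    groups = defaultdict(list)
    for event in events:
        root = find(sorted(event["ids"])[0])
        groups[root].append(event["service"])
    return sorted(sorted(services) for services in groups.values())


print(link_events(events))
```

In this example the email, search and app activity collapse into one profile even though no single identifier appears in all three events: the cookie links email to search, and the IP address links email to the app.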

All this information, DAY BY DAY, completes our profile, segmented in databases feeding an AI system that will predict our tastes, our health, our love life, our sex life, our mental health, our political opinions, our religion…

This information is marketed and sold to large organizations that, in turn, will make decisions that seriously affect our lives. Information we cannot access. Information that is inferred, or assumed.

This is how people are segmented by profiles based on their behavior.

Who informed us of this Machiavellian plan? NOBODY.

Why don’t governments protect us? Because they are part of this Machiavellian plan.

It’s up to us to react.

As always, thank you very much for reading.
