Every online purchase and every "like" on Facebook feeds our identity and private life into a gigantic pool of data. Yes, there is huge potential there for humanity. But there are also a number of threats we need to be aware of. Second in a series of ten texts about the risks of the "Digital Revolution".

- When it starts working, it will no longer be free.

- Bound, yes, by an obligation to the order he himself has instituted, and therefore free.

- For yes, the dialectic of freedom is unfathomable.

Thomas Mann, Doctor Faustus

In 2006, Google bought YouTube for a whopping $1.65 billion. At the time, this was considered excessive: the online video platform was hardly worth that much [1]. What quickly became apparent was that Google's interest was not necessarily in the content, but in what it said about users. The big tech companies - Google, Facebook, Microsoft, etc. - simultaneously have access to data from platforms as diverse as social media, video sharing and direct messaging. Google, in addition to the products associated with its name (Chrome, Gmail, Photos, Maps) and YouTube, also owns reCAPTCHA, which forces us to prove to robots that we are not robots. Those who avoid Facebook but use Instagram or WhatsApp may not know that all three belong to the same company. Microsoft, in addition to Skype, owns LinkedIn and Hotmail. All these companies relentlessly buy countless others that collect user data or develop Artificial Intelligence technologies. They can thus offer products that recognise our voice (like Siri or Alexa), identify who is who in a photo and predict what we will want to buy next week.

Why so much investment in platforms we don't pay for [2]?

The business model is the monetisation of our data, used for targeted advertising and recommendation systems driven by machine-learning algorithms. Think of the supermarket points card you may have in your wallet. Why are they "offering" discounts to those who use it? In fact, companies are buying your data to understand buying patterns. With well-trained algorithms, they can manage stock better, or gain insight into, and influence over, your choices.

These algorithms work in a conceptually simple way (a toy sketch follows the list):

1. they analyse thousands (or millions) of profiles of those who shop in their supermarkets every day;

2. they discover "shopping patterns" - those who buy size 0 nappies in January will buy first porridge from April;

3. they try to identify similarities and predict behaviour, "knowing" not only who will buy baby food but also which brands they will prefer.
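
Step 2, pattern discovery, can be made concrete in a few lines. The sketch below computes the simplest association-rule score, "confidence": of the customers who bought one item, what fraction also bought another? The baskets and item names are invented for illustration; real systems mine millions of transactions with far richer models.

```python
# Hypothetical transaction histories: each set holds the items one
# customer bought over a period. All data here is made up.
baskets = [
    {"size-0 nappies", "wipes", "baby formula"},
    {"size-0 nappies", "first porridge", "wipes"},
    {"first porridge", "baby formula"},
    {"size-0 nappies", "first porridge"},
    {"coffee", "bread"},
]

def confidence(antecedent: str, consequent: str) -> float:
    """Share of baskets containing `antecedent` that also contain
    `consequent`: how reliably one purchase predicts the other."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

# "Those who buy size-0 nappies also buy first porridge":
print(confidence("size-0 nappies", "first porridge"))  # 2 of 3 buyers -> 0.666...
```

Scaled up to millions of baskets and joined with individual profiles, scores like this are what let a chain predict not just that you will buy baby food, but which brand you will prefer.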

And these algorithms are getting better and better. It is already possible to extract information and patterns not only from structured data (e.g. supermarket shopping) but also from free text, images and videos. By simply showing the algorithm images labelled "cat" or "not cat", we teach it to distinguish cats (a toy sketch follows below). And who has classified the images that train these algorithms? Millions of humans who share photos of their pets, "tag" their friends and describe their holidays. We all know our data is being used, but we may not know to what extent this happens or the problems it can raise.
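
The supervised-learning loop behind the "cat or not cat" example fits in a few lines. In the sketch below the "images" are synthetic pixel vectors and the "cat" label follows an invented pixel pattern; everything is illustrative, standing in for the millions of human-labelled photos and the deep networks used in practice.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-ins for labelled photos: each "image" is a flat vector of pixel
# values, and the label says whether a human tagged it "cat". The data
# is synthetic, purely to show the loop: labelled examples in,
# a predictive model out.
rng = np.random.default_rng(0)
X = rng.random((500, 64))                      # 500 fake 8x8 "images"
y = (X[:, :8].mean(axis=1) > 0.5).astype(int)  # a toy "cat" pattern

model = LogisticRegression(max_iter=1000).fit(X, y)

new_image = rng.random((1, 64))
print("cat" if model.predict(new_image)[0] else "not a cat")
```

The model never needs to be told what a cat is; it only needs enough examples that humans have already labelled, which is exactly what our shared and tagged photos provide.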

First, although these processes imply consent, sharing is not always voluntary; we will talk more about this next week. Even when it is voluntary, it can have unexpected implications, and data shared in one context can be leveraged in unpredictable ways. Two examples: in 2012, the New York Times reported a case in which, precisely through the analysis of shopping profiles, a US retail chain generated predictions about the pregnancy of its customers [3]. One of these shops was visited by a father outraged that his teenage daughter had received promotions for baby clothes and cots; later, when the shop manager called to apologise, the man admitted, embarrassed, that his daughter was indeed pregnant. The retail chain knew before the family itself. And data companies like HiQ use information they find on the web to predict when a worker is about to quit, and sell that prediction to the employer.

Even when one is careful about what one publishes online, the care itself can be informative, as Shoshana Zuboff summarises in The Age of Surveillance Capitalism [4]:

"It's not what's in the sentences, but their length and complexity (...) not where you make plans with friends, but how you do it: a casual "see you later" or a precise place and time? Exclamation marks and adverb choices act as telling, and potentially damaging, signals."

Naturally, this level of insight has social consequences, and they can be profound: from data mining to behaviour prediction for commercial and political purposes, the technological giants steadily reduce uncertainty about what we will do, and the next step is to steer our actions. Games, subtle nudges, content designed to provoke a response; the erosion of free will in the name of security, efficiency and profit - we will return to these themes in other texts in this series.

This tension between freedom of choice and external imposition is not new. It is a constant in history and increasingly present in our interaction with modern technology. We cede control of our privacy and decisions in exchange for convenience and productivity: when we search for a book on Amazon and receive recommendations based on our profile and history; when we search on Google and, shortly after, see related ads on Facebook at the very moment they predict we are most receptive.

It is important to make clear that we are not condemned to a false dichotomy between the noble savage and the oppression of permanent surveillance. There are technological alternatives, such as distributed and federated systems, which remain under users' control and respect autonomy and privacy. The Mastodon social network, with millions of users, is perhaps the best example, but many others are taking shape [5]. Nevertheless, the issue is above all political, and the whole of society should be called upon to decide, democratically and properly informed, how to regulate this sector. That will be the subject of the next article.

10 months and 10 articles from the Data Science and Policy research group of Nova SBE

Over the next few months we will describe and discuss some of the possible dark sides of this revolution. We start by explaining so-called recommender systems (or what supermarket points are for) and discussing how current legislation does (not) protect us. In the following months, we will ask whether or not we should cover our phone's camera, examine the risks of health apps and learn how to identify fake news. We hope to offer information and practical guidance, but also to address questions of principle and values that help us think about the world not as it is today, but as we would like it to be. Because the future is decided now.

Published 16/5/2019 in Digital & Technology
