The volume of information created at any given moment, as we have previously discussed, continues to grow at a fast pace, and everything points to an ever-increasing speed. In the time it takes you to read this article, more than 10 million searches will have been made on Google, about a million new tweets will have been published, and almost one thousand hours of video will have been uploaded to YouTube. According to IBM, more than 2.5 quintillion bytes are produced every single day, the equivalent of 2.5 million terabytes. As we have mentioned before, with “just” 10 terabytes (or 10,240 gigabytes), it is possible to store the contents of every book in the United States’ Library of Congress, the largest library in the world.
The financial opportunity stored in the information generated by the multitude of business segments is measured in the hundreds of billions of dollars, and virtually all sectors — private and public — that produce data can benefit from the analysis performed by business intelligence systems.
The recommendations you receive when shopping at Amazon would not be possible without techniques for analyzing extraordinary amounts of data. Computer systems cross-reference your history with the histories of millions of other consumers, taking into account factors such as demographics, season, location, and previous searches. Something similar happens when you choose what to watch on Netflix: the platform suggests movies, series, and documentaries that are likely to interest you based on your choices, your ratings, and your viewing history on the platform.
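To make the idea of cross-referencing histories concrete, here is a minimal sketch in Python. It is not how Amazon or Netflix actually work; it only assumes that shoppers whose purchase histories overlap with yours are a useful source of suggestions. The function name `recommend` and the sample data are hypothetical.

```python
from collections import Counter

def recommend(target_history, other_histories, top_n=3):
    """Suggest items bought by shoppers whose histories overlap with the target's.

    Illustrative assumption: each item is weighted by how many items the
    other shopper has in common with the target shopper.
    """
    target = set(target_history)
    scores = Counter()
    for history in other_histories:
        items = set(history)
        overlap = len(target & items)
        if overlap == 0:
            continue  # nothing in common, so this shopper tells us nothing
        for item in items - target:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]

# Hypothetical purchase histories of other shoppers
histories = [
    ["tent", "sleeping bag", "camp stove"],
    ["tent", "camp stove", "headlamp"],
    ["novel", "bookmark"],
]
suggestions = recommend(["tent", "sleeping bag"], histories)
print(suggestions)
```

Real systems add many more signals (the demographics, season, and location mentioned above), but the core intuition of borrowing from similar histories is the same.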
But on Christmas Eve 2013, Amazon took things a step further. With patent number US 8,615,473, titled “Method and system for anticipatory package shipping”, the company indicated its plans to ship consumer goods to its clients before they even place their orders. The use of data combined with smart algorithms enables the company to predict, with some degree of success, what a consumer’s next order will be.
Google also uses the searches performed worldwide to try to “guess” what we are looking for: using the first few letters of your search, its algorithms look through the most frequent searches and suggest the query you may be typing. (Google tries to avoid suggestions with negative or defamatory content, although this is not always possible.) The data provided by users themselves is a powerful asset for Google and, to a large extent, is what makes it one of the world’s most valuable companies, with a market value of about one trillion dollars in late 2020.
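The principle behind this kind of suggestion can be sketched in a few lines of Python. This is a toy illustration, not Google's implementation: it assumes a small in-memory log of past searches and simply ranks the ones that start with the typed letters by how often they occur. The function name `suggest` and the sample log are hypothetical.

```python
from collections import Counter

def suggest(prefix, query_log, top_n=3):
    """Return the most frequent past queries that start with the given prefix."""
    counts = Counter(q.lower() for q in query_log)
    prefix = prefix.lower()
    matches = [(q, n) for q, n in counts.items() if q.startswith(prefix)]
    # Most frequent first; ties are broken alphabetically for a stable order
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [q for q, _ in matches[:top_n]]

# Hypothetical log of past searches
log = [
    "weather today", "weather tomorrow", "weather today", "web design",
    "weather radar", "weather today", "weather radar",
]
print(suggest("wea", log))
```

A production system would also need to filter out the negative or defamatory suggestions mentioned above, which this sketch does not attempt.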
In November 2008, scientists at Google partnered with the US Centers for Disease Control and Prevention (CDC) to publish an article in Nature entitled “Detecting influenza epidemics using search engine query data”. The work compared historical information on epidemics against the roughly 50 million search terms (both related and unrelated to influenza) appearing most often in the more than three billion searches carried out by users every day. Then, after testing nearly half a billion mathematical models that correlated the searches to epidemics, the system identified the 45 terms that best fit the data. Thus, based on the information provided by users through their searches, Google gained the ability to identify, with a high degree of precision, where a flu epidemic was occurring.
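The selection step, keeping only the search terms that best track the epidemic data, can be illustrated with a small Python sketch. This is not the method used in the Nature article: as a simplifying assumption, it ranks terms by the Pearson correlation between their weekly search counts and the reported flu cases. The function names and all numbers are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no signal
    return cov / sqrt(var_x * var_y)

def best_terms(term_series, flu_cases, top_n=2):
    """Rank search terms by how closely their counts follow the flu cases."""
    ranked = sorted(
        term_series,
        key=lambda term: pearson(term_series[term], flu_cases),
        reverse=True,
    )
    return ranked[:top_n]

# Hypothetical weekly flu cases and weekly search counts per term
flu_cases = [10, 20, 40, 80, 40, 20]
term_series = {
    "flu symptoms": [12, 22, 41, 79, 38, 21],
    "cough remedy": [30, 35, 50, 70, 55, 33],
    "football scores": [50, 48, 52, 49, 51, 50],
}
top = best_terms(term_series, flu_cases)
print(top)
```

Terms unrelated to influenza, such as the sports query here, show little correlation with the case counts and drop out of the ranking.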
In April 2020, during the COVID-19 pandemic, Google and Apple partnered to develop contact tracing technology that lets phones anonymously keep track of other phones nearby. If the owner of one of those phones was diagnosed with the virus, alerts would be sent to the people who had been nearby.
All these examples speak to the importance of data in our society, and like every valuable asset, it has to be protected and used wisely. Since the beginning of the 2010s, countries have been writing legislation on the matter, and arguably the most famous piece is the General Data Protection Regulation (GDPR). Adopted in 2016 by the European Parliament and the Council of the European Union, its fundamental objective is to make sure individuals in the European Union and in the European Economic Area have control over their personal data.
Striking a balance between making sure our data serves us, which can only be achieved by sharing it, and preserving our right to privacy is a complex issue, and one that societies around the world will inevitably have to discuss at every level.