Published 05 March 2021
Last night, I participated in another webinar on data. Multiple speakers used the term “data is the new oil.” It is not. I thought it might be useful to republish this piece from April 2019 on why data should not be compared to anything, because faulty analogies lead to flawed policy responses. Data is data. Not oil. Not, as the piece states, avocados or pears, nor oxygen or fuel. Nearly two years after my original post, it is long past time to get this right.
Repost: Data is the New Avocado?
At the UNCTAD E-Commerce Week in Geneva, Theis Søndergaard from Vivino made a brilliant observation. He noted that, while analogies can be extremely useful in trying to understand new things, the wrong analogy can be quite dangerous. It can lead, in fact, to exactly the wrong types of regulations or laws.
He was referring specifically to the saying that “data is the new oil.” The description has become commonplace, especially among government officials.
There is a reason why this description resonates. Oil helps lubricate the economy. Data, in a digital world, does something similar.
Oil needs to be processed. On its own, oil has little utility. Data, in the form of raw bits and pieces of information, likewise has limited use.
But as Søndergaard suggested, data is not like oil. For one thing, oil doesn’t go anywhere. It sits in the ground until it is brought up and used. It can be used all at once or just some at a time while the rest remains waiting.
Oil can be stored forever (or at least for a very long time) without significant problems.
Data, by contrast, is like an avocado. It has a clearly defined shelf-life. Data collected and used too early is pointless. Data harvested too late is often of no use at all.
Søndergaard’s company runs what is billed as the world’s largest online wine marketplace. The app allows more than 35 million users to rate wine products.
In his business, it does no good at all to rate a wine that is no longer available because all the stock is gone, or to recommend a wine to a customer who has already purchased something to drink for dinner. What matters is knowing what is needed in the moment when the information is “ripe.”
It is more like an avocado than anything resembling oil.
[If your avocado experience suggests the window for ripeness is actually rather long, perhaps data is the new pear? Eddie Izzard, a British comedian, has an extremely funny, 11-second, thoroughly family-inappropriate video clip on the microscopic window for a pear’s ripeness.]
The avocado or pear analogy doesn’t really work either though. Once you eat the avocado or pear, it is gone. While some types of data might fit this profile—a “one-time use” category like a customer buying a special bottle of wine for a 50th wedding anniversary—most occasions are probably not quite so special.
With an avocado or pear, if you own it, I cannot. I suppose we could come to some arrangement for half an avocado or a slice of an avocado for my toast, but this quickly gets too complicated for most. Your avocado is your own.
Data is, again, generally not quite like this.
It is also important to be more precise about what is “data.” Increasingly the term is tossed around to mean anything. If everything qualifies as data, then there is little utility in the term at all.
In some ways, the opposite problem also applies to data. In a quest to define the term carefully, it is easy to get too precise and slice “data” into such narrow categories that the term, again, could lose all useful meaning.
Consider the situation of Vivino (with the caveat at the outset that I have not had any extended discussions with the company at all and may be grossly misstating their business).
This is a tiny firm of fewer than a dozen employees, using a wide variety of “data” collected from all around the world. The data the firm holds includes personal data on individuals active on the site and app, as well as data on the companies that must surely use the site for buying and selling. The company also holds financial data.
To be effective, attract 35 million customers and keep growing, Vivino has to be using the latest technology to best match the preferences of individuals to past recommendations of others. The algorithms need to sift through millions of data points to arrive at selections that will keep customers coming back for more.
This small firm has to contend with protecting all these types of information and ensuring no data breaches. It has to navigate data flows across many different countries and deliver superior service in each market.
The firm likely can’t split off personal data from financial data. Even if it can, doing so will be difficult and expensive, and it is probably unclear why such a distinction is necessary: a customer who wants wine is a customer who wants wine, whether that customer is an individual or a hotel chain.
The firm can’t be a “processor” of data one minute and a “controller” of data the next. There are, recall, about a dozen people working in the company.
Søndergaard may suggest that data is the new avocado, but really, data is not a good match for anything that currently exists. Rather than stretch for analogies that do not fit well and lead officials to regulate data poorly, it is perhaps time to stop discussing analogies at all.
Data is just data. It is not “like” anything at all.
It is, instead, the engine of growth for small firms like Vivino and millions of others around the world. Data drives productivity gains. It empowers consumers. It helps big firms connect to their teams and suppliers.
Data should be regulated carefully, with consideration for what it does in various situations for different types of firms and for consumers. Trying to shove “data” into existing mental models is likely to do more harm than good. It makes it easier to ignore the unintended consequences of various types of regulations.
The only way to really understand what the digital economy and data look like today is to go out and ask firms of different sizes, in the widest possible variety of businesses, what exactly they do with information. What sort of information do they collect? Why do they need it? How might proposed regulatory changes affect their current business models?
Absent good intelligence on how data is actually used today, and without foresight on what it might be doing tomorrow, we run the risk of producing the opposite of what was intended. This would leave all of us with fewer avocados and pears, less wine, and fewer of the critical services needed to deliver them in the future.
© The Hinrich Foundation. See our website Terms and conditions for our copyright and reprint policy. All statements of fact and the views, conclusions and recommendations expressed in this publication are the sole responsibility of the author(s).