"I want people to be empowered by the data they create and not to be stifled by the data they create," says Andreas Weigend, one of the world's top experts on the future of big data, social mobile technologies, and consumer behavior. Learn more about this important issue, which affects us all.
JOANNE MYERS: Hello. This podcast is coming to you from the Carnegie Council in New York City.
I'm Joanne Myers, director of Public Affairs programs here at the Council. Today I'm in conversation with Andreas Weigend, who is the author of a fascinating new book, entitled Data for the People: How to Make Our Post-Privacy Economy Work for You.
Andreas is one of the world's foremost experts on the future of big data, social mobile technologies, and consumer behavior. He teaches at Stanford University, University of California at Berkeley, Fudan University in Shanghai, and Tsinghua University in Beijing. He also runs the Social Data Lab, which is a group of data scientists and thought leaders that help students develop their data skills by solving real problems with expert coaching.
As society begins to reckon with the avalanche of data and the changes it will bring, we are delighted to have the opportunity to speak to Andreas, who is on the forefront of this revolution. He will provide many insights into the ways business is harnessing information and how we can make this work for us.
Andreas, thank you for joining us.
ANDREAS WEIGEND: Thank you for having me, Joanne.
JOANNE MYERS: In your book Data for the People, you tell the story of how social data has revolutionized the way a billion people make purchases, find information, and think about their identity. Yet, we personally do not benefit. Only large corporations seem to profit from this information.
But before we delve into the importance of taking personal control of this information, could you spend a few minutes and talk to us about what you mean when you use the term "social data"? And how is it different from social media?
ANDREAS WEIGEND: First, social data is data that you create and share. For example, my geolocation—I might share that with Google, for example, or I might be checking in on Facebook when I go to a restaurant. Now, that shows already the difference between social data, which is more general, and social media, which people tend to use more for work they publish on sites like Twitter or Facebook.
Another example of social data would be the clicks I make on Amazon. Those get combined, or as I sometimes say "refined," by putting together my clicks with those of others in order to help me make better decisions and to recommend items to me.
JOANNE MYERS: Why do you think it is important for citizens to understand more about their data trails?
ANDREAS WEIGEND: For two reasons. One, so that they understand what the impact is they leave for their own future. The other one is so that they can better leverage the data which they are creating anyway. I want people to be empowered by the data they create and not to be stifled by the data they create.
JOANNE MYERS: But is there any way for the consumer to know what data is being shared or what is being taken from them?
ANDREAS WEIGEND: It's interesting that you use the expression "being shared" but then "being taken from." "Sharing" is a metaphor I really agree with. "Taken from" is a bit weird, because, in contrast to physical goods—if I take an apple from you, then you don't have it anymore—for information or for data you still have the bytes, the bits, that you produced.
JOANNE MYERS: Is there anything that will protect us from the misuse, then, of this shared data?
ANDREAS WEIGEND: We always have a chance—any technology has a chance—to be used and to be misused. So the only thing that would protect you 100 percent is if you absolutely get off the grid. I am actually planning to go to North Korea later this year, and that should be an interesting example and an exercise in how that works, being truly off the grid for a few days.
A few years ago, a friend of mine here, David Holtzman, and I tried to virtually disappear for a week in San Francisco. It is very, very costly to try to not generate data. I think actually it is pretty much impossible. As soon as you leave your house, some security cameras will pick you up—so you can just stay home. But then, of course, you can't use the phone because if you make a phone call, that of course leaves metadata. So we can't be 100 percent protected. That's why my focus has shifted over the years from trying to protect my data to being very open with my data and trying to nourish the upside, nourish what I can get out of the data.
For example, on weigend.com/future you can find all the trips that I have planned in the future. You know even which seat I will be sitting in when I am flying back to California on Friday.
The hope I have is that we are able to influence organizations to build tools for us so our data can be refined and help us. For example, Google is a good example of that. If you share with Google what you are interested in, guess what? Google will give you some results. If you don't share with Google what you are interested in, then the best you can know is what other people maybe around the world are looking for right now.
JOANNE MYERS: So do you think your experiences growing up in Stasi Germany inform this notion that more information about a person is better than keeping facts away from the general public?
ANDREAS WEIGEND: I actually did not grow up in Stasi Germany. I grew up in the West. I was born in Freiburg, in the West, close to France and Switzerland. The Stasi story my book starts with is really the story about my dad. It's not really about my dad, but about his two files.
My dad was in prison in East Germany for a number of years because they thought he was an American spy. After he died—it was after 1989—I asked the government whether I could obtain a record of his Stasi file. They said, "No, we're sorry, but we don't have that anymore because we destroyed it to protect the informant. But since you asked, how about if we send you a copy of your Stasi file, or at least of the cover of your Stasi file?"
That was pretty amazing to me and pretty surprising that—I was in grad school when East Germany collapsed—I actually had a Stasi file about me. There wasn't very much in it. But the fact is that basically the government—and companies of course—collect so much data without us being able to do anything at the source.
Think about your telephone company, and of course think about Facebook and Google. Nowadays we reveal on Facebook what the, let's say, KGB, wouldn't have gotten out of us under torture. So, rather than trying to minimize the downside, I think our energy is better spent on leveraging the upside of the data we create.
JOANNE MYERS: That's very positive thinking I would say. But it sounds to me like the real issue is to ensure that the data companies are as transparent to us as we are to them. Do you have any suggestions about how we can move forward on this line of thinking?
ANDREAS WEIGEND: That one is, unfortunately, an illusion. We are always more transparent than a data refinery, which has a billion users, can possibly be to us. So what we should think about a little bit is, what can we demand in fact from the refineries in terms of transparency? There are two rights I think we should have as citizens, as users, towards the refineries in terms of transparency.
The first one is the right to see our own data, to access our own data. That is not just to see the bits and the bytes; it means having the refinery provide us with tools so we can interpret our own data, for instance by comparing it to that of others.
The second right is the right to inspect the refinery. It's a bit tricky, because we as individuals, of course, don't have the time and the ability to look through every single line of code. What I mean by "inspect the refinery" is getting access to the reports that others write about the refinery. Those could be security reports: what do they do in terms of the software they run, do they run the latest updates, the latest security patches, do they use two-factor authentication in how the employees access the system? But it also could be reports by hackers, by white-hat hackers, by people who try to break in and break the system. When I say "inspecting the refinery," what I mean is getting access to those results.
JOANNE MYERS: Has any progress been made in this direction?
ANDREAS WEIGEND: I just want to draw a parallel which is familiar to all of us. When you buy a car or when you buy a refrigerator, they have those stickers on the fridge that tell you how energy efficient it is, and for a car how many miles per gallon. This does not mean that you have to run the test, but what it means is that your decision to buy this car or that car, this fridge or that fridge—or to use this refinery or that refinery—is supported by data. That's what I have in mind when I say our right to inspect the refinery.
JOANNE MYERS: Right. But personally, if we want access to our data, which you are proposing, has anything been codified in the law yet that would allow us to access this information?
ANDREAS WEIGEND: No, nothing has been codified yet. Obama had one of his [inaudible], that people need to have the right to access the information.
I think it is really up to us as citizens to push for that right, to have it codified, just as it is up to us to push our government—this is my current DB [phonetic]—towards transparency and towards if companies, for instance, break the laws, if they sell our data illegally, to have the same things kick in as if somebody steals your bicycle.
JOANNE MYERS: I think these are all wonderful proposals. But in this world of alternate facts and alternate reality, it might be a little bit difficult to enforce them or make much progress.
But just shifting for a moment, in reading your bio what I found to be most intriguing was in discovering that you were the chief scientist at Amazon and worked with Jeff Bezos to develop the company's data strategy and customer-centered culture. Could you tell us a little bit about your work there and how you created the data tools that fundamentally changed how people decide what to purchase? After all, I think you were there. You have become the gold standard for what e-commerce is all about.
ANDREAS WEIGEND: Amazon, similar to Google and other good companies, is a very data-driven culture. Now, the term "data-driven" is tricky, because I don't want to be driven by data; I actually want to drive data. What that means is I want to be able to run experiments.
In the past, before e-commerce, before the Internet basically, the feedback time between an idea and knowing whether the idea works or doesn't work was in many cases months. On the web, where you have a billion people as, if you will, subjects, a billion people use your services, if you change something, you can get the answer to your hypothesis not within months but within minutes.
That idea of having a scientist—in my case a physicist, an experimental physicist—who knows how to set up experiments, how to collect data, and then also in my case—I have a Ph.D. from Stanford in neural networks—knows how to analyze it and to build systems, that very much matched with Jeff Bezos's vision of how he wanted the company to evolve and the company to innovate.
JOANNE MYERS: So you are the person who when I purchase a book on Amazon and it says "Customers who bought this book would also like that book," you are the person responsible?
ANDREAS WEIGEND: No. There are many, many people who had that idea at the same time.
JOANNE MYERS: You're quite modest.
ANDREAS WEIGEND: No. That's just the way it is. Take the recommender idea, which was pervasive in the 1990s. Before Amazon, for instance, I created a start-up called MoodLogic, where we did recommendations for music, similar to what you see now in Pandora or in Spotify. That idea was very, very natural to many people in the 1990s, just as the idea of neural networks was really natural to many people in the 1980s. I think every time, given the data that they have and given the computer power they have, smart people come up with similar ideas that leverage data and leverage computational resources.
JOANNE MYERS: Do you think this altered the way we view ourselves, interact with each other, and make decisions in some way?
ANDREAS WEIGEND: That is actually a very good question. I'm traveling right now, so I'm here in DC. It is a very natural thing for me to see what Facebook friends are nearby. That feature is one of the typical give-to-get features. I can only use the feature if I share with Facebook my geolocation. In exchange for that, it tells me, for instance, that a couple of my former students were right next to where I was last night for dinner. So I met up with them.
I think the give-to-get idea should be complemented with actually being able to control the granularity of our data. Just as we have the right to access our data and the right to inspect refineries, I think one right we need to have is to blur the data, to set the granularity or to set the resolution of the data that we share.
What I am very passionate about, both in teaching and in the book, is that I want people to understand the tradeoffs they necessarily need to make. For example, if you set your granularity to very coarse, let's say only on the city level, then if you order a pizza, how would that poor delivery person know where to drop the pizza off? That is, of course, a trivial example.
But the idea is that you are in charge of the granularity of your data, so please be consistent in what you are requiring. Don't go into a taxi and, when the driver asks you "Where can I take you?" say, "Oh, I'm not telling; that would be an invasion of my privacy." There again it's trivial. But you need to share to a level of accuracy. You can say, "A block away from that," just in case you don't want him to know where exactly you are going.
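As a rough illustration of the granularity idea discussed above (not something described in the conversation itself), one simple way to blur a location is to round its coordinates before sharing them. The Python sketch below is only that, a sketch: the function name blur_location, the sample coordinates, and the choice of rounding as the blurring method are all assumptions made for illustration.

```python
def blur_location(lat: float, lon: float, decimals: int) -> tuple[float, float]:
    """Round coordinates to `decimals` decimal places.

    Fewer decimal places means a coarser (more blurred) location.
    Rounding is just one possible blurring method, chosen here for simplicity.
    """
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical exact position (a point in San Francisco, for illustration only).
exact = (37.774929, -122.419416)

# One decimal place is roughly an 11 km grid in latitude: city-scale.
city_level = blur_location(*exact, decimals=1)

# Three decimal places is roughly a 110 m grid in latitude: about a block.
block_level = blur_location(*exact, decimals=3)

print("city level: ", city_level)
print("block level:", block_level)
```

The coarse value is enough for a "friends nearby" style feature, while a pizza delivery would need the finer one, which is the tradeoff described in the answer above.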
JOANNE MYERS: The last question I would like to ask you is: What forces or policy in the news now are putting our data at risk?
ANDREAS WEIGEND: That is a very broad question. I'm not sure whether I should answer it with the last 24 hours or the last three days or the last week.
JOANNE MYERS: That is challenging.
ANDREAS WEIGEND: It is challenging.
Just taking up two examples which I personally found incredible: one is the line they take in terms of immigration, of trying to get back to people's social media. I think it's absolutely ridiculous how we can make that a decision criterion of determining whether we let somebody enter the country or not. If you are a terrorist, then you are trained well enough to not use your real account; you use some fake account for that. On the other hand, if you are an innocent person who maybe has a typo when they write down their Facebook account, then you are suddenly lying on your form. As the amount of data we create increases exponentially, so does the amount of fake data. It is just a no-brainer, stupid idea to use that for immigration purposes.
I should say it is also a very sad sign for the United States if they don't have search engines good enough to figure out for that person who gives them their passport information who is about to enter the United States what their social media presence really is. That is one thing which I just think is bad, that we have people who think that that's the solution to something.
The other thing which I found outrageous, of course, is the gag to scientists. I mean how is it possible that you fund research and then don't try to encourage people in whatever media they want to talk about the results? We should not try to control communication channels of scientists. We should make it as easy as possible to share the results.
That is another right I have, actually, the right to amend data. Everybody should be able to amend any piece of science or otherwise which they see and we should use the machinery which is built. You mentioned Amazon recommendations. We can also talk about Google search ranking. We should use that machinery for figuring out what is the truth and not have some administrator determine what is fit for the press to know about.
JOANNE MYERS: Agreed, agreed, agreed.
On that note, I will thank you for talking to us about the rapid changes we can expect in data gathering regarding ownership and protection. It was very enlightening. I thank you very much, Andreas.
ANDREAS WEIGEND: Thank you for your time.