Leaving a Digital Trail: What About Privacy?

By John Markoff Dec. 2, 2008

Harrison R. Brown ’12, an 18-year-old freshman majoring in mathematics at MIT, didn’t need to do complex calculations to figure out he liked this deal: In exchange for letting researchers track his every move, he receives a free smartphone.

Now, when he dials another student, researchers know. When he sends an e-mail or text message, they also know. When he listens to music, they know the song. Every moment he has his Windows Mobile smartphone with him, they know where he is, and who’s nearby.

Brown and about 100 other students living in Random Hall at MIT have agreed to swap their privacy for smartphones that generate digital trails to be beamed to a central computer. Beyond individual actions, the devices capture a moving picture of the dorm’s social network.

The students’ data is but a bubble in a vast sea of digital information being recorded by an ever thicker web of sensors, from phones to GPS units to the tags in office ID badges, that capture our movements and interactions. Coupled with information already gathered from sources like Web surfing and credit cards, the data is the basis for an emerging field called collective intelligence.

Propelled by new technologies and the Internet’s steady incursion into every nook and cranny of life, collective intelligence offers powerful capabilities, from improving the efficiency of advertising to giving community groups new ways to organize.

But even its practitioners acknowledge that, if misused, collective intelligence tools could create an Orwellian future on a level Big Brother could only dream of.

Collective intelligence could make it possible for insurance companies, for example, to use behavioral data to covertly identify people suffering from a particular disease and deny them insurance coverage. Similarly, the government or law enforcement agencies could identify members of a protest group by tracking social networks revealed by the new technology. “There are so many uses for this technology — from marketing to war-fighting — that I can’t imagine it not pervading our lives in just the next few years,” says Steve Steinberg, a computer scientist who works for an investment firm in New York.

In a widely read Web posting, he argued that there were significant chances that it would be misused: “This is one of the most significant technology trends I have seen in years; it may also be one of the most pernicious.”

For the last 50 years, Americans have worried about the privacy of the individual in the computer age. But new technologies have become so powerful that protecting individual privacy may no longer be the only issue. Now, with the Internet, wireless sensors and the capability to analyze an avalanche of data, a person’s profile can be drawn without monitoring him or her directly.

“Some have argued that with new technology there is a diminished expectation of privacy,” said Marc Rotenberg, executive director of the Electronic Privacy Information Center, a privacy rights group in Washington. “But the opposite may also be true. New techniques may require us to expand our understanding of privacy and to address the impact that data collection has on groups of individuals and not simply a single person.”

Brown, for one, isn’t concerned about losing his privacy. The MIT researchers have convinced him that they have gone to great lengths to protect any information generated by the experiment that would reveal his identity.

Besides, he says, “the way I see it, we all have Facebook pages, we all have e-mail and Web sites and blogs.”

“This is a drop in the bucket in terms of privacy,” he adds.

Google and its vast farm of more than a million search engine servers spread around the globe remain the best example of the power and wealth-building potential of collective intelligence. Google’s fabled PageRank algorithm, which was originally responsible for the quality of Google’s search results, drew its precision from the inherent wisdom in the billions of individual Web links that people create.

The company introduced a speech-recognition service in early November, initially for the Apple iPhone, that gains its accuracy in large part from a statistical model built from several trillion search terms that its users have entered in the last decade. In the future, Google will take advantage of spoken queries to predict even more accurately the questions its users will ask.

And, a few weeks ago, Google deployed an early-warning service for spotting flu trends, based on search queries for flu-related symptoms.

The success of Google, along with the rapid spread of the wireless Internet and sensors — like location trackers in cell phones and GPS units in cars — has touched off a race to cash in on collective intelligence technologies.

In 2006, Sense Networks, based in New York, proved that there was a wealth of useful information hidden in a digital archive of GPS data generated by tens of thousands of taxi rides in San Francisco. It could see, for example, that people who worked in the city’s financial district would tend to go to work early when the market was booming, but later when it was down.

It also noticed that middle-income people — as determined by ZIP code data — tended to order cabs more often just before market downturns.

Sense has developed two applications, one for consumers to use on smartphones like the BlackBerry and the iPhone, and the other for companies interested in forecasting social trends and financial behavior. The consumer application, Citysense, identifies entertainment hot spots in a city. It connects information from Yelp and Google about nightclubs and music clubs with data generated by tracking locations of anonymous cell phone users.

The second application, Macrosense, is intended to give businesses insight into human activities. It uses a vast database that merges GPS, Wi-Fi positioning, cell-tower triangulation, radio frequency identification chips and other sensors.

“There is a whole new set of metrics that no one has ever measured,” said Greg Skibiski, chief executive of Sense. “We were able to look at people moving around stores” and other locations. Such travel patterns, coupled with data on incomes, can give retailers early insights into sales levels and who is shopping at competitors’ stores.

Alex P. Pentland PhD ’82, a professor at the Media Lab at the Massachusetts Institute of Technology who is leading the dormitory research project, was a co-founder of Sense Networks. He is part of a new generation of researchers who have relatively effortless access to data that in the past was either painstakingly assembled by hand or acquired from questionnaires or interviews that relied on the memories and honesty of the subjects.

The Media Lab researchers have worked with Hitachi Data Systems, the Japanese technology company, to use some of the lab’s technologies to improve businesses’ efficiency. For example, by equipping employees with sensor badges that generate the same kinds of data provided by the students’ smartphones, the researchers determined that face-to-face communication was far more important to an organization’s work than was generally believed.

Productivity improved 30 percent with an incremental increase in face-to-face communication, Pentland said. The results were so promising that Hitachi has established a consulting business that overhauls organizations via the researchers’ techniques.

Pentland calls his research “reality mining” to differentiate it from an earlier generation of data mining conducted through more traditional methods.

Pentland “is the emperor of networked sensor research,” said Michael Macy, a sociologist at Cornell who studies communications networks and their role as social networks. People and organizations, he said, are increasingly choosing to interact with one another through digital means that record traces of those interactions. “This allows scientists to study those interactions in ways that five years ago we never would have thought we could do,” he said.

Once based on networked personal computers, collective intelligence systems are increasingly being created to leverage wireless networks of digital sensors and smartphones. In one application, groups of scientists and political and environmental activists are developing “participatory sensing” networks.

At the Center for Embedded Networked Sensing at the University of California, Los Angeles, for example, researchers are developing a Web service they call a Personal Environmental Impact Report to build a community map of air quality in Los Angeles. It is intended to let people assess how their activities affect the environment and to make decisions about their health. Users may decide to change their jogging route, or run at a different time of day, depending on air quality at the time.

“Our mantra is to make it possible to observe what was previously unobservable,” said Deborah Estrin, director of the center and a computer scientist at UCLA.

But Estrin said the project still faced a host of challenges, both with the accuracy of tiny sensors and with the researchers’ ability to be certain that personal information remains private. She is skeptical about technical efforts to obscure the identity of individual contributors to databases of information collected by network sensors.

Attempts to blur the identity of individuals go only so far, she said. The researchers encrypt the data so that particular people cannot readily be identified, but that protection has limits.

“Even though we are protecting the information, it is still subject to subpoena and subject to bullying bosses or spouses,” she said.

She says that there may still be ways to protect privacy. “I can imagine a system where the data will disappear,” she said.

Already, activist groups have seized on the technology to improve the effectiveness of their organizing. A service called MobileActive helps nonprofit organizations around the world use mobile phones to harness the expertise and the energy of their participants, by sending out action alerts, for instance.

Pachube (pronounced “PATCH-bay”) is a Web service that lets people share real-time sensor data from anywhere in the world. With Pachube, one can combine and display sensor data, from the cost of energy in one location, to temperature and pollution monitoring, to data flowing from a buoy off the coast of Charleston, S.C., all creating an information-laden snapshot of the world.

Such a complete and constantly updated picture will undoubtedly redefine traditional notions of privacy.

Pentland says there are ways to avoid the surveillance-society pitfalls that lurk in the technology. For the commercial use of such information, he has proposed a set of principles derived from English common law to guarantee that people have ownership rights to data about their behavior. The idea revolves around three principles: that you have a right to possess your own data, that you control the data that is collected about you, and that you can destroy, remove or redeploy your data as you wish.

At the same time, he argued that individual privacy rights must also be weighed against the public good.

Citing the epidemic involving severe acute respiratory syndrome, or SARS, in recent years, he said technology would have helped health officials watch the movement of infected people as it happened, providing an opportunity to limit the spread of the disease.

“If I could have looked at the cell phone records, it could have been stopped that morning rather than a couple of weeks later,” he said. “I’m sorry, that trumps minute concerns about privacy.”

Indeed, some collective-intelligence researchers argue that strong concerns about privacy rights are a relatively recent phenomenon in human history.

“The new information tools symbolized by the Internet are radically changing the possibility of how we can organize large-scale human efforts,” said Thomas W. Malone, director of the MIT Center for Collective Intelligence.

“For most of human history, people have lived in small tribes where everything they did was known by everyone they knew,” Malone said. “In some sense we’re becoming a global village. Privacy may turn out to have become an anomaly.”