REPORTER’S NOTEBOOK: Google’s chief economist talks data, statistics, and Google tools
Lecture focuses on using Google Trends, Correlate, Surveys
Yesterday afternoon, the Undergraduate Economics Association hosted a lecture by Google’s Chief Economist, Hal R. Varian ’69, on “Predicting the Present with Search Engine Data.”
Hal Varian, an MIT alum, taught at UC Berkeley for several decades before becoming Google’s Chief Economist in 2010. He is one of several influential voices in the emerging field of big data, particularly noted for saying in The McKinsey Quarterly that being a statistician would be “the sexy job” in the next decade. I thought attending the lecture would be a great opportunity for me as an economics student.
I arrived at the lecture hall about 10 minutes early, not expecting to see a giant crowd. However, by the time 4:30 rolled around, E51-345 had filled to its capacity of 128, leaving standing room only. Although billed as an undergraduate event, it was clear that most attendees were graduate students eager to learn more about applying “big data” to business. The lecture began with quick remarks from UEA President Ting Mao ’14 and a glowing introduction of Varian by economics lecturer Sara F. Ellison, who credited him with inspiring a new generation of information economists.
Varian’s lecture focused on three Google tools: Trends, Correlate, and Consumer Surveys. He began with a light-hearted set of questions, such as “What day in the week receives the most searches about hangovers?” Apparently, searches about hangovers peak every Sunday, with an outlier on Jan. 1, a finding that drew guilty chuckles from the audience. The laughter grew louder when Varian pointed out that searches about vodka peak a day before searches about hangovers. He presented other statistics as well, including searches for the term “civil war,” which generally peak “three days before the term paper is due.”
Once the audience was thoroughly entertained, he transitioned to a more practical application of Google tools. Varian showed that queries about unemployment claims are a good indicator of the unemployment rate and of when a recession begins and ends. Google’s large data set from searches allows people to build better predictive models that take into account the relationships between different variables. With a simple linear model based on past values alone, it is “hard to catch the turning point,” but with Google’s detailed search data added, more accurate regressions can be fitted.
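To make the idea concrete, here is a minimal sketch of that kind of comparison. It is not Varian’s model or data: the weekly “claims” series, the “search index,” the noise levels, and every function name below are invented for illustration. A baseline regression predicts this week’s value from last week’s value only, a second regression adds a same-week search index, and both are scored on one-step-ahead forecasts around a single turning point.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    return sum(coef * value for coef, value in zip(beta, row))


def rolling_nowcast_errors(claims, search, first_forecast=20):
    """Refit each week on past data only; return (baseline MAE, search MAE)."""
    err_base, err_search = [], []
    for t in range(first_forecast, len(claims)):
        history = range(1, t)  # weeks already observed when forecasting week t
        targets = [claims[i] for i in history]
        beta_base = fit_ols([[1.0, claims[i - 1]] for i in history], targets)
        beta_search = fit_ols(
            [[1.0, claims[i - 1], search[i]] for i in history], targets
        )
        err_base.append(abs(predict(beta_base, [1.0, claims[t - 1]]) - claims[t]))
        err_search.append(
            abs(predict(beta_search, [1.0, claims[t - 1], search[t]]) - claims[t])
        )
    return sum(err_base) / len(err_base), sum(err_search) / len(err_search)


# Synthetic data (an assumption, not real Google or labor-market data):
# claims rise for 30 weeks, then fall, with noise around the trend.
random.seed(0)
claims = []
for t in range(60):
    trend = 300 + 5 * t if t < 30 else 450 - 5 * (t - 30)
    claims.append(trend + random.gauss(0, 8))
# The hypothetical search index tracks same-week claims with a little noise.
search = [c / 10 + random.gauss(0, 0.2) for c in claims]

mae_baseline, mae_with_search = rolling_nowcast_errors(claims, search)
print(f"baseline MAE: {mae_baseline:.2f}")
print(f"with search index MAE: {mae_with_search:.2f}")
```

On this made-up series the search-augmented regression has the lower forecast error, but only because the synthetic index was built to follow the same-week value closely. Whether real query data helps that much is an empirical question the sketch cannot answer.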
Varian also highlighted how easy it is to collect data with Google Consumer Surveys, noting that he had started a survey about the minimum wage before dinner and that there were about 700 responses after dessert. He claimed that the next best alternative to running an online survey would be roughly 40 times more expensive. Google’s survey tools also let one see how changes in the wording of a question affect responses. Having these data easily accessible “democratizes the whole profession” and has large implications for both business and the social sciences.
My favorite part of his lecture was when he briefly touched upon how the consumer sentiment survey, which was very helpful to economists during the past recession, could be better interpreted. Varian said, “As economists, we don’t quite know what the best correlates will be. It’s not obvious.” Indeed, the consumer sentiment survey contained “fat data,” which has many predictors but few observations. Varian showed how Google’s private data on queries related to financial planning, investing, business news, utilities, and search engines helped make more sense of the raw consumer sentiment data. It was amazing to see the regression line on each successive lecture slide become better fitted to the data points as Google’s search data was added. With more data, better predictive models can be built.
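One simple way to picture the search for “correlates” is to rank candidate query series by how strongly each one moves with the target series. The sketch below does that with a plain Pearson correlation. It is an illustration only, not the method Varian presented: the sentiment values, the query categories’ numbers, and the function names are all invented, and with many candidates and few observations a high correlation can easily appear by chance.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_correlates(target, candidates):
    """Sort candidate series by absolute correlation with the target."""
    scored = [(name, pearson(series, target)) for name, series in candidates.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical monthly values, made up for this example.
sentiment = [90, 85, 80, 70, 65, 72, 78]
query_series = {
    "financial planning": [50, 48, 45, 40, 37, 41, 44],
    "unemployment benefits": [10, 14, 18, 27, 31, 25, 20],
    "weather": [5, 9, 4, 8, 6, 7, 5],
}

ranking = rank_correlates(sentiment, query_series)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

Ranking by absolute value keeps series that move against the target, such as the invented “unemployment benefits” series here, since a strong negative relationship is as informative as a positive one.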
After speaking for 40 minutes, Varian wrapped up his presentation by concluding that “the challenge that is facing the economics profession is how do we combine public and private data in a useful way.” Needless to say, most of the attendees were sold on his presentation about big data.
For the last 20 minutes, Varian took questions. Most of them concerned the possibilities of big data and the “how” behind Google’s business. Some even inquired about the prices that Google charges to use its tools. I eventually got up the courage to ask whether there was a set of data he particularly enjoyed researching. Varian simply couldn’t decide, stating, “there’s a lot of things you can look at that are both interesting and instructive.” He did suggest that looking at trends across countries was one of the more intriguing topics.
The lecture was a great opportunity for students interested in economics, information, and business. Varian’s presentation highlighted the bright future of big data and illuminated what it means to be better at forecasting the future.