Using a database of 130,000 Yelp reviews, Smith PhD student Jorge Mejia and two Smith professors have found a way to predict which Washington, D.C., restaurants will close. The technique, which grew out of Mejia’s dissertation, involves new software that can “read” and analyze the contents of online reviews.
For the study, the researchers identified slightly more than 2,000 regional restaurants that were open as of December 2013. From various sources, they then identified roughly 450 that had closed from 2005 to 2014. To identify linguistic patterns that foretold closure, they paired restaurants according to such factors as price and cuisine type, and looked at how the language of their reviews differed.
It doesn’t take a PhD to know that there’s a connection between a restaurant’s Yelp rating and whether it will survive. But what Mejia and professors Shawn Mankad and Anandasivam Gopal have created is more powerful. Their computer-assisted text analysis is more accurate at predicting restaurants’ demise than ratings alone (although the tool is most powerful when used in combination with numerical ratings).
Other scholars have sought to take the emotional temperature of online reviews by analyzing the proportion of positive versus negative words. But the approach developed at Smith goes deeper, examining constellations of words associated with restaurants’ beating the long odds of their industry and remaining open.
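To make the word-proportion approach concrete, here is a minimal sketch of how such a score might be computed. The word lists and function names are assumptions made up for illustration; they do not come from the study or from any particular sentiment lexicon.

```python
import re

# Hypothetical word lists for illustration only; not taken from the study.
POSITIVE = {"good", "great", "nice", "delicious", "friendly"}
NEGATIVE = {"bad", "slow", "rude", "cold", "bland"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_share(review):
    """Return the share of sentiment words that are positive.

    Returns None when the review contains no words from either list.
    """
    tokens = tokenize(review)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return pos / total if total else None
```

A score like this reduces a review to a single positive-versus-negative number, which is the kind of summary the Smith approach tries to go beyond.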
For instance, restaurants for which reviewers used the words “food,” “good,” “place,” “like,” “order,” “friend,” “time,” “great,” “nice” and “service” tended to survive at unusually high rates. The Smith team called the variable linked to those words “Quality_Overall,” and it seemed to be the most potent signifier of general quality.
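As a rough illustration of how a word cluster can become a numeric variable, the sketch below scores a restaurant by the fraction of its review words that fall in the "Quality_Overall" list quoted above. This is a deliberate simplification: the working paper's title indicates the team used latent semantic analysis, which derives word groupings from the data, whereas this sketch simply counts a fixed list. The function name and the scoring rule are assumptions.

```python
import re

# The ten words the article associates with the "Quality_Overall" variable.
QUALITY_OVERALL = {
    "food", "good", "place", "like", "order",
    "friend", "time", "great", "nice", "service",
}


def topic_score(reviews, topic_words=QUALITY_OVERALL):
    """Fraction of all word tokens in a restaurant's reviews that are topic words.

    Matching is exact, so "friends" does not count as "friend";
    a fuller pipeline would likely stem words first.
    """
    tokens = [t for r in reviews for t in re.findall(r"[a-z]+", r.lower())]
    if not tokens:
        return 0.0
    return sum(t in topic_words for t in tokens) / len(tokens)
```

Under this simplified rule, a restaurant whose reviewers lean heavily on these everyday words gets a higher score, which could then sit alongside its star rating as an input to a closure model.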
They used one subset of data to uncover the relevant linguistic patterns and another subset to test the predictive power of their model. In that second group, the variables did predict, to a statistically significant degree, whether a restaurant closed.
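The two-subset design described here is commonly called a holdout split. A minimal sketch follows, assuming each record is a (features, closed) pair; the 30 percent test share, the seed, and the accuracy measure are illustrative choices, not details from the study, which reports statistical significance rather than a simple accuracy figure.

```python
import random


def split_holdout(records, test_fraction=0.3, seed=0):
    """Shuffle the records reproducibly and return (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, records):
    """Share of (features, closed) records for which predict(features) equals closed."""
    if not records:
        raise ValueError("records must not be empty")
    return sum(predict(x) == y for x, y in records) / len(records)
```

Patterns would be fitted on the first list only, and `accuracy` (or a significance test) would then be applied to the second list, so the model is judged on restaurants it has not seen.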
Although the models’ predictive power has not yet been tested in the real world, the algorithms could be of great use to restaurant operators, the authors say. “I always try to find interesting research questions that would solve real problems,” says Mejia, who came to the University of Maryland with more than three years of management consulting experience in Europe and the United States.
In 2012, he also participated in Startup Chile, a six-month accelerator program with a $45,000 budget, backed by the Chilean government.
The working paper is “More Than Just Words: Using Latent Semantic Analysis in Online Reviews to Explain Restaurant Closures.”