(Press-News.org) Devavrat Shah's group at MIT's Laboratory for Information and Decision Systems (LIDS) specializes in analyzing how social networks process information. In 2012, the group demonstrated algorithms that could predict what topics would trend on Twitter up to five hours in advance; this year, they used the same framework to predict fluctuations in the prices of the online currency known as Bitcoin.
Next month, at the Conference on Neural Information Processing Systems, they'll present a paper that applies their model to the recommendation engines that are familiar from websites like Amazon and Netflix -- with surprising results.
"Our interest was, we have a nice model for understanding data-processing from social data," says Shah, the Jamieson Associate Professor of Electrical Engineering and Computer Science. "It makes sense in terms of how people make decisions, exhibit preferences, or take actions. So let's go and exploit it and design a better, simple, basic recommendation algorithm, and it will be something very different. But it turns out that under that model, the standard recommendation algorithm is the right thing to do."
The standard algorithm is known as "collaborative filtering." To get a sense of how it works, imagine a movie-streaming service that lets users rate movies they've seen. To generate recommendations specific to you, the algorithm would first assign the other users similarity scores based on the degree to which their ratings overlap with yours. Then, to predict your response to a particular movie, it would aggregate the ratings the movie received from other users, weighted according to similarity scores.
To simplify their analysis, Shah and his collaborators -- Guy Bresler, a postdoc in LIDS, and George Chen, a graduate student in MIT's Department of Electrical Engineering and Computer Science (EECS) who is co-advised by Shah and EECS associate professor Polina Golland -- assumed that the ratings system had two values, thumbs-up or thumbs-down. The taste of every user could thus be described, with perfect accuracy, by a string of ones and zeroes, where the position in the string corresponds to a particular movie and the number at that location indicates the rating.
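The similarity-weighted prediction described above, combined with the thumbs-up/thumbs-down simplification, can be illustrated with a minimal sketch. The agreement-based similarity measure, the function names, and the sample ratings are all assumptions made for illustration; the article does not specify which similarity score the researchers analyzed.

```python
from typing import Dict, Optional

# A user's ratings: movie title -> 1 (thumbs-up) or 0 (thumbs-down).
Ratings = Dict[str, int]


def similarity(a: Ratings, b: Ratings) -> float:
    """Fraction of co-rated movies on which two users agree (0.0 if none shared).

    This particular measure is an illustrative assumption.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    agree = sum(1 for movie in shared if a[movie] == b[movie])
    return agree / len(shared)


def predict(target: Ratings, others: Dict[str, Ratings], movie: str) -> Optional[float]:
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when no similar user has rated the movie.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for ratings in others.values():
        if movie not in ratings:
            continue
        weight = similarity(target, ratings)
        weighted_sum += weight * ratings[movie]
        total_weight += weight
    return weighted_sum / total_weight if total_weight > 0 else None


# Made-up example data.
you = {"Die Hard": 1, "Shakespeare in Love": 0, "The Godfather": 1}
others = {
    "ann": {"Die Hard": 1, "Shakespeare in Love": 0, "The Godfather": 1, "The Notebook": 0},
    "bob": {"Die Hard": 0, "Shakespeare in Love": 1, "The Godfather": 0, "The Notebook": 1},
    "cy": {"Die Hard": 1, "Shakespeare in Love": 1, "The Godfather": 1, "The Notebook": 1},
}
score = predict(you, others, "The Notebook")
print(score)  # estimated chance of a thumbs-up, between 0 and 1
```

In this toy data, the user whose ratings match yours exactly counts most, the user who disagrees with you on everything counts for nothing, and the prediction lands between the ratings of the remaining two.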
Birds of a feather
The MIT researchers' model assumes that large groups of such strings can be clustered together, and that those clusters can be described probabilistically. Rather than ones and zeroes at each location in the string, a probabilistic cluster model would feature probabilities: an 80 percent chance that the members of the cluster will like movie "A," a 20 percent chance that they'll like movie "B," and so on.
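As a rough sketch of what such a probabilistic cluster model looks like, the snippet below stores one like-probability per movie for each cluster, draws a random user from a cluster, and asks which cluster best explains a given rating string. The cluster names, the third movie "C," and all probabilities other than the 80 and 20 percent figures are hypothetical, and the maximum-likelihood assignment is a simple stand-in rather than the researchers' procedure.

```python
import math
import random
from typing import Dict, List

# Hypothetical clusters: probability that a member likes movies "A", "B", "C".
CLUSTERS: Dict[str, List[float]] = {
    "cluster_1": [0.8, 0.2, 0.9],
    "cluster_2": [0.2, 0.8, 0.3],
}


def sample_user(probs: List[float], rng: random.Random) -> List[int]:
    """Draw a string of ones and zeroes, one rating per movie."""
    return [1 if rng.random() < p else 0 for p in probs]


def log_likelihood(ratings: List[int], probs: List[float]) -> float:
    """Log-probability that a cluster with `probs` produces `ratings`."""
    return sum(math.log(p if r == 1 else 1.0 - p) for r, p in zip(ratings, probs))


def most_likely_cluster(ratings: List[int], clusters: Dict[str, List[float]]) -> str:
    """Name of the cluster under which the rating string is most probable."""
    return max(clusters, key=lambda name: log_likelihood(ratings, clusters[name]))


rng = random.Random(0)  # fixed seed so the sampled user is reproducible
user = sample_user(CLUSTERS["cluster_1"], rng)
best = most_likely_cluster([1, 0, 1], CLUSTERS)
print(user, best)
```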
The question is how many such clusters are required to characterize a population. If half the people who like "Die Hard" also like "Shakespeare in Love," but the other half hate it, then ideally, you'd like to split "Die Hard" fans into two clusters. Otherwise, you'd lose correlations between their preferences that could be predictively useful. On the other hand, the more clusters you have, the more ratings you need to determine which of them a given user belongs to, and with too many clusters, reliable prediction from limited data becomes impossible.
In their new paper, the MIT researchers show that so long as the number of clusters required to describe the variation in a population is low, collaborative filtering yields nearly optimal predictions. But in practice, how low is that number?
To answer that question, the researchers examined data on 10 million users of a movie-streaming site and identified 200 who had rated the same 500 movies. They found that, in fact, just five clusters -- five probabilistic models -- were enough to account for most of the variation in the population.
Missing links
While the researchers' model corroborates the effectiveness of collaborative filtering, it also suggests ways to improve it. In general, the more information a collaborative-filtering algorithm has about users' preferences, the more accurate its predictions will be. But not all additional information is created equal. If a user likes "The Godfather," the information that he also likes "The Godfather: Part II" will probably have less predictive power than the information that he also likes "The Notebook."
Using their analytic framework, the LIDS researchers show how to select a small number of products that carry a disproportionate amount of information about users' tastes. If the service provider recommended those products to all its customers, it could use the resulting ratings to sort those customers into probabilistic clusters much more efficiently, which should improve the quality of its recommendations.
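The article does not describe how the researchers choose these products, so the following is only a stand-in heuristic to make the idea concrete: rank each movie by how much its like-probability differs across clusters, and pick the movies with the widest spread. The function names, cluster names, and probabilities are invented for illustration.

```python
from typing import Dict, List

# Hypothetical like-probabilities per cluster (made-up numbers).
CLUSTER_PROBS: Dict[str, Dict[str, float]] = {
    "cluster_1": {"The Godfather": 0.90, "The Godfather: Part II": 0.90, "The Notebook": 0.20},
    "cluster_2": {"The Godfather": 0.85, "The Godfather: Part II": 0.80, "The Notebook": 0.90},
}


def spread(movie: str, cluster_probs: Dict[str, Dict[str, float]]) -> float:
    """Gap between the highest and lowest like-probability across clusters."""
    values = [probs[movie] for probs in cluster_probs.values()]
    return max(values) - min(values)


def pick_probe_movies(cluster_probs: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Return the k movies whose ratings best separate the clusters.

    A simple heuristic, not the selection method from the paper.
    """
    movies = list(next(iter(cluster_probs.values())))
    ranked = sorted(movies, key=lambda m: spread(m, cluster_probs), reverse=True)
    return ranked[:k]


probes = pick_probe_movies(CLUSTER_PROBS, k=1)
print(probes)
```

With these numbers, both clusters mostly like the two "Godfather" films, so a rating for either says little about which cluster a user belongs to, while a rating for "The Notebook" separates the clusters sharply.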
INFORMATION:
Written by Larry Hardesty, MIT News Office