(Press-News.org) CAMBRIDGE, Mass. -- Twitter's home page features a regularly updated list of topics that are "trending," meaning that tweets about them have suddenly exploded in volume. A position on the list is highly coveted as a source of free publicity, but the selection of topics is automatic, based on a proprietary algorithm that factors in both the number of tweets and recent increases in that number.
At the Interdisciplinary Workshop on Information and Decision in Social Networks at MIT in November, Associate Professor Devavrat Shah and his student, Stanislav Nikolov, will present a new algorithm that can, with 95 percent accuracy, predict which topics will trend an average of an hour and a half before Twitter's algorithm puts them on the list — and sometimes as much as four or five hours before.
The algorithm could be of great interest to Twitter, which could charge a premium for ads linked to popular topics, but it also represents a new approach to statistical analysis that could, in theory, apply to any quantity that varies over time: the duration of a bus ride, ticket sales for films, maybe even stock prices.
Like all machine-learning algorithms, Shah and Nikolov's needs to be "trained": it combs through data in a sample set — in this case, data about topics that previously did and did not trend — and tries to find meaningful patterns. What distinguishes it is that it's nonparametric, meaning that it makes no assumptions about the shape of patterns.
Let the data decide
In the standard approach to machine learning, Shah explains, researchers would posit a "model" — a general hypothesis about the shape of the pattern whose specifics need to be inferred. "You'd say, 'Series of trending things … remain small for some time and then there is a step,'" says Shah, the Jamieson Career Development Associate Professor in the Department of Electrical Engineering and Computer Science. "This is a very simplistic model. Now, based on the data, you try to train for when the jump happens, and how much of a jump happens.
"The problem with this is, I don't know that things that trend have a step function," Shah explains. "There are a thousand things that could happen." So instead, he says, he and Nikolov "just let the data decide."
In particular, their algorithm compares changes over time in the number of tweets about each new topic to the changes over time of every sample in the training set. Samples whose statistics resemble those of the new topic are given more weight in predicting whether the new topic will trend or not. In effect, Shah explains, each sample "votes" on whether the new topic will trend, but some samples' votes count more than others'. The weighted votes are then combined, giving a probabilistic estimate of the likelihood that the new topic will trend.
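The voting scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Shah and Nikolov's implementation: the squared-difference distance, the exponential weighting, the `gamma` parameter and all function names are hypothetical choices made here, since the article does not say how similarity is measured or how votes are weighted.

```python
import math

def changes(counts):
    """Turn a series of tweet counts into interval-to-interval changes."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]

def distance(a, b):
    """Squared Euclidean distance (an assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def trend_probability(new_counts, training, gamma=0.1):
    """Estimate the probability that a new topic will trend.

    `training` is a list of (counts, trended) pairs. Each sample votes
    for its own label, and the vote is weighted by how closely its
    changes over time resemble those of the new topic.
    """
    new_changes = changes(new_counts)
    vote_trend = 0.0
    vote_no_trend = 0.0
    for counts, trended in training:
        # Closer samples get exponentially larger weights (assumed form).
        weight = math.exp(-gamma * distance(new_changes, changes(counts)))
        if trended:
            vote_trend += weight
        else:
            vote_no_trend += weight
    total = vote_trend + vote_no_trend
    # With no usable votes, fall back to an uninformative estimate.
    return vote_trend / total if total > 0 else 0.5

# Toy training set, invented purely for illustration.
training = [
    ([1, 2, 4, 8, 16], True),   # counts that keep accelerating
    ([3, 3, 2, 3, 3], False),   # counts that stay flat
]
rising = trend_probability([1, 2, 5, 9, 15], training)
flat = trend_probability([3, 2, 3, 3, 3], training)
```

Here `rising` comes out close to 1 and `flat` close to 0, because each new series sits much nearer to one training sample than to the other. No model of what a trend "looks like" appears anywhere in the code; the training samples alone determine the estimate.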
In Shah and Nikolov's experiments, the training set consisted of data on 200 Twitter topics that did trend and 200 that didn't. In real time, they set their algorithm loose on live tweets, predicting trending with 95 percent accuracy and a 4 percent false-positive rate.
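The article does not define these two figures. Under the conventional definitions, which are an assumption here, they could be computed from predictions and outcomes as in the sketch below. The function name and the toy labels are hypothetical and are not the researchers' data.

```python
def evaluate(predicted, actual):
    """Compute accuracy and false-positive rate from boolean labels.

    `predicted[i]` is True if the topic was predicted to trend;
    `actual[i]` is True if it really did trend.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    negatives = false_pos + true_neg
    return {
        # Share of all topics that were classified correctly.
        "accuracy": (true_pos + true_neg) / len(pairs) if pairs else 0.0,
        # Share of non-trending topics wrongly flagged as trending.
        "false_positive_rate": false_pos / negatives if negatives else 0.0,
    }

# Toy labels, invented for illustration only.
result = evaluate(
    predicted=[True, True, False, False, True],
    actual=[True, True, False, False, False],
)
```

In this toy case four of five topics are classified correctly, and one of the three topics that did not trend is wrongly flagged.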
Shah predicts, however, that the system's accuracy will improve as the size of the training set increases. "The training sets are very small," he says, "but we still get strong results."
Keeping pace
Of course, the larger the training set, the greater the computational cost of executing Shah and Nikolov's algorithm. Indeed, Shah says, curbing computational complexity is the reason that machine-learning algorithms typically employ parametric models in the first place. "Our computation scales proportionately with the data," Shah says.
But on the Web, he adds, computational resources scale with the data, too: As Facebook or Google add customers, they also add servers. So his and Nikolov's algorithm is designed so that its execution can be split up among separate machines. "It is perfectly suited to the modern computational framework," Shah says.
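One way to see why a weighted vote splits cleanly across machines: each machine can total the votes for its own slice of the training set, and the partial totals simply add up. The sketch below stands in for separate machines with a local thread pool. The distance measure, the exponential weighting, `gamma` and the function names are assumptions made for illustration, and the article does not describe how the researchers actually distribute the work.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def partial_votes(new_series, shard, gamma=0.1):
    """Total the weighted votes for one shard of the training set."""
    trend = 0.0
    no_trend = 0.0
    for series, trended in shard:
        dist = sum((x - y) ** 2 for x, y in zip(new_series, series))
        weight = math.exp(-gamma * dist)  # assumed weighting
        if trended:
            trend += weight
        else:
            no_trend += weight
    return trend, no_trend

def sharded_trend_probability(new_series, shards, gamma=0.1):
    """Combine per-shard vote totals into one probability estimate."""
    # Each worker thread plays the role of one machine.
    with ThreadPoolExecutor() as pool:
        partials = list(
            pool.map(lambda shard: partial_votes(new_series, shard, gamma), shards)
        )
    trend = sum(t for t, _ in partials)
    no_trend = sum(n for _, n in partials)
    total = trend + no_trend
    return trend / total if total > 0 else 0.5

# Toy series of changes in tweet counts, invented for illustration.
shard_a = [([1, 2, 4, 8], True)]
shard_b = [([0, -1, 1, 0], False)]
split = sharded_trend_probability([1, 3, 4, 6], [shard_a, shard_b])
single = sharded_trend_probability([1, 3, 4, 6], [shard_a + shard_b])
```

Splitting the samples across two shards gives the same estimate as keeping them together, because only the two running totals have to be exchanged. The work per shard grows with the number of samples it holds, which matches Shah's remark that the computation scales with the data.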
In principle, Shah says, the new algorithm could be applied to any sequence of measurements performed at regular intervals. But the correlation between historical data and future events may not always be as clear-cut as in the case of Twitter posts. Filtering out all the noise in the historical data might require such enormous training sets that the problem becomes computationally intractable even for a massively distributed program. But if the right subset of training data can be identified, Shah says, "It will work."
###
Written by Larry Hardesty, MIT News Office
Predicting what topics will trend on Twitter
A new algorithm predicts which Twitter topics will trend hours in advance and offers a new technique for analyzing data that fluctuate over time
2012-11-01