
Predicting what topics will trend on Twitter

A new algorithm predicts which Twitter topics will trend hours in advance and offers a new technique for analyzing data that fluctuate over time

2012-11-01
(Press-News.org) CAMBRIDGE, Mass. -- Twitter's home page features a regularly updated list of topics that are "trending," meaning that tweets about them have suddenly exploded in volume. A position on the list is highly coveted as a source of free publicity, but the selection of topics is automatic, based on a proprietary algorithm that factors in both the number of tweets and recent increases in that number.

At the Interdisciplinary Workshop on Information and Decision in Social Networks at MIT in November, Associate Professor Devavrat Shah and his student, Stanislav Nikolov, will present a new algorithm that can, with 95 percent accuracy, predict which topics will trend an average of an hour and a half before Twitter's algorithm puts them on the list — and sometimes as much as four or five hours before.

The algorithm could be of great interest to Twitter, which could charge a premium for ads linked to popular topics, but it also represents a new approach to statistical analysis that could, in theory, apply to any quantity that varies over time: the duration of a bus ride, ticket sales for films, maybe even stock prices.

Like all machine-learning algorithms, Shah and Nikolov's needs to be "trained": it combs through data in a sample set — in this case, data about topics that previously did and did not trend — and tries to find meaningful patterns. What distinguishes it is that it's nonparametric, meaning that it makes no assumptions about the shape of patterns.

Let the data decide

In the standard approach to machine learning, Shah explains, researchers would posit a "model" — a general hypothesis about the shape of the pattern whose specifics need to be inferred. "You'd say, 'Series of trending things … remain small for some time and then there is a step,'" says Shah, the Jamieson Career Development Associate Professor in the Department of Electrical Engineering and Computer Science. "This is a very simplistic model. Now, based on the data, you try to train for when the jump happens, and how much of a jump happens.

"The problem with this is, I don't know that things that trend have a step function," Shah explains. "There are a thousand things that could happen." So instead, he says, he and Nikolov "just let the data decide."

In particular, their algorithm compares changes over time in the number of tweets about each new topic to the changes over time of every sample in the training set. Samples whose statistics resemble those of the new topic are given more weight in predicting whether the new topic will trend or not. In effect, Shah explains, each sample "votes" on whether the new topic will trend, but some samples' votes count more than others'. The weighted votes are then combined, giving a probabilistic estimate of the likelihood that the new topic will trend.
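The weighted vote described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Shah and Nikolov's implementation: the release does not say how resemblance is measured or how weights are assigned, so the squared Euclidean distance, the exponential weighting, the parameter `gamma` and the function name `trend_probability` are all hypothetical choices.

```python
import math


def trend_probability(new_series, training_set, gamma=1.0):
    """Estimate the chance that a new topic will trend.

    new_series   -- tweet counts for the new topic at regular intervals
    training_set -- list of (series, did_trend) pairs of the same length
    gamma        -- assumed tuning knob: larger values favor close matches
    """
    trending_weight = 0.0
    total_weight = 0.0
    for series, did_trend in training_set:
        if len(series) != len(new_series):
            raise ValueError("all series must have the same length")
        # Assumed similarity measure: squared Euclidean distance.
        distance = sum((a - b) ** 2 for a, b in zip(new_series, series))
        # Samples that resemble the new topic get a larger vote.
        weight = math.exp(-gamma * distance)
        total_weight += weight
        if did_trend:
            trending_weight += weight
    if total_weight == 0.0:
        raise ValueError("no training sample is close enough to vote")
    # Share of the total vote cast by samples that did trend.
    return trending_weight / total_weight
```

With a larger `gamma`, only the closest samples have any real say; as `gamma` approaches zero, every sample votes equally and the estimate becomes the plain share of trending samples in the training set.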

In Shah and Nikolov's experiments, the training set consisted of data on 200 Twitter topics that did trend and 200 that didn't. In real time, they set their algorithm loose on live tweets, predicting trending with 95 percent accuracy and a 4 percent false-positive rate.
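The two figures quoted above are rates computed from predictions checked against what actually happened. The sketch below shows one common way to compute them. It assumes that "accuracy" here means the share of trending topics that were correctly flagged, which the release does not spell out; the function name `detection_rates` is hypothetical.

```python
def detection_rates(predicted, actual):
    """Return (detection rate, false-positive rate).

    predicted -- list of booleans: did the algorithm say the topic would trend?
    actual    -- list of booleans: did the topic really trend?
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one trending and one non-trending topic")
    # Assumed reading of "accuracy": trending topics correctly flagged.
    detection_rate = true_pos / positives
    # Non-trending topics wrongly flagged as trending.
    false_positive_rate = false_pos / negatives
    return detection_rate, false_positive_rate
```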

Shah predicts, however, that the system's accuracy will improve as the size of the training set increases. "The training sets are very small," he says, "but we still get strong results."

Keeping pace

Of course, the larger the training set, the greater the computational cost of executing Shah and Nikolov's algorithm. Indeed, Shah says, curbing computational complexity is the reason that machine-learning algorithms typically employ parametric models in the first place. "Our computation scales proportionately with the data," Shah says.

But on the Web, he adds, computational resources scale with the data, too: As Facebook or Google add customers, they also add servers. So his and Nikolov's algorithm is designed so that its execution can be split up among separate machines. "It is perfectly suited to the modern computational framework," Shah says.
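One way such a split could work is sketched below, under stated assumptions: each machine holds a slice of the training set and returns two partial sums, which a coordinator adds together. The release does not describe the actual distribution scheme, and the names `partial_vote` and `combine`, the squared-distance measure and the parameter `gamma` are hypothetical.

```python
import math


def partial_vote(new_series, shard, gamma=1.0):
    """Run on one machine: sum the vote weights for one slice of the data.

    shard -- list of (series, did_trend) pairs held by this machine
    Returns (weight of trending samples, weight of all samples).
    """
    trending_weight = 0.0
    total_weight = 0.0
    for series, did_trend in shard:
        # Assumed similarity measure: squared Euclidean distance.
        distance = sum((a - b) ** 2 for a, b in zip(new_series, series))
        weight = math.exp(-gamma * distance)
        total_weight += weight
        if did_trend:
            trending_weight += weight
    return trending_weight, total_weight


def combine(partials):
    """Run on the coordinator: merge the partial sums into one estimate."""
    trending_weight = sum(p[0] for p in partials)
    total_weight = sum(p[1] for p in partials)
    if total_weight == 0.0:
        raise ValueError("no training sample is close enough to vote")
    return trending_weight / total_weight
```

Because the estimate depends only on sums, each machine sends back just two numbers no matter how large its slice is, so adding servers as the training set grows keeps the work per machine roughly constant.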

In principle, Shah says, the new algorithm could be applied to any sequence of measurements performed at regular intervals. But the correlation between historical data and future events may not always be as clear-cut as in the case of Twitter posts. Filtering out all the noise in the historical data might require such enormous training sets that the problem becomes computationally intractable even for a massively distributed program. But if the right subset of training data can be identified, Shah says, "It will work."

###

Written by Larry Hardesty, MIT News Office



