PRESS-NEWS.org - Press Release Distribution

Predicting what topics will trend on Twitter

A new algorithm predicts which Twitter topics will trend hours in advance and offers a new technique for analyzing data that fluctuate over time

2012-11-01
(Press-News.org) CAMBRIDGE, Mass. -- Twitter's home page features a regularly updated list of topics that are "trending," meaning that tweets about them have suddenly exploded in volume. A position on the list is highly coveted as a source of free publicity, but the selection of topics is automatic, based on a proprietary algorithm that factors in both the number of tweets and recent increases in that number.

At the Interdisciplinary Workshop on Information and Decision in Social Networks at MIT in November, Associate Professor Devavrat Shah and his student, Stanislav Nikolov, will present a new algorithm that can, with 95 percent accuracy, predict which topics will trend an average of an hour and a half before Twitter's algorithm puts them on the list — and sometimes as much as four or five hours before.

The algorithm could be of great interest to Twitter, which could charge a premium for ads linked to popular topics, but it also represents a new approach to statistical analysis that could, in theory, apply to any quantity that varies over time: the duration of a bus ride, ticket sales for films, maybe even stock prices.

Like all machine-learning algorithms, Shah and Nikolov's needs to be "trained": it combs through data in a sample set — in this case, data about topics that previously did and did not trend — and tries to find meaningful patterns. What distinguishes it is that it's nonparametric, meaning that it makes no assumptions about the shape of patterns.

Let the data decide

In the standard approach to machine learning, Shah explains, researchers would posit a "model" — a general hypothesis about the shape of the pattern whose specifics need to be inferred. "You'd say, 'Series of trending things … remain small for some time and then there is a step,'" says Shah, the Jamieson Career Development Associate Professor in the Department of Electrical Engineering and Computer Science. "This is a very simplistic model. Now, based on the data, you try to train for when the jump happens, and how much of a jump happens.

"The problem with this is, I don't know that things that trend have a step function," Shah explains. "There are a thousand things that could happen." So instead, he says, he and Nikolov "just let the data decide."
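
To make the contrast concrete, the step-function model Shah describes can be sketched as a toy least-squares fit. This is only an illustration of the parametric approach, not code from the researchers; the function name `fit_step` and the choice of least squares are assumptions.

```python
def fit_step(series):
    """Fit a two-level step function to a series by least squares.

    Returns (jump_index, jump_size): the index at which the second level
    begins, and the difference between the mean level after and before it.
    This is a hypothetical illustration of a parametric model.
    """
    best = None
    for k in range(1, len(series)):
        before, after = series[:k], series[k:]
        mean_before = sum(before) / len(before)
        mean_after = sum(after) / len(after)
        # Sum of squared errors if the step is placed at index k.
        sse = (sum((x - mean_before) ** 2 for x in before)
               + sum((x - mean_after) ** 2 for x in after))
        if best is None or sse < best[0]:
            best = (sse, k, mean_after - mean_before)
    if best is None:
        raise ValueError("need at least two data points")
    return best[1], best[2]


# Toy tweet counts per interval (made up for illustration).
jump_index, jump_size = fit_step([1, 1, 1, 1, 9, 9, 9])
```

The model has exactly two things to learn, when the jump happens and how large it is, which is why it cannot capture a topic whose activity rises in some other shape.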

In particular, their algorithm compares changes over time in the number of tweets about each new topic to the changes over time of every sample in the training set. Samples whose statistics resemble those of the new topic are given more weight in predicting whether the new topic will trend or not. In effect, Shah explains, each sample "votes" on whether the new topic will trend, but some samples' votes count more than others'. The weighted votes are then combined, giving a probabilistic estimate of the likelihood that the new topic will trend.
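
The voting scheme described above can be sketched in a few lines. The article does not say how similarity is measured or how it is turned into a weight, so the squared Euclidean distance, the exponential weighting and the `gamma` parameter below are all assumptions made for illustration.

```python
import math


def distance(a, b):
    """Squared Euclidean distance between two equal-length series (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def trend_probability(observed, trended, not_trended, gamma=1.0):
    """Estimate the probability that `observed` will trend.

    Every training series casts a vote weighted by exp(-gamma * distance),
    so series that resemble `observed` count for more. The weighting
    function and `gamma` are hypothetical choices, not taken from the article.
    """
    votes_for = sum(math.exp(-gamma * distance(observed, s)) for s in trended)
    votes_against = sum(math.exp(-gamma * distance(observed, s)) for s in not_trended)
    total = votes_for + votes_against
    # With no usable votes at all, fall back to an uninformative 0.5.
    return votes_for / total if total > 0 else 0.5


# Toy tweet counts per interval (made up for illustration).
trended = [[1, 2, 4, 8], [1, 3, 6, 12]]
not_trended = [[1, 1, 1, 1], [2, 2, 1, 1]]
rising = trend_probability([1, 2, 5, 9], trended, not_trended, gamma=0.1)
flat = trend_probability([1, 1, 1, 2], trended, not_trended, gamma=0.1)
```

Here `rising` comes out close to 1 and `flat` close to 0, because each new series sits much nearer to one group of samples than to the other. No shape is assumed anywhere: the training samples themselves define what a trending pattern looks like.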

In Shah and Nikolov's experiments, the training set consisted of data on 200 Twitter topics that did trend and 200 that didn't. In real time, they set their algorithm loose on live tweets, predicting trending with 95 percent accuracy and a 4 percent false-positive rate.
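
The article does not spell out how the two figures were computed. As a rough guide, a sketch using the standard definitions (accuracy as the fraction of correct predictions, false-positive rate as the fraction of non-trending topics wrongly flagged) might look like the following; the function name and the toy labels are assumptions, not experimental data.

```python
def evaluate(predicted, actual):
    """Return (accuracy, false_positive_rate) for boolean predictions.

    Uses the textbook definitions; the researchers' exact evaluation
    procedure is not described in the article.
    """
    pairs = list(zip(predicted, actual))
    if not pairs:
        raise ValueError("need at least one prediction")
    true_pos = sum(1 for p, a in pairs if p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    accuracy = (true_pos + true_neg) / len(pairs)
    negatives = false_pos + true_neg
    false_positive_rate = false_pos / negatives if negatives else 0.0
    return accuracy, false_positive_rate


# Made-up labels: True means "trended" or "predicted to trend".
predicted = [True, True, False, False, True]
actual = [True, False, False, False, True]
accuracy, false_positive_rate = evaluate(predicted, actual)
```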

Shah predicts, however, that the system's accuracy will improve as the size of the training set increases. "The training sets are very small," he says, "but we still get strong results."

Keeping pace

Of course, the larger the training set, the greater the computational cost of executing Shah and Nikolov's algorithm. Indeed, Shah says, curbing computational complexity is the reason that machine-learning algorithms typically employ parametric models in the first place. "Our computation scales proportionately with the data," Shah says.

But on the Web, he adds, computational resources scale with the data, too: As Facebook or Google add customers, they also add servers. So his and Nikolov's algorithm is designed so that its execution can be split up among separate machines. "It is perfectly suited to the modern computational framework," Shah says.
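
The article does not describe how the work is divided. One plausible reading is that, because the final estimate is built from sums of weighted votes, each machine can total the votes for its own slice of the training set and only those totals need to be combined. The sketch below simulates that with worker threads standing in for machines; the weighting function, `gamma` and all names are assumptions.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def weight(observed, sample, gamma):
    """Vote weight from squared Euclidean distance (assumed similarity measure)."""
    d = sum((x - y) ** 2 for x, y in zip(observed, sample))
    return math.exp(-gamma * d)


def partial_votes(observed, shard, gamma):
    """Total the votes for and against over one shard of (series, did_trend) pairs."""
    votes_for = votes_against = 0.0
    for series, did_trend in shard:
        w = weight(observed, series, gamma)
        if did_trend:
            votes_for += w
        else:
            votes_against += w
    return votes_for, votes_against


def sharded_trend_probability(observed, shards, gamma=1.0):
    """Combine per-shard vote totals; threads stand in for separate machines."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda s: partial_votes(observed, s, gamma), shards))
    votes_for = sum(p[0] for p in partials)
    votes_against = sum(p[1] for p in partials)
    total = votes_for + votes_against
    return votes_for / total if total > 0 else 0.5


# Toy training data split across two shards (made up for illustration).
shards = [
    [([1, 2, 4, 8], True), ([1, 1, 1, 1], False)],
    [([1, 3, 6, 12], True), ([2, 2, 1, 1], False)],
]
split = sharded_trend_probability([1, 2, 5, 9], shards, gamma=0.1)
whole = sharded_trend_probability([1, 2, 5, 9], [shards[0] + shards[1]], gamma=0.1)
```

Splitting the samples across shards gives the same estimate as processing them in one place, so adding machines as the training set grows changes the running time but not the answer.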

In principle, Shah says, the new algorithm could be applied to any sequence of measurements performed at regular intervals. But the correlation between historical data and future events may not always be as clear-cut as in the case of Twitter posts. Filtering out all the noise in the historical data might require such enormous training sets that the problem becomes computationally intractable even for a massively distributed program. But if the right subset of training data can be identified, Shah says, "It will work."

Written by Larry Hardesty, MIT News Office

