PRESS-NEWS.org - Press Release Distribution

In search of the key word

Bursts of certain words within a text are what make them keywords

2012-07-18
(Press-News.org) Human beings have the ability to convert complex phenomena into a one-dimensional sequence of letters and put it down in writing. In this process, keywords serve to convey the content of the text. How letters and words correlate with the subject of a text is something Eduardo Altmann and his colleagues from the Max Planck Institute for the Physics of Complex Systems have studied with the help of statistical methods. They discovered that what denotes keywords is not the fact that they appear very frequently in a given text. It is that they are found in greater numbers only at certain points in the text. They also discovered that relationships exist between sections of text which are distant from each other, in the sense that they preferentially use the same words and letters.

The Dresden-based scientists mathematically studied the semantic properties of texts by translating ten different English texts into various codes. One of the chosen texts was the English edition of Leo Tolstoy's "War and Peace".

One example of what the scientists did was translate letters in a text into a binary sequence. They replaced all vowels with 1 and all consonants with 0. By employing additional mathematical functions, the scientists examined different levels of the text – both individual vowels and letters, as well as whole words – which had been translated into various codes. In so doing, it was possible to identify repeating patterns within the text as a whole. Such correlation within a text is referred to as long-range correlation. This indicates whether two letters located at arbitrarily distant points in the text are connected with each other. For example, when we find a letter "W" at a certain point, there is a measurably higher probability that we will find the letter "W" again a few pages later.
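The vowel/consonant coding described above can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, only the ASCII letters a, e, i, o, u are treated as vowels, and a plain sample autocorrelation stands in for the "additional mathematical functions", which the release does not specify.

```python
import string

VOWELS = set("aeiou")  # assumption: ASCII vowels only, "y" counts as a consonant


def vowel_code(text):
    """Map each ASCII letter to 1 (vowel) or 0 (consonant); drop everything else."""
    return [
        1 if ch in VOWELS else 0
        for ch in text.lower()
        if ch in string.ascii_lowercase
    ]


def autocorrelation(seq, lag):
    """Sample autocorrelation of a numeric sequence at a positive lag.

    Values near 0 mean symbols `lag` positions apart are unrelated;
    values that stay above 0 for large lags hint at long-range correlation.
    """
    n = len(seq)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must satisfy 0 < lag < len(seq)")
    mean = sum(seq) / n
    var = sum((x - mean) ** 2 for x in seq) / n
    if var == 0:
        return 0.0  # a constant sequence carries no correlation information
    cov = sum(
        (seq[i] - mean) * (seq[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return cov / var
```

Applied to a whole book, one would evaluate `autocorrelation(vowel_code(book_text), lag)` for lags ranging from a few letters to many pages and look at how slowly the values decay.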

"Understandably enough, if a certain point in the book talks about war, there is a high probability that the word war will also appear a few pages later. What is surprising is that we also find this higher probability at the level of individual letters," says Altmann.

Keywords are more frequent in certain passages of text

The scientists found this long-range correlation not only between letters, but also within higher linguistic levels, such as words. Within individual levels, the correlation remains when looking at different texts. "What we find much more interesting is to examine how the correlation changes between the levels," says Altmann. Long-range correlation enables the scientists to draw conclusions about the extent to which certain words are connected to a topic. "Even the connection between a word and the letters it is composed of can be analysed in this way," explains Altmann.

Furthermore, the scientists studied what is known as "burstiness", which describes whether a pattern of characters occurs with increased frequency in a passage of text. It shows, for instance, whether a word comes up more often in a certain text section than elsewhere. The more a word's occurrences are concentrated in particular passages, the more likely it is that the word is representative of a certain subject.
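One way to put a number on burstiness is sketched below. The release does not say which measure the authors used, so this example substitutes a commonly used burstiness parameter, B = (σ − μ) / (σ + μ), computed from the gaps between successive occurrences of a word, where μ is the mean gap and σ its standard deviation. The tokenizer and function names are hypothetical.

```python
import re
from statistics import mean, pstdev


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def gaps(tokens, word):
    """Distances, in tokens, between successive occurrences of `word`."""
    positions = [i for i, tok in enumerate(tokens) if tok == word]
    return [b - a for a, b in zip(positions, positions[1:])]


def burstiness(tokens, word):
    """Burstiness parameter B = (sigma - mu) / (sigma + mu) of the gaps.

    B is -1 for perfectly regular spacing, around 0 for random
    (Poisson-like) spacing, and approaches 1 for strongly clustered words.
    Returns None if the word occurs fewer than three times.
    """
    g = gaps(tokens, word)
    if len(g) < 2:
        return None
    mu = mean(g)
    sigma = pstdev(g)
    return (sigma - mu) / (sigma + mu)
```

Under this measure, a word that appears at steady intervals throughout a text scores close to -1 or 0, while a word packed into a few passages scores well above 0.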

The scientists demonstrated that certain words come up repeatedly throughout a text but do not occur in bursts in any given passage. Although these words do exhibit long-range correlation, they are not closely related to the topic at hand. "Articles are the best examples of these. They come up very frequently in every text, but they are not crucial in conveying a given topic," says Altmann.

Statistical text analysis works irrespective of language

Whereas both letters and words exhibit long-range correlation, it is rare for letters to appear in bursts at certain points in a text. "It is, in fact, very rare for a letter to be as closely connected with a topic as the word it forms a part of. In a manner of speaking, letters can be used more flexibly," explains Altmann. An "a", for example, can be a part of a great many words that have no connection with one and the same topic.

The scientists employed statistical text analysis as an easy way of identifying the defining words of a given text. "By so doing, it is absolutely irrelevant which language the text is written in. The only thing that matters is the story and not language-specific rules," says Altmann. Their findings could be used in future to improve Internet search engines, and they could also help to analyse texts and identify plagiarism.



INFORMATION:

Original publication:

Eduardo G. Altmann, Giampaolo Cristadoro and Mirko Degli Esposti
On the origin of long-range correlations in texts
PNAS, July 2, 2012, doi: 10.1073/pnas.1117723109


