(Press-News.org) Keeping up with current scientific literature is a daunting task, considering that hundreds to thousands of papers are published each day. Now researchers from North Carolina State University have developed a computer program to help them evaluate and rank scientific articles in their field.
The researchers use a text-mining algorithm to prioritize which research papers to read and include in their Comparative Toxicogenomics Database (CTD), a public database of manually curated and coded data from the scientific literature describing how environmental chemicals interact with genes to affect human health.
"Over 33,000 scientific papers have been published on heavy metal toxicity alone, going as far back as 1926," explains Dr. Allan Peter Davis, a biocuration project manager for CTD at NC State who worked on the project and is co-lead author of an article on the work. "We simply can't read and code them all. And, with the help of this new algorithm, we don't have to."
To help select the most relevant papers for inclusion in the CTD, Thomas Wiegers, a research bioinformatician at NC State and the other co-lead author of the report, developed a sophisticated algorithm as part of a text-mining process. The application evaluates the text from thousands of papers and assigns a relevancy score to each document. "The score ranks the set of articles to help separate the wheat from the chaff, so to speak," Wiegers says.
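As a rough illustration of scoring and ranking documents, the Python sketch below sums hypothetical term weights for each paper and sorts the papers by the total. The term list, the weights, the sample texts and the function names are all assumptions made for this example. The release does not describe how the CTD algorithm actually computes its relevancy score.

```python
import re
from collections import Counter

# Hypothetical term weights, chosen for illustration only; the release does
# not say which features or weights the CTD algorithm uses.
TERM_WEIGHTS = {
    "arsenic": 2.0,
    "cadmium": 2.0,
    "toxicity": 1.5,
    "gene": 1.0,
    "expression": 1.0,
}

def relevancy_score(text, weights=TERM_WEIGHTS):
    """Sum the weight of every occurrence of a weighted term in the text."""
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    return sum(weight * tokens[term] for term, weight in weights.items())

def rank_documents(docs, weights=TERM_WEIGHTS):
    """Return (doc_id, score) pairs sorted from highest to lowest score."""
    scored = [(doc_id, relevancy_score(text, weights)) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up abstracts standing in for real papers.
docs = {
    "paper_a": "Arsenic exposure alters gene expression in human liver cells.",
    "paper_b": "A survey of river sediment sampling techniques.",
    "paper_c": "Cadmium toxicity and gene regulation: arsenic as a comparison.",
}

ranking = rank_documents(docs)
for doc_id, score in ranking:
    print(doc_id, score)
```

With these made-up inputs, the paper that mentions the most weighted terms rises to the top and the off-topic paper falls to the bottom, which is the "wheat from the chaff" separation Wiegers describes.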
But how good is the algorithm at determining the best papers? To test that, the researchers text-mined 15,000 articles and sent a representative sample to their team of biocurators to manually read and evaluate on their own, blind to the computer's score. "The results were impressive," Davis says. The biocurators concurred with the algorithm 85 percent of the time with respect to the highest-scored papers.
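One simple way to express this kind of comparison is the fraction of top-scored papers that curators also judged relevant. The sketch below assumes that definition and uses invented scores and judgments. The published study may define and measure concurrence differently.

```python
def top_agreement_rate(scores, curator_relevant, top_n):
    """Fraction of the top_n highest-scored papers that curators also marked relevant."""
    top = sorted(scores, key=scores.get, reverse=True)[:top_n]
    if not top:
        return 0.0
    hits = sum(1 for doc_id in top if doc_id in curator_relevant)
    return hits / len(top)

# Invented data: algorithm scores and the set of papers curators marked
# relevant without seeing those scores.
scores = {"p1": 9.1, "p2": 8.4, "p3": 7.7, "p4": 2.0, "p5": 0.3}
curator_relevant = {"p1", "p3", "p4"}

rate = top_agreement_rate(scores, curator_relevant, top_n=3)
print(f"Agreement on top 3: {rate:.0%}")
```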
Using the algorithm to rank papers allowed biocurators to focus on the most relevant papers, increasing productivity by 27 percent and novel data content by 100 percent. "It's a tremendous time-saving step," Davis explains. "With this we can allocate our resources much more effectively by having the team focus on the most informative papers."
There are always outliers in these types of experiments: occasions where the algorithm assigns a very high score to an article that a human biocurator quickly dismisses as irrelevant. The team that looked at those outliers was often able to see a pattern as to why the algorithm mistakenly identified a paper as important. "Now, we can go back and tweak the algorithm to account for this and fine-tune the system," Wiegers says.
"We're not at the point yet where a computer can read and extract all the relevant data on its own," Davis concludes, "but having this text-mining process to direct us toward the most informative articles is a huge first step."
###
The paper, "Text mining effectively scores and ranks the literature for improving chemical-gene-disease curation at the Comparative Toxicogenomics Database," was published online April 17 in PLOS ONE. Co-authors are Dr. Cindy Murphy, a biocurator scientist at NC State; Dr. Carolyn Mattingly, associate professor of biology at NC State; and Drs. Robin Johnson, Jean Lay, Kelley Lennon-Hopkins, Cindy Saraceni-Richards and Daniela Sciaky from The Mount Desert Island Biological Laboratory. The work was supported by the National Institute of Environmental Health Sciences.
New algorithm helps evaluate, rank scientific literature
2013-04-18