(Press-News.org) MADISON, Wis. -- In 1997, IBM's Deep Blue computer beat chess wizard Garry Kasparov. This year, a computer system developed at the University of Wisconsin-Madison equaled or bested scientists at the complex task of extracting data from scientific publications and placing it in a database that catalogs the results of tens of thousands of individual studies.
"We demonstrated that the system was no worse than people on all the things we measured, and it was better in some categories," says Christopher Ré, who guided the software development for the project while a UW professor of computer sciences.
The development, described in the current issue of PLoS, marks a milestone in the quest to rapidly and precisely summarize, collate and index the vast output of scientists around the globe, says first author Shanan Peters, a professor of geoscience at UW-Madison.
Peters and colleagues set up the faceoff between PaleoDeepDive, their new machine reading system, and the human scientists who had manually entered data into the Paleobiology Database. This repository, compiled by hundreds of researchers, is the destination for data from paleontology studies funded by the National Science Foundation and other agencies internationally.
The knowledge produced by paleontologists is fragmented into hundreds of thousands of publications. Yet many research questions require what Peters calls a "synthetic approach: For example, how many species were on the planet at any given time?"
Teaming up with Ré, who is now at Stanford University, and UW-Madison computer sciences professor Miron Livny, the group built on the DeepDive machine reading system and the HTCondor distributed job management system to create PaleoDeepDive. "We were lucky that Miron Livny brought the high throughput computing capabilities of the UW-Madison campus to bear," says Peters. "Getting started required a million hours of computer time."
PaleoDeepDive mimics the human activities needed to assemble the Paleobiology Database. "We extracted the same data from the same documents and put it into the exact same structure as the human researchers, allowing us to rigorously evaluate the quality of our system, and the humans," Peters says.
Instead of trying to divine the single correct meaning, the tactic was "to look at the entire problem of extraction as a probabilistic problem," says Ré, who credits much of the heavy lifting to UW-Madison Ph.D. candidate Ce Zhang.
Computers often have trouble deciphering even simple-sounding statements, Ré says. Ré imagines a study containing the terms "Tyrannosaurus rex" and "Alberta, Canada." Is Alberta where the fossil was found, or where it is stored? "We take a more relaxed approach: There is some chance that these two are related in this manner, and some chance they are related in that manner."
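The "relaxed approach" Ré describes can be illustrated with a toy sketch. This is not PaleoDeepDive's code: the relation names, cue words and weights below are hypothetical, chosen only to show how a system might keep a probability for each reading instead of committing to one.

```python
import math

# Hypothetical cue words and weights, for illustration only.
# A real system would learn such weights from data.
WEIGHTS = {
    "found_in": {"collected": 2.0, "excavated": 2.0, "museum": -1.0},
    "stored_in": {"museum": 2.0, "housed": 2.0, "collected": -0.5},
}

def relation_probabilities(sentence):
    """Return a probability for each candidate relation, not one hard label."""
    words = set(sentence.lower().replace(",", " ").replace(".", " ").split())
    # Score each relation by summing the weights of the cue words present.
    scores = {
        relation: sum(weight for cue, weight in cues.items() if cue in words)
        for relation, cues in WEIGHTS.items()
    }
    # Softmax turns the scores into probabilities that sum to 1.
    total = sum(math.exp(score) for score in scores.values())
    return {relation: math.exp(score) / total for relation, score in scores.items()}

# With no cue words, both readings stay equally likely.
ambiguous = relation_probabilities("Tyrannosaurus rex, Alberta, Canada")

# Cue words such as "housed" and "museum" shift the odds toward storage.
clearer = relation_probabilities(
    "The Tyrannosaurus rex specimen is housed in a museum in Alberta, Canada."
)
```

In this sketch, the bare pair of terms leaves "found in" and "stored in" at even odds, while added context moves most of the probability to one reading without ruling the other out.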
In these large-data tasks, PaleoDeepDive has a major advantage, Peters says. "Information that was manually entered into the Paleobiology Database by humans cannot be assessed or enhanced without going back to the library and re-examining original documents. Our machine system, on the other hand, can extend and improve results essentially on the fly as new information is added."
Further advantages can result from improvements in the computer tools. "As we get more feedback and data, it will do a better job across the board," Peters says.
The machine-reading trial required access to tens of thousands of articles, says Jacquelyn Crinion, assistant director of licensing and acquisitions services at the UW-Madison General Library System. And the download volume threatened logjams in document delivery. Eventually, Elsevier gave the UW-Madison team broad access to 10,000 downloads per week.
As text- and data-mining takes off, Crinion says the library system and publishers will adapt. "The challenge for all of us is to provide specialized services for researchers while continuing to meet the core needs of the vast majority of our customers."
The Paleobiology Database has already generated hundreds of studies about the history of life, Peters says. "Ultimately, we hope to have the ability to create a computer system that can do almost immediately what many geologists and paleontologists try to do on a smaller scale over a lifetime: read a bunch of papers, arrange a bunch of facts, and relate them to one another in order to address big questions."
INFORMATION:
CONTACT: Shanan Peters, peters@geology.wisc.edu, 608-262-5987 (prefers email for first contact)
-- David Tenenbaum, 608-265-8549, djtenenb@wisc.edu
MADISON, Wis. -- If Brad Singer knew for sure what was happening three miles under an odd-shaped lake in the Andes, he might be less eager to spend a good part of his career investigating a volcanic field that has erupted 36 times during the last 25,000 years. As he leads a large scientific team exploring a region in the Andes called Laguna del Maule, Singer hopes the area remains quiet.
But the primary reason to expend so much effort on this area boils down to one fact: The rate of uplift is among the highest ever observed by satellite measurement for a volcano that ...
ANN ARBOR--As much as two-thirds of Earth's carbon may be hidden in the inner core, making it the planet's largest carbon reservoir, according to a new model that even its backers acknowledge is "provocative and speculative."
In a paper scheduled for online publication in the Proceedings of the National Academy of Sciences this week, University of Michigan researchers and their colleagues suggest that iron carbide, Fe7C3, provides a good match for the density and sound velocities of Earth's inner core under the relevant conditions.
The model, if correct, could help ...
LA JOLLA, CA - December 1, 2014 - Researchers can now explore viruses, bacteria and components of the human body in more detail than ever before with software developed at The Scripps Research Institute (TSRI).
In a study published December 1 in the journal Nature Methods, the researchers demonstrated how the software, called cellPACK, can be used to model viruses such as HIV.
"We hope to ultimately increase scientists' ability to target any disease," said Art Olson, professor and Anderson Research Chair at TSRI who is senior author of the new study.
Putting cellPACK ...
LEXINGTON, KY. (Dec. 1, 2014) -- A group of physiologists led by University of Kentucky's Tim McClintock have identified the receptors activated by two odors using a new method that tracks responses to smells in live mice.
Their research was published in the latest edition of The Journal of Neuroscience.
Using a fluorescent protein to mark nerve cells activated by odors, McClintock and his coworkers identified the receptors that allow mouse nerve cells to respond to two odors: eugenol, which is a component of several spices, most notably cloves, and muscone, known ...
When people hear about the dangers of the ozone hole, they often think of sunburns and associated health risks, but new research shows that ozone depletion changes atmospheric and oceanic circulation with potentially devastating effects on weather in the Southern Hemisphere.
These could include increased incidence of extreme events, resulting in costly floods, drought, wildfires, and serious environmental damage. The ecosystem impacts documented so far include changes to growth rates of South American and New Zealand trees, decreased growth of Antarctic mosses, ...
Inundation of nitrogen into the atmosphere and terrestrial environments, through fossil fuel combustion and extensive fertilization, has risen tenfold since preindustrial times according to research published in Global Biogeochemical Cycles. Excess nitrogen can infiltrate water tables and can trigger extensive algal blooms that deplete aquatic environments of oxygen, among other damaging effects.
Although scientists have extensively studied the effects of excess nitrogen in terrestrial habitats, the effect on the open ocean remains unknown. Altieri et al. point out that ...
Through research in mice, scientists have found that proteins at the blood-brain barrier pump out riluzole, the only FDA-approved drug for ALS, or Lou Gehrig's disease, limiting the drug's effectiveness. However, when the investigators blocked these proteins, the effectiveness of riluzole increased and the animals experienced improved muscle function, slower disease progression, and prolonged survival.
The findings suggest that blocking these transporter proteins at the blood-brain barrier might improve delivery, and ultimately, efficacy, of drugs used to treat ALS and ...
The first long-term clinical trial on the use of Lung Volume Reduction (LVR-) Coil treatment in patients with severe emphysema has found that the minimally-invasive therapy, which enables the lung to function more effectively, is safe over a 3-year period. The results are published in Respirology.
The trial revealed that half of the patients continued to improve their lung function capacity, feelings of breathlessness, and overall quality of life after 3 years, with no unexpected safety issues.
"This trial reports only the first ever treated patients in the world with ...
December 1, 2014 -- A study just released by Columbia University's Mailman School of Public Health compared the use of prescription opioids and stimulants among high school graduates, non-graduates, and their college-attending peers, and found that young adults who do not attend college are at particularly high risk for nonmedical prescription opioid use and disorder. In contrast, the nonmedical use of prescription stimulants is higher among college-educated young adults. Results of the study are published online in the journal Social Psychiatry and Psychiatric Epidemiology.
Non-medical ...