(Press-News.org) A new study finds that papers with data shared in public gene expression archives received increased numbers of citations for at least five years. The large size of the study allowed the researchers to exclude confounding factors that have plagued prior studies of the effect and to spot a trend of increasing dataset reuse over time. The findings will be important in persuading scientists that they can benefit directly from publicly sharing their data.
The study, which adds to growing evidence for an open data citation benefit across different scientific fields, is entitled "Data reuse and the open data citation advantage". It was conducted by Dr. Heather Piwowar of Duke University and Dr. Todd Vision of the University of North Carolina at Chapel Hill, and published today in PeerJ, a peer-reviewed open access journal in which all articles are freely available to everyone.
The study examined citations to over ten thousand articles that generated new gene expression data, a quarter of which had data publicly archived in the GEO and ArrayExpress repositories. Papers with publicly available data received about 9% more citations overall, with the difference increasing over time. The researchers concluded that much of this citation difference was due to actual data reuse.
"Professional advancement in science is still highly dependent on how well your paper gets cited, even in a field like genomics where the data underlying that paper may have far more scientific impact over the long term," said Dr. Vision, a biologist affiliated with the National Evolutionary Synthesis Center and the Dryad Digital Repository. "Until the happy day when hiring and promotion committees catch up with how to value data sharing for its own sake, it is comforting to know that scientists can still receive credit for data sharing in a currency that counts."
The researchers also mined the full text of articles for references to dataset identifiers in order to study trends in data reuse directly. They took the unusual step of discussing the obstacles they encountered in the paper. Dr. Piwowar, at the time of the study a postdoc with the DataONE project, said, "We need more open and cohesive infrastructure to support collecting evidence about the process and products of science. This evidence is needed to inform important policy decisions. For example, data archiving requirements, infrastructure, and education should be informed by evidence about how data is and is not reused."
The mined references revealed that scientists generally stopped publishing papers using their own datasets within two years, while other scientists continued to reuse their data for at least six years. They also showed that data reuse is on the rise. "Not only were the number of reuse papers higher," says Dr. Piwowar, "but analyses from 2002 to 2004 were reusing only one or two datasets, while a quarter of the studies by 2010 were using three or more."
INFORMATION:
EMBARGOED until Oct 1st 2013: 7 am EST; 12 midday UK time
Link to the PDF of this Press Release: http://bit.ly/PeerJPiwowar
Link to the Press Preview of the Original Article (this link should only be used BEFORE the embargo ends): http://static.peerj.com/press/previews/2013/10/175.pdf (note: this is an author proof and so may change slightly before publication)
Link to the Published Version of the article (quote this link in your story – the link will ONLY work after the embargo lifts): https://peerj.com/articles/175 - your readers will be able to freely access this article at this URL.
PeerJ encourages authors to publish the full peer reviews, and author rebuttals, for their article. For the purposes of due diligence by the Press, we can provide these materials as a PDF (and they will be published alongside the final article). Please contact us at press@peerj.com to request a copy of the reviews.
Citation to the article: Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage. PeerJ 1:e175 http://dx.doi.org/10.7717/peerj.175
Other Information: The raw data behind this study are publicly available in the Dryad Digital Repository at http://doi.org/10.5061/dryad.781pv. This link will only work after Oct 1st.
Funding: This study was funded by U.S. National Science Foundation grants to the DataONE (OCI-0830944) and Dryad (DBI-0743720) projects, and a Discovery grant to Michael Whitlock from the Natural Sciences and Engineering Research Council of Canada.
About PeerJ
PeerJ is an Open Access publisher of peer reviewed articles, which offers researchers a lifetime membership, for a single low price, giving them the ability to openly publish all future articles for free. The launch of PeerJ occurred on February 12th, 2013. PeerJ is based in San Francisco, CA and London, UK and can be accessed at https://peerj.com/.
All works published in PeerJ are Open Access and published using a Creative Commons license (CC-BY 3.0). Everything is immediately available—to read, download, redistribute, include in databases and otherwise use—without cost to anyone, anywhere, subject only to the condition that the original authors and source are properly attributed.
PeerJ Media Resources (including logos) can be found at: https://peerj.com/about/press/
Media Contacts
For the Authors:
Dr Heather Piwowar
Email: hpiwowar@gmail.com
For PeerJ:
press@peerj.com
https://peerj.com/about/press/
Abstract (from the article)
Background. Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the "citation benefit". Furthermore, little is known about patterns in data reuse over time and across datasets.
Method and Results. Here, we look at citation rates while controlling for many known citation predictors and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties.
Conclusion. After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.
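Illustrative note (not part of the article): the abstract describes finding third-party reuse by looking for GEO or ArrayExpress accession numbers in the full text of papers. A minimal Python sketch of that idea follows. The accession patterns (GSE or GDS followed by digits, and E-XXXX-digits), the function names, and the input layout are assumptions made for illustration; they are not the authors' actual pipeline.

```python
import re
from collections import Counter

# Assumed accession formats (not taken from the article):
#   GEO series / curated datasets, e.g. GSE1234 or GDS567
#   ArrayExpress experiments, e.g. E-MEXP-567
ACCESSION_RE = re.compile(r"\b(?:GSE\d+|GDS\d+|E-[A-Z]{4}-\d+)\b")


def find_accessions(full_text):
    """Return the set of distinct accession-like strings found in a paper's text."""
    return set(ACCESSION_RE.findall(full_text))


def third_party_reuse(papers, creator_papers):
    """Count, per accession, the papers that mention it and are not by its creators.

    papers:         hypothetical mapping of paper id -> full text
    creator_papers: hypothetical mapping of accession -> ids of papers written
                    by the investigators who deposited that dataset
    """
    counts = Counter()
    for paper_id, text in papers.items():
        for accession in find_accessions(text):
            # Skip mentions by the original depositors; those are not third-party reuse.
            if paper_id not in creator_papers.get(accession, set()):
                counts[accession] += 1
    return counts


if __name__ == "__main__":
    papers = {
        "p1": "We deposited our arrays as GSE1234.",
        "p2": "We reanalysed GSE1234 together with E-MEXP-567.",
        "p3": "No archived data were used.",
    }
    creators = {"GSE1234": {"p1"}}
    print(third_party_reuse(papers, creators))
```

A plain pattern match like this would miss datasets that are cited only through the original paper and could pick up strings that merely resemble accession numbers, which is one reason estimates obtained this way are better read as conservative.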
Scientists who share data publicly receive more citations
Long-lived citation benefit and an increase in data reuse over time are seen for gene expression studies
2013-10-01