Genome Research publishes online and in print today a special issue dedicated to The ENCODE (ENCyclopedia Of DNA Elements) Project, whose goal is to characterize all functional elements in the human genome. Since the completion of the pilot phase of the project in 2007, covering 1% of the genome, The ENCODE Consortium has fanned out across the genome to study function and regulation on an unprecedented scale. This special issue presents novel findings, methodologies, and resources from ENCODE that bring extensive insight to gene regulation and set the stage for future discoveries. The issue also contains commentary and perspectives on how our views of the genome have changed as a result of The ENCODE Project. The entire issue will be freely available online on September 6 to coordinate with additional ENCODE Consortium publications in Nature, Genome Biology, and other journals.
1. GENCODE presents the most detailed annotation of the genome yet
Since the completion of the pilot phase of The ENCODE Project in 2007, it has been evident that there is much more to a gene than just a sequence that codes for protein, changing our concept of what defines a gene. We now know that the genome is not a set of discrete genes, but rather a complex system of genes and regulatory regions, much of which is transcribed into RNA, including many RNAs that do not code for protein but have critical cellular functions.
When The ENCODE Project was launched, a subgroup of the project called The GENCODE Consortium was established to accurately map and annotate these complex features across the human genome, by both manual curation and computational methods. In this special issue, Harrow and colleagues of The GENCODE Consortium present the latest release of GENCODE gene data, describing a wealth of new information that exceeds the depth of annotation of other community resources.
Also in this issue are detailed reports of experimental validations to complement the GENCODE gene data and novel strategies for further annotating the genome. Howald and colleagues developed the RT-PCR-seq method to show that a substantial portion of exons, the protein-coding regions of genes retained by splicing, are not well annotated by unbiased RNA-sequencing alone, requiring a more targeted strategy in combination.
GENCODE has mapped more than 9,500 long non-coding RNAs (lncRNAs), but until now, only about 100 have been characterized with cellular function. lncRNAs, which are transcribed in a range of human tissues and play roles in gene regulation, are particularly interesting because they do not seem to be as well conserved evolutionarily as genes that code for proteins. Derrien et al. have analyzed the GENCODE lncRNA annotations, integrating the lncRNA data with other ENCODE transcriptome and epigenome data, presenting the most comprehensive lncRNA annotation to date. The authors show that approximately one-third of lncRNAs have arisen in the primate lineage, suggesting that there may be important lncRNA functions yet to be discovered.
References:
Harrow et al., GENCODE: The reference human genome annotation for The ENCODE Project. Genome Res doi: 10.1101/gr.135350.111
Howald et al., Combining RT-PCR-seq and RNA-seq to catalog all genic elements encoded in the human genome. Genome Res doi: 10.1101/gr.134478.111
Derrien et al., The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. Genome Res doi: 10.1101/gr.132159.111
2. ENCODE studies clarify the murky world of RNAs
The ENCODE Project's efforts to annotate the genome include the sequencing of RNA, the message transcribed from DNA to code for proteins and perform other cellular functions. Splicing can produce different forms of an RNA molecule that have varied biological functions, but the mechanism and timing by which splicing occurs across the genome have remained poorly understood. Previous studies have shown that splicing can occur while the RNA is still being transcribed from its template.
Now, analyses by The ENCODE Consortium are shedding light on the scale of co-transcriptional splicing genome-wide. In this issue, Tilgner and colleagues analyzed sequencing data from RNA isolated in different regions of the cell, allowing them to define splicing events at different stages and measure which splicing events are occurring during transcription. They found that most RNAs are being spliced while they are transcribed, and interestingly, for lncRNAs, splicing occurs late, and in some cases, not at all.
In previous studies, researchers have found that another well-known class of small regulatory RNAs, called microRNAs (miRNAs), are in some cases generated by splicing (called mirtrons), in addition to the typical miRNA biogenesis pathway. Recently, hundreds of mirtrons were identified in model organisms, but the prevalence of mirtrons in mammals remained unknown. Utilizing the wealth of small RNA datasets produced by The ENCODE Consortium and specialized analysis tools, a study by Ladewig et al. in this issue identified more than 200 mammalian mirtrons, confirming some that had been previously identified and showing evidence for many more that have not been previously characterized, and revealing new insight into the evolution and biology of miRNAs.
References:
Tilgner et al., Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs. Genome Res doi: 10.1101/gr.134445.111
Ladewig et al., Discovery of hundreds of mirtrons in mouse and human small RNA data. Genome Res doi: 10.1101/gr.133553.111
3. New views of the genome's regulatory landscape
The ENCODE Project continues to illuminate the complex process of gene regulation and chromatin, the combination of DNA and protein that packages DNA in the nucleus. The scale of new data from The ENCODE Project is allowing more accurate characterization than ever of the factors that regulate gene expression. In this issue, Cheng and colleagues have applied a statistical model to the large-scale ENCODE gene expression and transcription factor binding datasets to assess the accuracy of gene expression prediction. Among a number of insights into the predictability of gene expression, their work suggests that gene expression differences in different cell lines are directly reflected in quantitative differences in transcription factor binding levels, challenging the classic "on" or "off" transcription factor binding model.
In addition to studies investigating the myriad transcription factors in the cell, researchers in The ENCODE Consortium are also investigating the function of specific factors genome-wide. Wang et al. present a genome-wide analysis in diverse cell types of the binding pattern of CTCF, a well-known insulator that, when bound, can suppress the effect of regulatory enhancers on their target genes and that plays a role in a number of fundamental genomic processes. The team found that the binding pattern of CTCF is surprisingly plastic yet reproducible, and is significantly different between normal and immortal cells, a finding that could have important implications in cancer.
ENCODE studies are spurring the development of new methods to integrate large genome-wide datasets of different types and to overcome the limitations of current techniques. For example, to investigate the relationship between nucleosome remodeling, histone modifications, and transcription factor binding that governs gene regulation, Kundaje and colleagues have developed a new tool called the Clustered Aggregation Tool (CAGT). The method was applied to datasets of chromatin marks and transcription factor binding to generate an extensive catalog of histone modifications and nucleosome positioning around bound transcription factors. The analysis indicated that both histone modifications and the positions of nucleosomes around transcription factor binding sites are highly heterogeneous, a surprising finding that suggests the features of many regulatory elements are asymmetrical.
References:
Cheng et al., Understanding transcriptional regulation by integrative analysis of transcription factor binding data. Genome Res doi: 10.1101/gr.136838.111
Wang et al., Widespread plasticity in CTCF occupancy linked to DNA methylation. Genome Res doi: 10.1101/gr.136101.111
Kundaje et al., Ubiquitous heterogeneity and asymmetry of the chromatin environment at regulatory elements. Genome Res doi: 10.1101/gr.136366.111
4. Regulatory variation and the genetic basis of disease
The data and analyses of The ENCODE Project will help the research community to understand not only genome function but also disease, with the aim of designing new strategies of treatment and prevention. Much effort in the last decade to understand the genetic basis of disease has been through genome-wide association studies. Many genetic variants found to associate with disease lie in non-coding regions and are relatively common in the population. The resulting challenge in interpreting these data has highlighted the need to understand the influence of genetic variation on the function of genes and regulatory regions.
Two studies in this special ENCODE issue take a step forward in this effort, analyzing the potential functional consequences of individual genetic variants. Vernot and colleagues present the most comprehensive assessment of human regulatory variation yet, analyzing regulatory regions marked by DNase I hypersensitivity, an experimental property that indicates gene activity, together with the whole-genome sequences of 53 people. The authors found that individuals are more likely to have functionally relevant variants in regulatory regions of DNA than in protein-coding regions, and they provide further insights into patterns of regulatory variation at the individual and population levels.
The second study, by Boyle et al., utilized RegulomeDB, a database of ENCODE regulatory data among other sources, to analyze 69 whole-genome sequences and "score" genetic variants to isolate those that may be functionally important. The team identified thousands of potentially functional regulatory variants and estimated that the human genome harbors as much, if not more, variation in regulatory regions than in protein-coding DNA. The authors expect this resource to facilitate the annotation of human genome sequences.
References:
Vernot et al., Personal and population genomics of human regulatory variation. Genome Res doi: 10.1101/gr.134890.111
Boyle et al., Annotation of functional variation in personal genomes using RegulomeDB. Genome Res doi: 10.1101/gr.137323.112
###
Please direct requests for pre-print copies of the manuscripts to Peggy Calicchia, Administrative Assistant, Genome Research (calicchi@cshl.edu; +1-516-422-4012). In addition to the 10 articles highlighted above, the following will also appear in the issue:
Frazer, Decoding the human genome. Genome Res doi: 10.1101/gr.146175.112
Stamatoyannopoulos, What does our genome encode? Genome Res doi: 10.1101/gr.146506.112
Chanock, Towards mapping the biology of the genome. Genome Res doi: 10.1101/gr.144980.112
Park et al., RNA Editing in the human ENCODE RNA-seq data. Genome Res doi: 10.1101/gr.134957.111
Bánfai et al., Long noncoding RNAs are rarely translated in two human cell lines. Genome Res doi: 10.1101/gr.134767.111
Charos et al., A highly integrated and complex PPARGC1A transcription factor binding network in HepG2 cells. Genome Res doi: 10.1101/gr.127761.111
Natarajan et al., Predicting cell-type-specific gene expression from regions of open chromatin. Genome Res doi: 10.1101/gr.135129.111
Arvey et al., Sequence and chromatin determinants of cell-type-specific transcription factor binding. Genome Res doi: 10.1101/gr.127712.111
Schaub et al., Linking disease associations with regulatory information in the human genome. Genome Res doi: 10.1101/gr.136127.111
Wang et al., Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome Res doi: 10.1101/gr.139105.112
Landt et al., ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res doi: 10.1101/gr.136184.111
About Genome Research:
Launched in 1995, Genome Research is an international, continuously published, peer-reviewed journal that focuses on research that provides novel insights into the genome biology of all organisms, including advances in genomic medicine. Among the topics considered by the journal are genome structure and function, comparative genomics, molecular evolution, genome-scale quantitative and population genetics, proteomics, epigenomics, and systems biology. The journal also features exciting gene discoveries and reports of cutting-edge computational biology and high-throughput methodologies.
About Cold Spring Harbor Laboratory Press:
Cold Spring Harbor Laboratory is a private, nonprofit institution in New York that conducts research in cancer and other life sciences and has a variety of educational programs. Its Press, originating in 1933, is the largest of the Laboratory's five education divisions and is a publisher of books, journals, and electronic media for scientists, students, and the general public.
Genome Research issues press releases to highlight significant research studies that are published in the journal.
The ENCODE Project publishes new genomic insights in special issue of Genome Research
2012-09-06