Cold Spring Harbor Library  CSHL: Home
Latest CSHL Authors' Publications

CSHL Authors' Publications Database provides access to all articles published by Cold Spring Harbor Laboratory scientists (1892 - 2012).
We are in the process of creating a bio page for each CSHL Principal Investigator, including a link to their home page and a video clip of their current research.
Please contact the library for additions or comments.

Enhanced transcriptome maps from multiple mouse tissues reveal evolutionary constraint in gene expression
Abstract: Mice have been a long-standing model for human biology and disease. Here we characterize, by RNA sequencing, the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles in human cell lines reveals substantial conservation of transcriptional programmes, and uncovers a distinct class of genes with levels of expression that have been constrained early in vertebrate evolution. This core set of genes captures a substantial fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with conserved epigenetic marking, as well as with characteristic post-transcriptional regulatory programme, in which sub-cellular localization and alternative splicing play comparatively large roles.
Pervouchine DD, Djebali S, Breschi A, Davis CA, Barja PP, Dobin A, Tanzer A, Lagarde J, Zaleski C, See LH, Fastuca M, Drenkow J, Wang H, Bussotti G, Pei B, Balasubramanian S, Monlong J, Harmanci A, Gerstein M, Beer MA, Notredame C, Guigo R, Gingeras TR

Nat Commun 6 (0): 5903; Jan 13 2015

SeqHBase: a big data toolset for family based sequencing data analysis
Abstract: BACKGROUND: Whole-genome sequencing (WGS) and whole-exome sequencing (WES) technologies are increasingly used to identify disease-contributing mutations in human genomic studies. It can be a significant challenge to process such data, especially when a large family or cohort is sequenced. Our objective was to develop a big data toolset to efficiently manipulate genome-wide variants, functional annotations and coverage, together with conducting family based sequencing data analysis. METHODS: Hadoop is a framework for reliable, scalable, distributed processing of large data sets using MapReduce programming models. Based on Hadoop and HBase, we developed SeqHBase, a big data-based toolset for analysing family based sequencing data to detect de novo, inherited homozygous, or compound heterozygous mutations that may contribute to disease manifestations. SeqHBase takes as input BAM files (for coverage at every site), variant call format (VCF) files (for variant calls) and functional annotations (for variant prioritisation). RESULTS: We applied SeqHBase to a 5-member nuclear family and a 10-member 3-generation family with WGS data, as well as a 4-member nuclear family with WES data. Analysis times were almost linearly scalable with number of data nodes. With 20 data nodes, SeqHBase took about 5 secs to analyse WES familial data and approximately 1 min to analyse WGS familial data. CONCLUSIONS: These results demonstrate SeqHBase's high efficiency and scalability, which is necessary as WGS and WES are rapidly becoming standard methods to study the genetics of familial disorders.
He M, Person TN, Hebbring SJ, Heinzen E, Ye Z, Schrodi SJ, McPherson EW, Lin SM, Peissig PL, Brilliant MH, O'Rawe J, Robison RJ, Lyon GJ, Wang K

J Med Genet (0): Jan 13 2015
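
The SeqHBase abstract above names three inheritance patterns it looks for in family data: de novo, inherited homozygous, and compound heterozygous mutations. As a rough illustration of what those patterns mean for a single child and two parents, the following is a minimal Python sketch. It is not SeqHBase's code or API: the function names, the dictionary layout, and the gene and variant labels are all hypothetical, and genotypes are simplified to alternate-allele counts (0, 1, or 2). It also ignores the per-site coverage check that the abstract says SeqHBase draws from BAM files.

```python
from collections import defaultdict

# Genotypes are alternate-allele counts: 0 = homozygous reference,
# 1 = heterozygous, 2 = homozygous alternate.
# All names below are hypothetical and for illustration only.


def is_de_novo(child, father, mother):
    """Child carries an alternate allele that neither parent carries."""
    return child >= 1 and father == 0 and mother == 0


def is_inherited_homozygous(child, father, mother):
    """Child is homozygous alternate and both parents carry the allele."""
    return child == 2 and father >= 1 and mother >= 1


def compound_het_genes(variants):
    """Return genes with one child-het variant from each parent.

    A gene qualifies when the child is heterozygous for at least one
    variant carried only by the father and at least one carried only
    by the mother.
    """
    paternal = defaultdict(list)
    maternal = defaultdict(list)
    for v in variants:
        if v["child"] != 1:
            continue
        if v["father"] >= 1 and v["mother"] == 0:
            paternal[v["gene"]].append(v["id"])
        elif v["father"] == 0 and v["mother"] >= 1:
            maternal[v["gene"]].append(v["id"])
    return sorted(gene for gene in paternal if gene in maternal)


# Made-up trio genotypes; gene and variant labels are placeholders.
example_variants = [
    {"id": "v1", "gene": "GENE_A", "child": 1, "father": 1, "mother": 0},
    {"id": "v2", "gene": "GENE_A", "child": 1, "father": 0, "mother": 1},
    {"id": "v3", "gene": "GENE_B", "child": 1, "father": 1, "mother": 0},
    {"id": "v4", "gene": "GENE_C", "child": 1, "father": 0, "mother": 0},
    {"id": "v5", "gene": "GENE_D", "child": 2, "father": 1, "mother": 1},
]

if __name__ == "__main__":
    for v in example_variants:
        trio = (v["child"], v["father"], v["mother"])
        if is_de_novo(*trio):
            print(v["id"], "de novo candidate")
        elif is_inherited_homozygous(*trio):
            print(v["id"], "inherited homozygous candidate")
    print("compound het genes:", compound_het_genes(example_variants))
```

In this sketch a site where a parent has no read coverage would look the same as a homozygous-reference site, which is one reason a real pipeline would need coverage information before calling a variant de novo.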
© 2012   CSHL Library and Archives   1 Bungtown Rd.   Cold Spring Harbor, NY 11724   Library Information 516-367-6872