CoreGenomics: February 2015

Saturday, 28 February 2015

AGBT 2015 review...sort of

Another fun year at AGBT in sunny Marco Island. There has been some good blog coverage this year, but not from me; I've been Tweeting instead @CIgenomics on #AGBT15, see what happened on Twitter.

The UCSC Twitter track

Blog coverage:
GenomeWeb were there for those who could not make it - thanks team GenomeWeb, see here and here.
OmicsOmics: Keith had several posts (always worth reading)
Genohub: day1, day2, day3
NextGenSeq: has a preview post and covers Craig Venter's and Gene Meyer's talks.
DeciBio: had a piece on the rumours of AGBT15
Pacific Biosciences: day1, day2, their workshop
Illumina: covered their NeoPrep launch, but not their customers' presentations
Ion Torrent: had several posts by Dale Yazuki (scooterdog), e.g. here.

I had a great time, and some of my favourite talks were:

Yaniv Erlich, New York Genome/Columbia University "Dissecting the Genetic Architecture of Longevity Using Massive-Scale Crowd-Sourced Genealogy"
Maryke Appel, KAPA Biosystems "The Evolution of Library Prep: Selecting for Both Quality and Speed"
Roman Yelensky, Foundation Medicine "A Novel Next Generation Sequencing (NGS)-Based Companion Diagnostic Predicts Response to the PARP Inhibitor Rucaparib in Ovarian Cancer"
Michael Fischbach, University of California, San Francisco "Insights From a Global View of Secondary Metabolism: Small Molecules From the Human Microbiota"
David Epstein, ProPublica "Dangerous Dichotomies: How Sports Harm Genetic Research"
Michael Evan Macosko, Harvard Medical School "DropSeq: A Droplet-Based Technology for Single-Cell mRNA-Seq Analysis on a Massive Scale"
Christopher Mason, Weill Cornell Medical College "City-Scale DNA Dynamics, Disease Surveillance, and Metagenomics Profiling"
Iain Macaulay, Wellcome Trust Sanger Institute "G&T-seq: Separation and Parallel Sequencing of the Genomes and Transcriptomes of Single Cells"
Tatiana Moroz, University of Florida "Space Genomics: Epigenomic Mechanisms for Adaptations to Microgravity"

10X genomics: And 10X genomics of course...

Changing the Definition of Sequencing from 10X Genomics on Vimeo.


Wednesday, 25 February 2015

AGBT 2015

The meeting kicks off today with Illumina's user meeting and I'm going to give "Tweeting the meeting" a go this year. Keep an eye on @CIgenomics and #AGBT15
James.

Thursday, 12 February 2015

What's coming up at AGBT 2015

We'll soon be back in sunny Florida (current forecast is low to mid 20s) for another cram-packed Advances in Genome Biology and Technology meeting. Of course the focus is still on sequencing, but with instruments like the CyTOF coming along and huge improvements in total proteome analysis, how long the genome will stay top of the heap is not clear.

Again the agenda is very full, and full of interesting stuff too! The standout presentation title has to be Saturday's talk at 11:40–12:00 by Tatiana Moroz (University of Florida): Space Genomics: Epigenomic Mechanisms for Adaptations to Microgravity. Other highlights for me are:

Thursday, February 26th:

7:30 p.m. – 7:50 p.m. Hie Lim Kim, Nanyang Technological University: Khoisan Hunter-Gatherers Have Been the Largest Population Throughout Most of Modern Human Demographic History

7:50 p.m. – 8:10 p.m. Karyn Meltz Steinberg, The Genome Institute at Washington University: FinMetSeq: Exome Sequencing of 20,000 Finns Identifies Known and Novel Associations With Cardiometabolic Traits

8:50 p.m. – 9:10 p.m. Richard Leggett, The Genome Analysis Centre: Towards Real-Time Surveillance Approaches Using Nanopore Sequencers

9:10 p.m. – 9:30 p.m. Nicholas Navin, MD Anderson Cancer Center: Single Cell Sequencing Identifies Clonal Stasis and Punctuated Copy Number Evolution in Triple-Negative Breast Cancer Patients

Friday, February 27th:
11:00 a.m. – 11:20 a.m. Stephen Kingsmore, Children’s Mercy Kansas City: Newborn Sequencing in Genomic Medicine and Public Health: Rapid Genome Sequencing for Genetic Disease Diagnosis in Neonatal Intensive Care Units

8:30 p.m. – 8:50 p.m. Lia Chappell, Wellcome Trust Sanger Institute: Revealing Malaria Parasite Transcriptomes Using Directional, Amplification-Free RNA-seq

8:50 p.m. – 9:10 p.m. Roman Yelensky, Foundation Medicine: A Novel Next Generation Sequencing (NGS)-Based Companion Diagnostic Predicts Response to the PARP Inhibitor Rucaparib in Ovarian Cancer

See you there.

Tuesday, 10 February 2015

What can you read if you're new to NGS?

A recent BioTechniques practical guide, Library construction for next-generation sequencing: Overviews and challenges, written by Steven Head et al. from The Scripps Research Institute NGS and Microarray Core Facility, might be just the thing to hand out to your NGS newbies as it covers pretty much every aspect of library prep they need to know!

The focus is very much on the options and challenges users face when deciding how to make NGS libraries. The article has sections on fragmentation; DNA library prep; RNA-seq library prep; complexity, bias and batch effects; target capture/amplification; mate-pair library prep; ChIP-seq library prep; RIP-seq/CLIP-seq; and finally methylation sequencing.

DNA fragmentation: This section covers all the major options, physical or enzymatic, and discusses the importance of making sure the insert size is correct for your application. An interesting comment was that the lab has successfully clustered and sequenced libraries with 1500bp inserts!

Complexity, bias, and batch effects: The section discusses the importance of understanding the biases in the methods being used and of good experimental design. The discussion of using duplicate reads to measure library complexity covers the important points of read-depth and sampling error. The authors present the basic methods as applied to genomes, where nucleic acids are present in roughly equimolar ratios, and discuss the caveats of applying the same methods to RNA-seq or ChIP-seq, where they most certainly are not. There is only a brief mention of molecular indexing and the potential impact this has on NGS analysis. The first take home message is to minimise batch effects and PCR – but there is no discussion about the quantification of your final libraries being a good place to make decisions on reducing PCR cycles. The second is that users should “keep in mind the general principle that more starting material means less amplification and thus [usually] better library complexity”. Just because a kit can work with 50ng of RNA does not mean you should use 50ng when you can easily get 500ng!

Target capture/amplification: This section covers the major methods for in-solution target capture, but it is a bit light on amplicon sequencing methods. There are many companies out there selling amplicon kits and I’d suggest people look at the Fluidigm Access Array and Wafergen SmartChip for amplicons on Illumina, or even combine Ampliseq with Nextera XT.

Single-cells: The review only briefly mentions single cells and the Fluidigm C1 system. I suspect we will not have long to wait before we start to see similar papers focused on single-cell NGS.

“When in doubt, consulting a statistician during the experimental design process can save an enormous amount of wasted money and time.”

Friday, 6 February 2015

Single cell sequencing by Affy!

Cellular Research released their latest development: Resolve, for single cell mRNA-seq sample-prep at under £1 per cell. A paper in today's Science describes the method: Combinatorial labeling of single cells for gene expression cytometry. Using CytoSeq (why not Cyto-seq?) 10,000 or 100,000 cells can be analysed. Like the C1, cells need to be flow sorted, but unlike the C1, CytoSeq does not apply as stringent a restriction on cell size or morphology - if you can sort it, CytoSeq can sequence it. The paper presents data from several hematopoietic systems, but solid tissue, e.g. tumours, should be analysable if it can be mechanically or enzymatically disaggregated.

Who Are Cellular Research: The company was set up by Steve Fodor (hence the Affy link in the title of this post) in 2011, at the same time a PNAS paper first described the molecular indexing approach. I was first alerted to Cellular Research by a contact who'd moved from Fluidigm in 2013. The publication in 2014 of a PNAS paper by Glenn Fu on molecular indexing in RNA-seq showed what Cellular Research might be delivering, and in the past few months we've begun testing the Precise assay for targeted RNA-seq. The workflow in the lab is great and expected costs are just £10 per sample for up to 130 genes.

Precise assay workflow

How does Cyto-seq work: Cells in suspension are loaded into 20 picolitre wells, such that most wells are empty but those that do contain cells only have one. Oligonucleotide coated beads deliver the molecular index for each cell, and the molecular indexes for the mRNAs at the same time. mRNAs bind to the oligos ready for 1st strand cDNA synthesis; and similarly to the Precise protocol all 10,000 cells are pooled for downstream processing as a single reaction through reverse transcription, cDNA amplification and finally sequencing. Figure 1 from the Science paper describes the basic approach.
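The "most wells are empty" loading described above can be illustrated with a small calculation. This is only a sketch: it assumes cells settle into wells independently, so that the number of cells per well follows a Poisson distribution, and the loading rate of 0.1 cells per well is a hypothetical value chosen for illustration, not a figure from the paper.

```python
import math


def well_occupancy(mean_cells_per_well):
    """Return well occupancy fractions under an assumed Poisson loading model.

    mean_cells_per_well is a hypothetical loading rate, not a published value.
    """
    p_empty = math.exp(-mean_cells_per_well)       # wells with no cell
    p_single = mean_cells_per_well * p_empty       # wells with exactly one cell
    p_multiple = 1.0 - p_empty - p_single          # wells with two or more cells
    p_occupied = 1.0 - p_empty
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multiple,
        # share of cell-containing wells that hold more than one cell
        "multiplet_share_of_occupied": p_multiple / p_occupied,
    }


occupancy = well_occupancy(0.1)
for name, value in occupancy.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, lowering the loading rate reduces the share of occupied wells holding more than one cell, at the price of leaving more wells empty.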

The first experiment described in the Science paper was a mixture analysis of K562 (myelogenous leukemia) and Ramos (Burkitt’s lymphoma) cells using a panel of 12 genes: five genes specific for K562 cells, six genes specific for Ramos cells, and the common housekeeping gene GAPDH. Other experiments were reported against panels of 93, 98 and 111 genes. Not quite whole transcriptome, but needing only 1-5M reads per experiment makes CytoSeq the first single-cell transcriptome MiSeq application.

The method again uses the power of molecular indexing to tag multiple cDNAs from single cells and apply unique indexes to both mRNAs and the cells they come from. You'll be able to run CytoSeq in your lab from 2016 when Cellular Research will release an instrument to perform the library-prep workflow in cartridges of 5-10,000 cells per run.
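The counting idea behind molecular indexing can be sketched in a few lines. This is a minimal illustration, not Cellular Research's pipeline: it assumes each read has already been decoded into a (cell label, gene, molecular index) triple, and the example reads and index sequences are made up. Reads sharing all three values are treated as PCR copies of one original mRNA molecule and counted once.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count unique molecular indexes per (cell, gene).

    reads: iterable of (cell_label, gene, molecular_index) tuples.
    Assumes error-free indexes; real data would need error correction.
    """
    indexes_seen = defaultdict(set)
    for cell_label, gene, molecular_index in reads:
        indexes_seen[(cell_label, gene)].add(molecular_index)
    return {key: len(indexes) for key, indexes in indexes_seen.items()}


# Hypothetical decoded reads for illustration only
reads = [
    ("cell_A", "GAPDH", "ACGT"),
    ("cell_A", "GAPDH", "ACGT"),  # PCR duplicate of the read above
    ("cell_A", "GAPDH", "TTAG"),  # a second original molecule
    ("cell_B", "GAPDH", "ACGT"),  # same index, different cell: a separate molecule
]

molecule_counts = count_molecules(reads)
for (cell_label, gene), n_molecules in sorted(molecule_counts.items()):
    print(cell_label, gene, n_molecules)
```

Using a set per (cell, gene) pair keeps the sketch simple; it ignores sequencing errors in the indexes and index collisions, both of which a real analysis would have to handle.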

Figure 1 from Fu et al 2015.

How much might Cyto-seq cost: The combination of a Resolve cartridge for 10,000 cells at £1 each plus a single MiSeq run at £600 comes to a little over £10,000 for a 100 gene panel. It is not clear how scalable the number of genes is and whole transcriptome may be a ways off yet. But assuming you can do this, and you stick with the 1-3M reads per cell that the major single-cell labs (and Fluidigm) are suggesting then each cell would cost about £3-5 to sequence on HiSeq 2500 today. So a 10,000 cell CytoSeq total mRNA-seq experiment would cost £6000 for library prep and £30,000+ for sequencing (price per M reads here). Not cheap, but the impact on some biological questions will be impressive, and new questions can be asked if we can do this kind of work routinely.
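The arithmetic behind these estimates can be laid out as a short sketch. The inputs are the figures quoted above (£1 per cell for the cartridge, £600 per MiSeq run, £3-5 per cell for HiSeq 2500 sequencing); the function names are just for illustration.

```python
def targeted_panel_cost(cells, prep_cost_per_cell, sequencing_run_cost):
    """Total cost in GBP of a targeted-panel experiment sequenced in a single run."""
    return cells * prep_cost_per_cell + sequencing_run_cost


def sequencing_cost_range(cells, low_cost_per_cell, high_cost_per_cell):
    """Low and high sequencing-only cost in GBP for a whole-transcriptome experiment."""
    return cells * low_cost_per_cell, cells * high_cost_per_cell


cells = 10_000

# 100 gene panel: Resolve cartridge at £1 per cell plus one MiSeq run at £600
panel_total = targeted_panel_cost(cells, 1, 600)

# Total mRNA-seq: £3-5 per cell to sequence on HiSeq 2500
seq_low, seq_high = sequencing_cost_range(cells, 3, 5)

print(f"Targeted panel total: £{panel_total:,}")
print(f"Whole-transcriptome sequencing: £{seq_low:,} to £{seq_high:,}")
```

Only the sequencing part of the whole-transcriptome estimate is computed here, since library prep costs at that scale are quoted separately in the text.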

Clinical applications of CytoSeq: Will the method be applicable to blood cancers as a new screening tool? Will gene expression analysis of disaggregated solid tumours be possible in real time and at a cost that can make an impact on patient care? I am sure people are already working on these kinds of questions.

Why is molecular indexing important: I think molecular indexing is a big leap forward for NGS. Being able to clearly identify single molecules on an Illumina sequencer means the need to develop single-molecule sequencers is significantly lessened for most of us. Molecular indexes should allow us to reduce the impact of technical artifacts in PCR amplification and resolve copy-number amplifications and deletions, mRNA DGE, chromatin binding peaks, and exome allele calls much better than we can today.

Monday, 2 February 2015

Comparing Illumina's sequencers

Updated: The cost per M reads and cost per Gb figures in the original posting were wrong - damned Excel operator error! I've fixed them, again. Thanks to Shawn for his comments.

I've been asked about the differences between the sequencers in the Illumina line-up so many times that I put together a spreadsheet to help the discussions. This is cobbled together from the Illumina website and there are no prices quoted; however, I have estimated the £ per M reads and the £ per Gb.
