Friday, 20 March 2015

Oxford Nanopore MinION for ctDNA sequencing

A great poster at AGBT was presented by Boreal Genomics and is available on the Nanopore wiki for MAPpers. In "A nanopore liquid biopsy" Patrick Davies describes their combination of the Boreal On-Target with ONT MinION sequencing to detect mutant allele fractions in ctDNA below 0.1%. I spoke briefly to Andre Marziali (Boreal Founder & CSO) about the work and summarise the poster here.

Saturday, 14 March 2015

Fancy working in my lab?

I've currently got three positions open in my lab and thought I'd use this blog as another way to get the message out to prospective candidates. Two people recently moved on to new jobs, one at Inivata (the first spin-out from CRUK-CI) and one at AbCam, and another person was recently promoted. We're also busy, so we're recruiting for a six-month temporary contract to help out with the sequencing services.

If you want to see what the lab does please take a look at our lab website; you may also have seen us on Twitter.

The posts:

The posts will all be involved in providing Next-Generation Sequencing and library preparation services, including nucleic acid and library quant (with KAPA); setting up, monitoring and troubleshooting Illumina HiSeq, NextSeq and MiSeq sequencers; and library prep using a diversity of methods, such as Exome-seq, ChIP-seq and RNA-seq - we do a lot of RNA-seq and exomes. The senior post will be responsible for the day-to-day operational management of the NGS service, and will work alongside their counterpart running the library prep services.

The Genomics core has been operational for 8 years and we've focused on NGS for 7 of those; it is an experienced lab doing exciting work with a diverse set of users from across Cambridge, although our primary focus is on cancer research methods for scientists at the CRUK-funded Cambridge Institute.

Please follow the links for details on applications rather than contacting me directly.



PS: Closing date for all posts is 27 March 2015.

Friday, 13 March 2015

A better way to sequence exomes?

I caught up with a new company on the target capture scene, Directed Genomics, at AGBT. Their approach is based on a simple idea: if you want to sequence exomes, why not capture only exons?

Most exome-seq methods (Illumina, Agilent, Nimblegen) use oligo-baits to pull down adapter-ligated fragment libraries, with fragments of 200-300bp. As exons are only around 170bp long (80–85% of human exons are less than 200bp; Zhu et al & Sakharkar et al) we sequence lots of near- or off-target bases. These can be used (cnvOffSeq for instance), but are to some degree wasted sequencing.
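To put a rough number on that waste, here is a minimal Python sketch. It assumes the whole exon sits inside a single captured fragment and that the fragment is sequenced end to end; both are my simplifications, not claims from the cited papers, and the function name is made up for illustration.

```python
def max_on_target_fraction(exon_bp, fragment_bp):
    """Best-case fraction of a captured fragment that is exonic.

    Assumes the exon lies entirely within the fragment and that the
    whole fragment is sequenced (a simplification for illustration).
    """
    if exon_bp <= 0 or fragment_bp <= 0:
        raise ValueError("lengths must be positive")
    return min(exon_bp, fragment_bp) / fragment_bp


if __name__ == "__main__":
    # 170bp exon in the 200-300bp fragments used by standard exome kits
    for fragment_bp in (200, 250, 300):
        fraction = max_on_target_fraction(170, fragment_bp)
        print(f"{fragment_bp}bp fragment: at most {fraction:.0%} exonic")
```

Even in this best case a 170bp exon in a 250bp fragment is only 68% on-target, and shorter exons fare worse.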

The Directed Genomics approach: like other exome capture companies, Directed Genomics uses probe hybridisation to targeted regions and/or exons, but applies it in a very different manner to the one we’re used to with standard exome capture. Two methods are presented in their recent posters; the first uses two probes, one at each end of the exon; the second uses a single probe hyb and a random 5’ end to create molecularly identifiable libraries. Current plans appear to be for custom panels, but hopefully they'll build out to a whole exome panel over time.

Directed Genomics workflows

1: In their dual-probe method a short 50bp biotinylated oligo probe is hybridised to fragmented gDNA at the 3’ end of an exon, the sequence upstream of this is then enzymatically digested and the 3’ hairpin adapter ligated. Next a second 50bp probe is hybridised to the 5’ end of the exon, the 5’ end is blunted and a 5’ adapter is ligated. Rather cleverly the hairpin adapter ligated at the 3’ end of the target links the target to the probe, allowing for a heat step in the second probe hybridisation without losing the target. Finally the 3’ hairpin is cleaved, releasing products for PCR amplification and sequencing that contain only targeted exonic sequences. On-target rates of 97% were reported in their AGBT poster.

2: In their single-probe method a short 50bp probe is hybridised to fragmented gDNA at the 3’ end of an exon, the sequence upstream of this is then enzymatically digested and the 3’ adapter ligated. The probe is then extended to create the complementary strand and a 5’ adapter is ligated to the blunt end. This creates a library with random 5’ ends enabling a duplicate filtering step, unlike PCR approaches.

The protocols are both same-day, taking 6-8 hours with around 1.5 hours hands-on time (according to the posters). Both allow some or all of the off-target sequence to be removed, reducing the amount of sequencing wasted. However the variation in exon length means that some sequence is inevitably lost.

Molecular IDs in cell free DNA: Their single-probe method creates libraries with in-built molecular ID. The random nature of the 5’ end should allow removal of all PCR duplication, without affecting biological duplication too much. Adding a molecular identifier to the 3’ probe would increase this even further, and also bring molecular ID to the dual-probe method.
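As a rough illustration of how a random 5’ end can act as a molecular ID, here is a minimal Python sketch. It is my own toy example and not Directed Genomics' pipeline: reads are hypothetical (probe_id, five_prime_pos, allele) tuples, reads sharing a probe and 5’ position are treated as PCR copies of one molecule, and each family is collapsed to a majority-vote allele.

```python
from collections import Counter, defaultdict


def collapse_by_five_prime_end(reads):
    """Group reads into families and return one consensus allele per family.

    reads: iterable of (probe_id, five_prime_pos, allele) tuples, where
    allele is the base seen at the site of interest (hypothetical layout).
    Reads sharing (probe_id, five_prime_pos) are assumed to be PCR
    duplicates of the same original molecule.
    """
    families = defaultdict(list)
    for probe_id, five_prime_pos, allele in reads:
        families[(probe_id, five_prime_pos)].append(allele)
    # Majority vote within each family suppresses PCR/sequencing errors
    return {key: Counter(alleles).most_common(1)[0][0]
            for key, alleles in families.items()}


def allele_fraction(alleles, mutant):
    """Fraction of observations carrying the mutant allele."""
    alleles = list(alleles)
    return sum(a == mutant for a in alleles) / len(alleles)


if __name__ == "__main__":
    reads = [
        ("p1", 100, "A"), ("p1", 100, "A"), ("p1", 100, "T"),  # one molecule, one error
        ("p1", 105, "T"),                                      # a true mutant molecule
        ("p1", 110, "A"),
    ]
    consensus = collapse_by_five_prime_end(reads)
    print("raw fraction:", allele_fraction((r[2] for r in reads), "T"))
    print("deduplicated fraction:", allele_fraction(consensus.values(), "T"))
```

Two independent molecules can share a 5’ end by chance, which is why this approach trims "biological duplication" a little as well as PCR duplication.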

These molecular ID’s are likely to become increasingly important in methods to call low-frequency mutations in cell-free DNA applications, particularly ctDNA. Current methods make use of deep-sequencing to call mutations just below 1% MAF (mutant allele freq). However simply sequencing deeper may not be enough to get under 0.1%. A MAF of 0.1% would require sequencing to >10,000x to have enough mutant allele reads; and PCR, clustering and sequencing errors all make the detection harder.
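The depth argument is easy to check with a few lines of Python. This is a minimal sketch under a simple binomial sampling model that ignores PCR, clustering and sequencing errors entirely; the threshold of 5 supporting reads is an arbitrary assumption of mine, not a published calling rule.

```python
import math


def expected_mutant_reads(depth, maf):
    """Mean number of mutant reads at a site covered `depth` times."""
    return depth * maf


def prob_at_least(depth, maf, min_reads):
    """P(at least min_reads mutant reads), binomial sampling, no errors."""
    p_fewer = sum(
        math.comb(depth, i) * maf**i * (1 - maf) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    maf = 0.001          # 0.1% mutant allele frequency
    min_reads = 5        # assumed minimum supporting reads (illustrative)
    for depth in (1_000, 10_000, 50_000):
        mean = expected_mutant_reads(depth, maf)
        p = prob_at_least(depth, maf, min_reads)
        print(f"{depth}x: ~{mean:.0f} mutant reads expected, "
              f"P(>= {min_reads}) = {p:.3f}")
```

At 10,000x a 0.1% allele gives only around 10 mutant reads on average, so even before errors are considered there is little room to spare.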

Adding a molecular identifier should allow us to develop better statistical methods to call lower and lower MAF. Ultimately we aim to get to a point where we are restricted more by the presence of mutant alleles in a sample than by the technology used to capture and sequence them.

Directed Genomics and cell free DNA: The AGBT poster contained results from the Horizon Diagnostics Multiplex Reference Standard. Correlations of observed vs expected allele frequencies were >0.91. This is one of the first methods that can target mutant alleles with a single oligo, as compared to the two used for PCR amplicon sequencing, e.g. TAM-seq. It should mean an increase in sensitivity as more ctDNA molecules can be captured and amplified.

Directed Genomics expects to be launching later in 2015.

Thursday, 12 March 2015

Combining high-throughput CRISPR with in silico cancer drug development

In my last post I wrote about a computational screen of TCGA data and its use in repurposing approved drugs and/or finding new drug candidates for cancer patients. The work demonstrated the possibilities for finding novel treatments, but I also pointed to a cautionary Vemurafenib study that showed poor performance when repurposing the drug in colorectal cancer. As it becomes easier to identify novel therapies in a high-throughput manner, we need to develop methods to test these that are equally high-throughput. CRISPR knock-out or mutation of cancer drivers in multiple cancer cell lines or in tumour xenografts is one possibility - but most groups have carried out only a handful of knock-out or genome editing experiments.

Tuesday, 10 March 2015

In silico prescription of cancer drugs is likely to benefit patients - and can only get better

A fantastic paper just out in Cancer Cell: In Silico Prescription of Anticancer Drugs to Cohorts of 28 Tumor Types Reveals Targeting Opportunities from Nuria Lopez-Bigas's BioMedical Genomics lab in Barcelona.

They have developed an in silico prescription strategy by identifying the driver events in TCGA data, collating data on therapeutic drugs that target driver genes, and connecting patients with driver mutations to potential therapies (see figure 1 from their paper below).
  • 40% (1635) of patients benefit from in silico prescription and repurposing of FDA-approved drugs
  • 33.1% (1346) of additional patients benefit from in silico prescription and repurposing of drugs currently in clinical trials
  • 39% of patients could benefit from novel combination therapies
Figure 1 from Rubio-Perez et al (Cancer Cell 2015)
To identify the driver events they took data from 28 cancers studied as part of TCGA and analysed all somatic SNVs, InDels, CNVs, fusions and RNA-seq differential gene expression. They found well over 400 genes that drive tumorigenesis via mutations, CNAs or gene fusions (data available at IntOGen; Gonzalez-Perez et al., 2013a). Many of these driver events are loss-of-function mutations that could be druggable, are present in many samples, and are not in well-established cancer genes. 25 driver events occur in at least 5% of tumours of at least one cancer type.
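The three-step strategy (find drivers, collate drugs that target them, connect patients to drugs) can be sketched in a few lines of Python. This is only a toy illustration and not the authors' pipeline: the patient IDs, GENE_X, GENE_Y and drug_x are placeholders I made up, and the only real pairing used is BRAF with Vemurafenib.

```python
def prescribe(patient_drivers, drug_targets):
    """Map each patient to drugs that target any of their driver genes.

    patient_drivers: dict of patient ID -> set of driver gene symbols
    drug_targets:    dict of gene symbol -> list of drugs targeting it
    Both tables are hypothetical inputs for illustration.
    """
    prescriptions = {}
    for patient, drivers in patient_drivers.items():
        drugs = {drug for gene in drivers for drug in drug_targets.get(gene, ())}
        prescriptions[patient] = sorted(drugs)
    return prescriptions


def fraction_with_option(prescriptions):
    """Fraction of patients with at least one candidate drug."""
    if not prescriptions:
        return 0.0
    return sum(bool(drugs) for drugs in prescriptions.values()) / len(prescriptions)


if __name__ == "__main__":
    # Toy tables: only BRAF -> vemurafenib is a real pairing
    drug_targets = {"BRAF": ["vemurafenib"], "GENE_X": ["drug_x"]}
    patient_drivers = {
        "P1": {"BRAF"},
        "P2": {"GENE_Y"},            # driver with no known drug
        "P3": {"BRAF", "GENE_X"},
    }
    result = prescribe(patient_drivers, drug_targets)
    print(result)
    print(f"{fraction_with_option(result):.0%} of patients have an option")
```

A lookup like this deliberately ignores tumour type and pathway context, so it can only ever produce candidates for further evaluation.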

Understanding cancer biology is vital: Whilst exciting, the results presented need to be taken with a pinch of salt, and one worries about the headlines journalists might be using in non-scientific media!

ICGC/TCGA and other NGS-based cancer projects have discovered new insights into cancer biology. However the results need very careful evaluation (and clinical trials) before their impact can be stated, and, at least in the case of Vemurafenib, targeted therapy can fail when applied to a different cancer. Vemurafenib increases survival in up to 80% of melanoma patients with the BRAF V600E mutation (although many patients develop resistance). But when prescribed to BRAF V600E positive colorectal cancer patients only 5% responded (see Prahallad et al in Nature 2012). This was reported as being due to activation of EGFR, by inhibition of BRAF V600E, driving continued cell proliferation. EGFR is expressed at low levels in melanoma so the feedback activation is not significant. Examples like this, and we can expect more to be reported, demonstrate the need to understand cancer biology with respect to targeted therapeutics.

The future for this kind of analysis looks bright: ICGC/TCGA data sets are getting larger and richer, analysis algorithms continue to improve, data on basket trials is starting to be reported, and companies like Foundation Medicine are developing tests to report this kind of result. Hopefully this kind of analysis will be routine in just a few years' time.

Friday, 6 March 2015

10X Genomics: what's the fuss over phasing

At AGBT 2015 the big splash was clearly 10X Genomics and their new technology the GemCode "toaster"; presumably so called because of its diminutive size, and not because your microtitre plate is launched out the top nice and warm! The system is available to order now costing $75K, with a $500 per sample price. Using an input of just 1ng means users can test this even with precious clinical samples. Hopefully the improved structural variant detection 10X are promising will have a significant impact on cancer research, perhaps making translocation discovery easier.

Saturday, 28 February 2015

AGBT 2015 review...sort of

Another fun year at AGBT in sunny Marco Island. There has been some good blog coverage this year, but not from me; I've been Tweeting instead @CIgenomics on #AGBT15, see what happened on Twitter.

The UCSC Twitter track
Blog coverage:
GenomeWeb were there for those who could not make it - thanks team GenomeWeb, see here, and here
OmicsOmics: Keith had several posts (always worth reading)
Genohub: day1, day2, day3 
NextGenSeq: has a preview post and covers Craig Venter's and Gene Meyer's talks.
DeciBio: had a piece on the rumours of AGBT15
Pacific Biosciences: day1, day2, their workshop
Illumina: covered their NeoPrep launch, but not their customers' presentations
Ion Torrent: had several posts by Dale Yazuki (scooterdog), e.g. here.

I had a great time; some of my favourite talks were:
  • Yaniv Erlich, New York Genome/Columbia University "Dissecting the Genetic Architecture of Longevity Using Massive-Scale Crowd-Sourced Genealogy"
  • Maryke Appel, KAPA Biosystems "The Evolution of Library Prep: Selecting for Both Quality and Speed"
  • Roman Yelensky, Foundation Medicine "A Novel Next Generation Sequencing (NGS)-Based Companion Diagnostic Predicts Response to the PARP Inhibitor Rucaparib in Ovarian Cancer"
  • Michael Fischbach, University of California, San Francisco "Insights From a Global View of Secondary Metabolism: Small Molecules From the Human Microbiota"
  • David Epstein, ProPublica "Dangerous Dichotomies: How Sports Harm Genetic Research"
  • Michael Evan Macosko, Harvard Medical School "DropSeq: A Droplet-Based Technology for Single-Cell mRNA-Seq Analysis on a Massive Scale"
  • Christopher Mason, Weill Cornell Medical College "City-Scale DNA Dynamics, Disease Surveillance, and Metagenomics Profiling"
  • Iain Macaulay, Wellcome Trust Sanger Institute "G&T-seq: Separation and Parallel Sequencing of the Genomes and Transcriptomes of Single Cells"
  • Tatiana Moroz, University of Florida "Space Genomics: Epigenomic Mechanisms for Adaptations to Microgravity"
 10X genomics: And 10X genomics of course...


Wednesday, 25 February 2015

AGBT 2015

The meeting kicks off today with Illumina's user meeting and I'm going to give "Tweeting the meeting" a go this year. Keep an eye on @CIgenomics and #AGBT15

Thursday, 12 February 2015

What's coming up at AGBT 2015

We'll soon be back in sunny Florida (current forecast is low to mid 20s) for another cram-packed Advances in Genome Biology and Technology meeting. Of course the focus is still on sequencing, but with instruments like the CyTOF coming along and huge improvements in total proteome analysis, how long the genome will stay top of the heap is not clear.

Again the agenda is very full, and full of interesting stuff too! The standout presentation title has to be Saturday's talk at 11:40–12:00 by Tatiana Moroz (University of Florida): Space Genomics: Epigenomic Mechanisms for Adaptations to Microgravity; other highlights for me are:

Thursday, February 26th:
7:30 p.m. – 7:50 p.m. Hie Lim Kim, Nanyang Technological University "Khoisan Hunter-Gatherers Have Been the Largest Population Throughout Most of Modern Human Demographic History"
7:50 p.m. – 8:10 p.m. Karyn Meltz Steinberg, The Genome Institute at Washington University "FinMetSeq: Exome Sequencing of 20,000 Finns Identifies Known and Novel Associations With Cardiometabolic Traits"
8:50 p.m. – 9:10 p.m. Richard Leggett, The Genome Analysis Centre "Towards Real-Time Surveillance Approaches Using Nanopore Sequencers"
9:10 p.m. – 9:30 p.m. Nicholas Navin, MD Anderson Cancer Center "Single Cell Sequencing Identifies Clonal Stasis and Punctuated Copy Number Evolution in Triple-Negative Breast Cancer Patients"

Friday, February 27th:
11:00 a.m. – 11:20 a.m. Stephen Kingsmore, Children’s Mercy Kansas City "Newborn Sequencing in Genomic Medicine and Public Health: Rapid Genome Sequencing for Genetic Disease Diagnosis in Neonatal Intensive Care Units"
8:30 p.m. – 8:50 p.m. Lia Chappell, Wellcome Trust Sanger Institute "Revealing Malaria Parasite Transcriptomes Using Directional, Amplification-Free RNA-seq"
8:50 p.m. – 9:10 p.m. Roman Yelensky, Foundation Medicine "A Novel Next Generation Sequencing (NGS)-Based Companion Diagnostic Predicts Response to the PARP Inhibitor Rucaparib in Ovarian Cancer"

See you there.

Tuesday, 10 February 2015

What can you read if you're new to NGS?

A recent BioTechniques practical guide, Library construction for next-generation sequencing: Overviews and challenges, written by Steven Head et al from The Scripps Research Institute NGS and Microarray Core Facility, might be just the thing to hand out to your NGS newbies as it covers pretty much every aspect of library prep they need to know!

The focus is very much on the options and challenges users face when making decisions about how to make NGS libraries. The article has sections on fragmentation; DNA library prep; RNA-seq library prep; complexity, bias and batch effects; target capture/amplification; mate-pair library prep; ChIP-seq library prep; RIP-seq/CLIP-seq and finally Methylation sequencing.

DNA fragmentation: This section covers all the major options, physical or enzymatic and discusses the importance of making sure the insert size is correct for your application. An interesting comment was that the lab has successfully clustered and sequenced libraries with 1500bp inserts!

Complexity, bias, and batch effects: The section discusses the importance of understanding the biases in the methods being used and of good experimental design. The use of duplicate reads to measure library complexity covers the important points of read-depth and sampling error. The authors present the basic methods as applied to genomes, where nucleic acids are present in roughly equimolar ratios, and discuss the caveats of applying the same methods to RNA-seq or ChIP-seq, where they most certainly are not. There is only a brief mention of molecular indexing and the potential impact this has on NGS analysis. The first take home message is to minimise batch effects and PCR – but there is no discussion about the quantification of your final libraries being a good place to make decisions on reducing PCR cycles. The second is that users should “keep in mind the general principle that more starting material means less amplification and thus [usually] better library complexity”; just because a kit can work with 50ng of RNA does not mean you should use 50ng when you can easily get 500ng!
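For anyone wondering how duplicate reads translate into a complexity number, here is a minimal Python sketch. It uses one common model, the Lander-Waterman style relationship C/X = 1 - exp(-N/X) (N reads sequenced, C unique reads observed, X distinct molecules in the library), and solves for X by bisection. This model is my choice for illustration, the review does not prescribe a formula, and it assumes every molecule is equally likely to be sampled, which is exactly the assumption that breaks down for RNA-seq and ChIP-seq.

```python
import math


def estimate_library_size(total_reads, unique_reads):
    """Estimate distinct molecules X from C/X = 1 - exp(-N/X).

    total_reads (N):  reads (or read pairs) sequenced
    unique_reads (C): reads left after duplicate removal
    Assumes uniform sampling of molecules (an idealisation).
    """
    if not 0 < unique_reads < total_reads:
        raise ValueError("need 0 < unique_reads < total_reads "
                         "(at least some duplicates must be observed)")

    def f(x):
        # Positive while x is below the true library size, negative above
        return unique_reads / x - 1.0 + math.exp(-total_reads / x)

    lo = float(unique_reads)      # the library holds at least C molecules
    hi = lo * 2.0
    while f(hi) > 0:              # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):          # bisection
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    n, c = 500_000, 400_000       # illustrative numbers: 20% duplicates
    print(f"estimated library size: {estimate_library_size(n, c):,.0f}")
```

The same duplicate rate at a deeper sequencing depth implies a larger library, which is the read-depth and sampling point the review makes.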

Target capture/amplification: This section covers the major methods for in-solution target capture. But it is a bit light on amplicon sequencing methods; there are many companies out there selling amplicon kits and I’d suggest people look at the Fluidigm Access Array and Wafergen SmartChip for amplicons on Illumina, or even combine Ampliseq with Nextera XT.

Single-cells: The review only briefly mentions single cells and the Fluidigm C1 system. I suspect we will not have long to wait before we start to see similar papers focused on single-cell NGS.

“When in doubt, consulting a statistician during the experimental design process can save an enormous amount of wasted money and time.”