Thursday, 23 May 2013

Illumina Scientific Summit "Cancer Genomics" breakout

This year's Illumina Scientific Summit was a great meeting, with lots of great talks and great people to talk to at dinner and the bar. I already posted about a collaboration Geoff Smith presented, and this post is some rough notes on the "Cancer Genomics" breakout that took place today.

Jen Stone from Illumina introduced the discussion session and talked about the promise of cancer genomics: what does it really mean for patients and doctors? She talked about four areas of focus: germline risk; healthy screening; profiling, diagnosis and treatment; and response, recurrence and MRD. Jen reintroduced everyone to the different datasets for each of these areas: GWAS for germline risk; TCGA & ICGC for discovery around profiling, diagnosis and treatment; circulating tumour DNA and cells for response, recurrence and MRD; and a single publication for healthy screening, "Evaluation of DNA from the Papanicolaou test to detect ovarian and endometrial cancers".

Jen also talked about how Illumina hopes to realise the promise of cancer genomics: by continuing to innovate in sample prep and sequencing, partnering with groups on collaborative research, and developing an environment that is receptive to genomic medicine (Understand Your Genome).

She also presented some experimental challenges: should we use targeted or whole genome sequencing; if targeted then what (exome, amplicon, etc.); should we analyse DNA, RNA or the methylome; CTCs or ctDNA; and so on. Analysis methods need to detect not just SNPs but also indels and everything else; one method almost certainly does not fit all! Other issues like LOH, heterogeneity, low input, stromal contamination and limits of sensitivity all need to be addressed.

So what does the future of cancer genomics look like? Illumina used a mobile app to collect votes from attendees in real time and used the answers as a starting point for discussions. The app worked really well. It is always a bit of a risk doing a live demo and I'm always nervous that it is going to be a trainwreck presentation, but this worked really well. Hopefully the results from the votes will also be available after the session for attendees to reflect on. In case they are not, I've picked some of my favourite questions and the answers from participants.

There was a discussion on whether the term "the $1000 genome" is a force for good or bad. Does it set an unrealistic price point? Can the bioinformatics challenges be met so they remove a large part of today's costs of analysis and interpretation? Is it misinterpreted by the public and the media?

One attendee pointed to a New York Times op-ed comparing the $99 cost of a 23andMe test (possibly $1000) to the $4000 cost of BRCA1/2 testing!

The questions and answers:
Cancer sequencing will be the standard for all patients in the next: about 45% of participants said either 3-5 years or 5-10 years.

In order for sequencing to be part of the standard, what are the key drivers? 90% said all of the above (price, clinical utility, data analysis, workflow) are required, but when asked which is most important 65% chose clinical utility.

All Tumor DNA Sequencing must be analysed in conjunction with a paired normal sample? Almost 50% said yes, 25% said maybe.

Large projects such as ICGC, TCGA and Standup2Cancer are all generating multiple datasets per sample (T/N, RNASeq, Meth450, etc.).
70% of respondents said they thought that only a subset of variants from these huge discovery datasets will be tested clinically, when responding to the statement "I think these studies are setting the stage such that:".

What is an acceptable turn-around time for an optimal cancer dataset in the research space? 
2 weeks to 1 month was the majority choice. This is a tough challenge for core facilities like mine which are often capacity constrained. We need to work closely with our users to explain the limitations and set realistic expectations.

What is an acceptable turn-around time for an optimal cancer dataset in the clinical space?
95% said less than 2 weeks.

How important are long reads for cancer applications? Only 25% said long reads were important. This is a topic close to my heart and I strongly believe long reads will be transformative for RNA-seq, as isoforms can be directly counted. For DNA we should get more robust structural variation analysis. And both will hopefully be completed with fewer reads.

How important is phasing for cancer applications? Most participants said "not very important"; there was no "killer app" reported for cancer genomics.

We ran out of time to finish the questions and answers. All in all it was a very good breakout session. I certainly learned a lot from the discussions.



Wednesday, 22 May 2013

Moleculo data presented at Illumina Scientific Summit

Geoff Smith from Illumina presented a great piece of work his group have been collaborating on with Jill Banfield’s research group at the University of California, Berkeley, Department of Earth and Planetary Sciences.

They have been using NGS to analyse the communities of microbes performing bioremediation in soil samples from a former uranium mill in Rifle, Colorado. These microbes were collected from a core drilled at the site and sampled at different depths. The site used to process uranium, and rain carries dissolved heavy metals into the groundwater and the Colorado River. Microbes internalise the heavy metals, making them insoluble and assayable in the sediment core.

Using NGS the team are investigating the diversity of the microbial community and hope to understand how bioremediation might be improved by modulating this community diversity. The bacterial community is “fed” with acetate under reducing conditions to stimulate uranium absorption.

The collaboration with Illumina involved preparation of Nextera libraries for standard Illumina sequencing to try and understand the metagenome of the community. At the same time they also prepared Moleculo libraries to see if the long reads from this technology would improve their discovery efforts.

Moleculo works by first fragmenting 1μg of genomic DNA to 10kb. Adapters are then ligated that allow a universal clonal lrPCR to take place in a reduced representation of the genome. These lrPCRs are then used as the input to a Nextera library prep, pooled and sequenced. The sequences can then be demultiplexed to the Moleculo barcodes and used in a de novo assembly. Find out more about Moleculo here.
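The demultiplexing step can be pictured with a small sketch. This is only an illustration: the real Moleculo read layout and barcode handling are not described in this post, so the `(barcode, sequence)` tuples and the `group_reads_by_barcode` name are hypothetical stand-ins.

```python
from collections import defaultdict


def group_reads_by_barcode(reads):
    """Group (barcode, sequence) pairs into one list of sequences per barcode.

    Hypothetical input format: the barcode is assumed to have been
    extracted from each read already.
    """
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)
    return dict(groups)


# Toy reads, invented purely for illustration.
reads = [
    ("ACGT", "TTGACC"),
    ("GGCA", "CCATGA"),
    ("ACGT", "GACCTA"),
]
pools = group_reads_by_barcode(reads)
```

Each pool of reads would then be handed to a de novo assembler separately, which is what should make the long fragments recoverable.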

What does Moleculo bring to the table? Standard Nextera libraries showed a reasonably diverse community with many of the species expected. However, the Moleculo data revealed that many species were missed in the Nextera dataset; many of the low abundance species are only found using Moleculo technology. The theory is that normal sequencing leaves lots of reads that align poorly to current genomes and so appear to be insignificant. The Moleculo technology makes sense of the data by allowing de novo assembly of the reads into larger contigs that can be interpreted.

It will be interesting to see if the same increase in species abundance is seen in other metagenomic datasets using Moleculo. It’s not just cancer that benefits from long reads after all!

NGS forensics might require unique barcodes

The use of NGS in forensics is likely to supplant current STR profiling. One issue that has been discussed is the need to robustly determine that the correct sample has been analysed and reported on. With NGS almost all sequencing today is being done in a multiplex fashion using indexing barcode reads of generally around 6-10bp.

Current barcoding uses sets of up to 96 or 384 barcodes, which is fine for many research applications. However, there are reports of carry-over of barcodes from one run to another on the kinds of instruments that are likely to be used in forensic labs. This is likely to make courts nervous about bringing this technology in.

One way to remove some of the problems would be to create significantly longer index oligos that are unique. I can see a market for a consumable kit where each index is truly unique. The indexes included would be single-use only and no-one else in the world would get the same index oligo. Oligos can be made in very small synthesis scales today and in high-throughput. Given a long enough read, even lower quality synthesis should allow confident discrimination of contamination, allowing users to be confident about the final results.
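A minimal sketch of how that discrimination might work, assuming index reads are compared to the expected single-use index by Hamming distance. The function names, the example sequences and the tolerance of 3 mismatches are all hypothetical choices for illustration, not a validated forensic threshold.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


def classify_index(observed, expected, max_mismatches=3):
    """Label an index read as 'match' or 'foreign' relative to the expected index.

    max_mismatches is an assumed tolerance for synthesis and sequencing errors.
    """
    if hamming(observed, expected) <= max_mismatches:
        return "match"
    return "foreign"


# Invented 25bp indexes for illustration only.
expected = "ACGTTGCAAGCTTAGGCTAACGTCA"
noisy = "TCGTTGCAAGCTTAGGCTAACGTCC"    # same index with two errors
foreign = "GGATCCTTAGCAATCGGATTACCGT"  # an unrelated index

print(classify_index(noisy, expected))
print(classify_index(foreign, expected))
```

With long indexes, two unrelated oligos differ at many positions, so a few errors in synthesis or sequencing should not turn one index into another.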

A 25bp barcode read should be as unique as we might ever need and on HiSeq today would not add much time to a sequencing run.
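To put a number on "as unique as we might ever need", here is a quick back-of-the-envelope calculation. It assumes every base is chosen freely from A, C, G and T, so it is an upper bound; a real design would exclude some sequences (homopolymers, low-complexity indexes and so on). The `barcode_space` helper is just an illustrative name.

```python
def barcode_space(length, alphabet_size=4):
    """Number of distinct sequences of a given length over the alphabet."""
    return alphabet_size ** length


# Compare typical index lengths today with a 25bp index.
for n in (6, 10, 25):
    print(f"{n}bp: {barcode_space(n):,} possible barcodes")
```

A 6bp index allows 4,096 sequences and a 10bp index about a million, while 25bp allows roughly 1.1 x 10^15.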

Thursday, 16 May 2013

Using SPRI beads to improve Nextera

We have been running Nextera in my lab for a while now and had some great success, e.g. 600x C. elegans genomes. With the release of Illumina's rapid exome kits Nextera is proving to be a versatile and useful addition to our lab tools. Unfortunately it has at least one significant weakness: the need to accurately quantify dilute DNA.

Nextera requires a precise ratio of DNA to transpososomes. Get this wrong and the libraries may not be great and might be unsequenceable.

The standard prep calls for genomic DNA at 2.5ng/μl and Nextera XT wants just 0.2ng/μl. We find that many users struggle to get this right for their samples. Most see an improvement if they use a good quantification method, and Illumina provides recommendations in the manuals. The use of a double-stranded DNA intercalating dye (we use Qubit) means single-stranded DNA, RNA and other contaminants don't confuse the readings. Nanodrop is a no-no for Nextera!
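Once a sample has a reliable fluorimetric reading, getting to the target concentration is a simple C1V1 = C2V2 dilution. A small sketch, where the 25ng/μl stock reading and the 20μl final volume are made-up example values and `dilution_volumes` is a hypothetical helper:

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock volume, diluent volume) in μl using C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Example: a 25ng/μl stock diluted to 20μl at 2.5ng/μl (standard Nextera)
# and at 0.2ng/μl (Nextera XT).
standard = dilution_volumes(25.0, 2.5, 20.0)
xt = dilution_volumes(25.0, 0.2, 20.0)
print(standard, xt)
```

For very dilute targets like the XT input the stock volume becomes too small to pipette accurately, so in practice a serial dilution is the safer route.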

Using SPRI beads to improve the protocols: I have been thinking how we might use a SPRI bead step at the start of a project to capture a very specific amount of DNA. Illumina are using SPRI for bead-based normalisation at the end of the process, so why not add it at the start?

The method is simple enough and would be much quicker for labs without fluorimetric plate readers, who have to quantify each sample one at a time.

We've not started testing a method yet but next time we're running a Nextera protocol we'll be giving it a go.

Wednesday, 15 May 2013

Automating Illumina library preparation

Illumina has just announced a partnership with major automation vendors, so their s-bot looks like it has been killed as a project. We've got a Tecan Evo in our lab and would like to use it for library prep, but a major headache has been the rapid development of library prep methods. There just does not seem to be time to program and validate a method before the next one comes out.

Monday, 13 May 2013

Make a DNA diffraction with a laser pointer

I came across this blog recently and thought you'd like to read it too.

Try this at home is a blog written by Mark Lorch, a chemistry lecturer at the University of Hull. He has some great stuff to try at home with the kids; check out his K'Nex DNA model. My son and I made the DNA replication fork with Okazaki fragments that Mark points to at Mount Sinai!

The DNA diffraction post tries to demonstrate the physics behind the data that allowed Watson and Crick to work out the famous molecule's structure. The X-ray diffraction image taken by Rosalind Franklin was crucial to the discovery. Read Watson's book for a history of the discovery, and his addendum to Rosalind Franklin.

Wednesday, 8 May 2013

New York Times piece on personalised medicine

The New York Times recently published an article on Foundation Medicine and their approach to personalised cancer genomics (targeted sequencing of 236 cancer genes). The test costs $5800 and is covered by some insurers in the USA, and the reports from the test have caused doctors to change patient therapy.

So far so good. A personalised test a cancer patient might be able to afford (and they may not have to pay anyway). 

The article also refers to the My Cancer Genome web site, which offers a gene-by-gene portal into cancer genomics. The site lists different diseases and points users to some of the literature supporting the impact of a particular gene on cancer. For example, a user can search for melanoma and BRAF to find mutations that might affect prognosis. At the same time they can check for clinical trials that might be ongoing.


Targeted sequencing is getting closer and closer to reality in the clinic. The technology we are using in my lab (and others like it) is becoming pretty robust. All that is needed is a good way to make doctors aware of what's possible and get the results into their hands in a way that means they can focus on medicine and not bioinformatics.

Wednesday, 24 April 2013

The blogs I read & M. leprae's magic tricks with stem cells

I have recently added several more blogs to the list on the side of this page. These are blogs I enjoy and I hope you find the posts on them as interesting as I do. Choosing to follow a blog always seems a bit of a big step. It adds to the burden of interesting things to read, as if we didn't have enough coming out of science journals, GenomeWeb and other sites. I try not to add anything I don't want to read long term as I feel terrible taking someone off my list; it feels like a personal attack!

Friday, 19 April 2013

A $1000 genome for a $400,000 drug

The NGS community has been talking about the $1000 genome as a holy grail (expect it in the second half of 2014), but on the way to this the exome is looking like it will become a cornerstone of modern medical research, mainly because of the ease and cost of generating the data. The NIH launched the Undiagnosed Diseases Program in 2008 and in the first three years had sequenced around 250 exomes from a case book of 1800 patients. One of the program's aims was to reduce the time taken to accurately diagnose patients, which for 15% of people was taking more than 5 years. The program attracted funding of $3.5 million a year (2010-2012) and has had some big successes. Read The NIH Undiagnosed Diseases Program: Lessons Learned to find out more.

Wednesday, 17 April 2013

The 8 rules of cake club

  • 1st RULE: Everyone votes at cake club.
  • 2nd RULE: EVERYONE votes at cake club.
  • 3rd RULE: If someone says "burnt" the cake is over.
  • 4th RULE: Only one person can bake.
  • 5th RULE: Only one cake at a time.
  • 6th RULE: No shop bought, no spouse-made.
  • 7th RULE: Cake will be eaten until it is gone.
  • 8th RULE: If this is your first time at cake club, you HAVE to bake.