Thursday, 22 September 2011

Resequencing cancer gene loci

There is an enormous demand for resequencing of specific cancer loci. BRCA1 & 2, TP53, PTEN and KRAS are already being Sanger sequenced in many labs, but this process is not scalable to all cancer patients’ tumours. Some NHS hospitals do test cancer patients’ tumours for mutations. However, they only test for a few specific genes or exons at a time, and this is often done on a few samples at a time rather than 100s or 1000s.

There are several methods available to capture cancer-specific loci, and choosing between them is not simply a matter of cost. Other things to consider are the amount of material available for analysis, time taken to generate libraries and data, the amount of on/off-target sequence, ...

I thought I would briefly review some of the available methods in this post, partly so I don't have to explain this each time someone asks at work!

In-solution 'exome' style methods: Illumina just released their TruSeq custom capture kits and there is a Cancer panel available now. Coming soon are autoimmune and ES cell gene panels. The kit allows capture of 700kb-15Mb of sequence in a 3.5 day protocol which uses 2 capture hybs rather than the one long hyb that Agilent and Nimblegen suggest. You have to make a TruSeq library from 1 µg input DNA, and there is a cost to this of about £30-60 depending on how many libraries you are making. Agilent and Nimblegen both offer customisable capture methods as well. Flexgen also offer a custom capture product, and there is a Flexgen calculator you can use to get costings for a project. Update! These use broadly similar protocols and a few comparisons have been published; an article was published in Nature Biotech and is reviewed here.

The Cancer panel targets 372 genes across 5265 exons and is available on the Illumina website. Up to 12 samples can be pooled for custom capture and run as multiplexed sequencing. It looks like 96 samples could be prepared using gel free methods and sequenced in less than two weeks. The cost per sample also looks like it is going to be very attractive.

I can't help but wonder if you could reduce capture costs by 50% with a single hyb but improve data quality using 2 replicates. It might be possible to get away with one hyb as a quick and dirty screen but I am sure Illumina will not recommend this.

Pros: A large number of genes can be targeted.
Cons: You need to make a library first.

Multiplex PCR, or PCR-like methods: Molecular inversion probes and HaloGenomics "Selector" probes allow capture of specific regions from genomic DNA. Illumina are releasing their TruSeq custom amplicon product next month when MiSeq is finally delivered (AGBT seems a long time ago). This will use Illumina's GoldenGate assay to target up to 384 loci (I think) in a multiplex reaction. There are very few details available from the Illumina website, but the kit is likely to use the same locus-specific extension/ligation assay followed by PCR with tag sequences complementary to the flowcell primers. It is likely to be supplied in a 96-well plate configuration and would allow very large numbers of samples to be screened.

Pros: No need to make sequencing libraries. Very automatable workflows, low amounts of DNA are required.
Cons: A smallish number of genes can be targeted.

Next-generation PCR methods: The Fluidigm Access Array is a microfluidic chip that performs PCR on 48 DNA samples across 48-480 assays in single-plex or ten-plex reactions. The system is easy to use and can be set up by anyone who has previously pipetted into 384-well plates. Update! Costs for Fluidigm are about £200 for an Access Array plus primer and reagent costs of £50 (dependent on the number of assays you intend to run), giving a cost per sample for 48 amplicons of about £5. Combine with a MiSeq PE150 kit at £600 and the total is just £17 per sample including sequencing! The first Access Array paper was published in BioTechniques this week: a team from Roche and Fluidigm resequenced EGFR and MET using the Access Array System on the GS Junior.
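The Fluidigm per-sample figures above are easy to re-derive, which is handy if your own quote differs. A minimal sketch follows, assuming one Access Array and one MiSeq PE150 kit are shared across all 48 samples (my assumption; the function name and parameters are purely illustrative). With the prices quoted it comes out at roughly £5 for prep and £17-18 in total.

```python
def cost_per_sample(chip_cost, reagent_cost, sequencing_cost, samples_per_chip):
    """Return (prep, sequencing, total) cost per sample in the same currency.

    Assumes a single chip and a single sequencing kit are split evenly
    across every sample on the chip (an assumption, not a vendor figure).
    """
    prep = (chip_cost + reagent_cost) / samples_per_chip
    sequencing = sequencing_cost / samples_per_chip
    return prep, sequencing, prep + sequencing


# Prices quoted in the post: £200 chip, £50 primers/reagents, £600 MiSeq kit.
prep, sequencing, total = cost_per_sample(200, 50, 600, 48)
print(f"prep £{prep:.2f}, sequencing £{sequencing:.2f}, total £{total:.2f}")
```

Swap in your own chip, reagent and sequencing prices to see how the total moves as the number of assays or samples changes.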

The RainDance system allows PCR amplification in a bead emulsion of up to 20,000 loci. Their new system will process up to 96 samples (called Thunder Storm, Hurricane or something like that). Update! GenomeWeb has a report from Ambry Genetics about their experiences with RainStorm. They also report on the ThunderStorm instrument, which will process 96 samples per day. It will "complete a targeted resequencing protocol in about 15 minutes with no hands-on time, and will enable the company to offer sequence enrichment at between $100 and $150 per sample". This suggests ThunderStorm will run for 24 hours and take 15 minutes per sample (96 samples at 15 minutes each is 24 hours).
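The throughput inference for the 96-sample RainDance instrument is simple arithmetic. A small sketch is below, assuming samples are processed strictly one after another with no gaps (my reading of the report, not a published specification; the function name is illustrative).

```python
def run_hours(samples, minutes_per_sample):
    """Total instrument time in hours, assuming strictly serial processing."""
    return samples * minutes_per_sample / 60


# 96 samples per day at about 15 minutes each, as quoted in the report.
hours = run_hours(96, 15)
print(f"{hours:.0f} hours for 96 samples")
```

If the instrument actually processes samples in parallel, the 15 minutes would be a per-batch figure and the run time would be shorter than this estimate.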

Both systems require specific hardware. RainDance requires more DNA than Fluidigm, which only needs 50-100ng per sample. Fluidigm also processes 48 samples with standard oligos, and these panels can be reconfigured very easily by users in their own labs, while RainDance amplicon panels need to be made to order. The separation into individual reaction chambers on both of these platforms may make them more suited to discovery of rare mutations, which in a traditional multiplex PCR might be outcompeted by more common alleles.

Pros: No need to make sequencing libraries. Automatable workflows, low amounts of DNA are required.
Cons: A smallish number of genes can be targeted.

Which one to use? Personally I'm most interested in Illumina custom amplicon and Fluidigm Access Array. Both of these allow very fast preparation of large numbers of samples as sequence-ready libraries directly from DNA. They could be used to screen every sample or cell line coming into a lab. There is a lot of competition in this space, so I am also hoping that costs continue to drop, DNA input requirements fall and the number of amplicons targeted increases.

I'll update this post from time to time as I get a minute to add more detailed information. Feel free to comment on what I may have missed or ask questions.


  1. "These use broadly similar protocols and a few comparisons have been published."

    Do you have a link to one or more publications please?

  2. A recent post entitled "Foundation Medicine grabs for the low-hanging fruit of NGS cancer diagnostics" on Stuart Brown's "Next-Gen Sequencing" is relevant to this discussion:
