Friday, 7 October 2011

Illumina Custom Capture: Design Studio review

Illumina are currently offering a demo kit for custom TruSeq capture. I thought I would try it out on some genes from COSMIC and see how easy it is to design a capture set using their DesignStudio tool. There is also a pricing calculator, which I was keen to try so we can see how much the final product is likely to cost.

The TruSeq Custom capture kit is an in-solution method that allows users to target 0.7-15 Mb of sequence. The DesignStudio site will produce 2,500-67,000 custom oligos. After ordering you simply make lots of libraries with TruSeq DNA Sample Preparation Kits and perform the capture reactions in pools of up to 12 libraries (12-plex). This makes the process pretty efficient if you have lots of samples to screen. As the kits come in 24 or 96 reaction sizes, a total of 288 or 1,152 samples (roughly 300-1,200) can be processed per kit. With 24 indexes currently possible in TruSeq DNA kits this is just 2-4 lanes of sequencing. As Illumina move to 96 indexes for TruSeq kits the sequencing cost will continue to drop.
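The sample numbers above are just capture reactions multiplied by the pooling level. A minimal sketch of that arithmetic, using only the kit sizes and the 12-plex figure quoted above (the function name is my own):

```python
# Samples per kit = capture reactions in the kit x libraries pooled per reaction.
PLEX = 12  # libraries pooled per capture reaction, as quoted above


def samples_per_kit(reactions: int, plex: int = PLEX) -> int:
    """Return how many samples one kit can process at the given pooling level."""
    return reactions * plex


for reactions in (24, 96):
    print(f"{reactions}-reaction kit: {samples_per_kit(reactions)} samples")
```

So the "nearly 300" and "nearly 1200" figures are exactly 288 and 1,152 samples.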

You need to register with Illumina for an iCom account, then you can just log in to DesignStudio and get started.

Illumina DesignStudio: Start a project, choose the genome, upload loci and the tool does its job.

There are multiple ways to get your genes of interest into their database. I chose to upload a csv file with a list of the 100 most mutated genes in COSMIC. The template file Illumina provide is very minimal. There are columns for gene name, offset bases, target type (exon or whole gene), density (standard or dense) and a user-definable label. The upload was simple enough and in about thirty seconds all the genomic coordinates for the list of genes were available. The processing for bait design took a little longer, at about ten minutes.
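If you are building the upload file from a gene list rather than by hand, something like the following would do it. This is only a sketch: the header names, the default values and the label are my own placeholders based on the column descriptions above, so check them against the actual template before uploading.

```python
import csv
import io

# Hypothetical header names; the real DesignStudio template may label these differently.
COLUMNS = ["gene_name", "offset_bases", "target_type", "density", "label"]


def build_target_csv(genes, offset_bases=0, target_type="exon",
                     density="standard", label="COSMIC_top100"):
    """Return CSV text with one row per gene, using the same settings for every gene."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    for gene in genes:
        # Strip stray whitespace so the gene name is as clean as possible.
        writer.writerow([gene.strip(), offset_bases, target_type, density, label])
    return buf.getvalue()


csv_text = build_target_csv(["TP53", "KRAS", " BRAF "])
print(csv_text)
```

Swapping `target_type` to whole gene or `density` to dense would be a one-argument change, assuming the template accepts those values as plain text.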

The design tool predicts coverage and gives a quality score for the targeting (this is a cumulative score for the entire region targeted). Each probe set is shown in a browser and coloured green for OK or yellow for problematic. Of my 100 genes, 12 scored under 90% and two were not designed at all because I had entered names with additional characters, which the tool could not interpret.
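The two failed genes could have been caught before upload with a quick check on the names. A rough sketch, assuming a clean gene symbol contains only letters, digits and hyphens; that rule is my assumption, not anything DesignStudio documents, and the example names are made up for illustration.

```python
import re

# Assumption: a usable gene symbol is letters, digits and hyphens only.
SYMBOL_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def split_gene_names(names):
    """Split names into (clean, suspect) lists; surrounding whitespace is tolerated."""
    clean, suspect = [], []
    for name in names:
        candidate = name.strip()
        if SYMBOL_PATTERN.fullmatch(candidate):
            clean.append(candidate)
        else:
            suspect.append(name)  # keep the original text for manual review
    return clean, suspect


clean, suspect = split_gene_names(["TP53", "KRAS ", "CDKN2A(p14)", "PTEN*"])
print("OK to upload:", clean)
print("Check by hand:", suspect)
```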

Here is a screenshot of TP53 exon probes:

There is a pricing calculator available as well, which I'll talk about in a minute.

My "100 most mutated COSMIC genes" custom capture kit:
It took about twenty minutes to pull this together.
Regions of Interest Targeted: 2662
Final Attempted Probes Selected: 3693
Number of Gaps: 80
Total Gap Distance: 3,158
Non-redundant Design Footprint: 736,759
Design Redundancy: 3%
Percent Coverage: 100%
Estimated Success: ≥ 95%

How much will this kit cost?
The pricing calculator has some variable fields you need to fill in. It uses the data from my region list, then asks for library size (default 400), how many samples you want to run (288 minimum), what level of multiplexing (12-plex for this example), and then platform and read type. Lastly, it asks you to select the % of bases covered in the regions and at what fold coverage, e.g. 95% of bases at 10-fold (in this example).

Unfortunately the calculator did not work!

Fortunately Illumina provide another one here, and this one did work. It also recommends how many flowcells and what other items are needed to run your custom capture project.

It turns out that I will need:
My TruSeq Custom Enrichment Kit at $33,845.76 in this example.
6x TruSeq DNA Sample Prep Kit v2-Set A (48rxn with PCR) at $12k.
3x TruSeq SBS Kit v3 - HS (200-cycles) at $17k.
3x TruSeq PE Cluster Kit v3 - cBot - HS at $13k.
A total of about $76k, or $263 per sample.
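For anyone wanting to rerun the sums with their own quote, here is the arithmetic behind that total as a small sketch. The prices are the figures listed above (each taken as the total for that line, which is how they add up), and 288 samples is the minimum the calculator asks for.

```python
# Line totals from the list above, in US dollars (kit prices rounded as quoted).
costs = {
    "TruSeq Custom Enrichment Kit": 33845.76,
    "6x TruSeq DNA Sample Prep Kit v2-Set A": 12000.00,
    "3x TruSeq SBS Kit v3 - HS (200-cycles)": 17000.00,
    "3x TruSeq PE Cluster Kit v3 - cBot - HS": 13000.00,
}
SAMPLES = 288  # 24 capture reactions at 12-plex

total = sum(costs.values())
per_sample = total / SAMPLES

print(f"Total: ${total:,.2f}")
print(f"Per sample: ${per_sample:,.2f}")
```

The itemised prices sum to $75,845.76, which is where the per-sample figure of about $263 comes from.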

Trying it out:
If you are interested in trialling this then Illumina are offering a 50% off promo on a 5,000-oligo capture kit for up to 288 samples (24 pull-downs at 12-plex). You can also order a $2,860 TruSeq custom capture demo kit that targets ~400 cancer genes (again from COSMIC I expect; there are also autoimmune and ES cell gene kits). The kit includes both TruSeq DNA library prep and the custom capture reagents for processing 48 samples. This works out at about $60 per sample for library prep and pull-down. If this were run on two lanes of HiSeq v3 the cost including sequencing would still be under $100 per sample.
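The per-sample figure for the demo kit is easy to check, and the same sum shows how much two lanes of sequencing can cost before the total passes $100 per sample. A quick sketch using only the demo kit price and sample count quoted above; the lane price itself is not given here, so the second number is a ceiling rather than a quote.

```python
DEMO_KIT_PRICE = 2860.0     # US dollars, library prep plus capture reagents
DEMO_KIT_SAMPLES = 48
TARGET_PER_SAMPLE = 100.0   # the "under $100 per sample" figure

prep_and_capture = DEMO_KIT_PRICE / DEMO_KIT_SAMPLES

# Maximum total sequencing spend (both lanes together) that keeps each sample under target.
sequencing_ceiling = (TARGET_PER_SAMPLE - prep_and_capture) * DEMO_KIT_SAMPLES

print(f"Prep and capture per sample: ${prep_and_capture:.2f}")
print(f"Sequencing budget for all 48 samples: ${sequencing_ceiling:,.2f}")
```

In other words, the "under $100" claim holds as long as the two lanes together come in below roughly $1,940.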

I am not sure how much sequence is going to be needed to get good coverage on this particular kit, but my kit is a little smaller than the cancer demo kit, so the results from the pricing calculator should be indicative.

When will custom amplicon be released?
For me the most interesting thing Illumina mentioned in the MiSeq release at AGBT was the GoldenGate based custom amplicon product. Hopefully it will appear soon on DesignStudio as well. This will allow us to remove library prep almost entirely and process samples from much smaller amounts of DNA.

Illumina are also releasing dual barcoding in the next release of HCS. This will allow four reads from each library molecule, so with just 24 barcodes you may be able to multiplex 576 samples into one MiSeq run, with just six 96-well plates of reactions, to get almost ten-fold coverage of all TP53 exons.
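The 576 figure comes from pairing barcodes: my reading is that the same set of 24 is used for each of the two index reads, giving 24 x 24 combinations. A minimal sketch of that counting, with made-up index names standing in for the real barcode sequences:

```python
from itertools import product

N_BARCODES = 24
WELLS_PER_PLATE = 96

# Hypothetical index names; real kits would use the actual barcode sequences.
index_1 = [f"A{n:02d}" for n in range(1, N_BARCODES + 1)]
index_2 = [f"B{n:02d}" for n in range(1, N_BARCODES + 1)]

# Every pairing of a first-read index with a second-read index tags one sample.
pairs = list(product(index_1, index_2))
plates = len(pairs) // WELLS_PER_PLATE

print(f"{len(pairs)} uniquely tagged samples, filling {plates} plates")
```

That gives 576 unique pairs, which is exactly six 96-well plates.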

