I was asked if I'd like to play with LifeTech's new AmpliSeq design tool before general release, and below you'll find out what I thought of it. All in all it looks good, is simple to use, and the results include primer sequences.
Primer design history: many of you may well have sat down and tried to design primers by eye in a 1000bp stretch of DNA. I know my first experiences back in my undergraduate third year were not great but there were no tools available at the time.
The huge change for me was the release of Primer3. It was an update of a previously unreleased programme called simply Primer, written by Mark Daly and Steve Lincoln in Eric Lander's group in the very early '90s. Primer3 was released in 2000 and published in the Methods in Molecular Biology series. It is the basis of most primer design software I have ever used, but I don't know if it is used in the AmpliSeq Designer.
What is AmpliSeq: AmpliSeq is Life Technologies' amplicon sequencing application. It allows up to 1536 amplicons in a single-tube, PCR-based assay. It is a highly multiplexed PCR system rather than a droplet, microfluidic, hybridisation or extension-ligation based system. AmpliSeq requires as little as 10ng of DNA and produces 200bp or 150bp amplicons, short enough that DNA from FFPE samples should amplify very well.
AmpliSeq Designer: LifeTech are about to release a new product for custom panel creation, the AmpliSeq Designer. This tool allows creation of custom AmpliSeq panels based on your input of genes (in multiple formats). They say that, once designed, a panel can be ordered and delivered in weeks. Simply log into your Ion Community account, agree to a license (I'd suggest reading it, as you don't want your gene list being scanned for interesting panels without getting something back yourself) and get designing!
I uploaded a list of 23 COSMIC genes (here is my list) as gene names and only got a couple of easily fixed errors. You can choose between 150bp and 200bp amplicons. There is currently a 5bp "padding" at the ends of exons but this will become user tuneable later on. Obviously you would not want the primers to sit inside the exons, or you would never "sequence" the exon ends; you would just resequence your oligos.
This will be the subject of another post soon to be written about choosing oligo providers.
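The 5bp exon padding described above is easy to picture with a few lines of Python. This is only a sketch of the idea and not how the AmpliSeq Designer works internally (I have no visibility of that). The `pad_exons` function, the half-open BED-style coordinates and the example exon positions are all my own assumptions for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based like BED


def pad_exons(exons: List[Interval], padding: int = 5) -> List[Interval]:
    """Extend each exon by `padding` bases on both sides and merge overlaps.

    Hypothetical helper: the padding value mirrors the 5bp mentioned in the
    post, but the merging behaviour is an assumption, not the vendor's method.
    """
    padded = sorted((max(0, start - padding), end + padding) for start, end in exons)
    merged: List[Interval] = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Padded regions overlap or touch: fold them into one target.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up exon coordinates; the first two exons sit 6bp apart.
example_exons = [(100, 200), (206, 300), (1000, 1100)]
print(pad_exons(example_exons, padding=5))
```

Two exons only a few bases apart collapse into a single padded target, which is one reason the number of target regions need not equal the number of exons.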
My first design consisted of 23 target genes, 124kb of sequence, 1081 amplicons and 90.89% coverage of targeted bases.
After hitting "submit targets" I got a message and an automated email saying the results would be emailed soon. They suggest 48 hours for design but I got mine back in just under 4 hours. The results download (a zip file) included four files: a BED file, a data sheet, a coverage details file and a summary file. It looks like there is a very strong correlation between exon length and final coverage, with short exons almost always being successfully targeted but longer ones less so.
There was an odd discrepancy between the number of amplicons reported on the web page (1081) and in the summary files (1143); however, the data sheet containing primer sequences has 1081 pairs. I guess a few targets have dropped out completely and their primer sequences are never reported.
Primer sequences: A big congratulations to LifeTech for releasing these back to users. However as I did not read the license in any detail there may be restrictions on how these sequences are used.
There is a note in the data sheet file saying that AmpliSeq primer sequences contain proprietary modifications. Whether this makes them unsuitable for other applications (I am immediately thinking of many) is not clear, but I am sure some users will give it a go. I'd certainly be calling for the other companies to release sequences rather than just send back panels of amplicons.
Barcoding: From the AmpliSeq data sheet it looks like only 32 barcodes are available at this time. It is also probable that these are single barcodes at one end of the molecule. I very much favour the two-barcode strategy, where the combination allows fewer primers to be used in a secondary PCR against tag sequences. Just 16 forward and 24 reverse oligos (16 x 24 = 384 combinations) allow a single 384-well plate to be amplified and sequenced in one sequencing run.
Of course on Ion Torrent 384 samples would not get the required coverage for 1536 amplicons. But the Proton should be able to work with numbers like these and getting the tools and kits built today will save changing things later on.
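The two-barcode arithmetic above can be sketched in Python. The oligo names (`F01`, `R01`, ...) and the row/column assignment are hypothetical choices of mine, not anything from the AmpliSeq data sheet; the point is simply that 16 + 24 = 40 oligos give 16 x 24 = 384 unique pairs, one per well.

```python
import string
from typing import Dict, Tuple

ROWS = string.ascii_uppercase[:16]  # rows A..P of a 384-well plate
COLUMNS = range(1, 25)              # columns 1..24


def plate_layout() -> Dict[str, Tuple[str, str]]:
    """Assign one forward barcode per row and one reverse barcode per column.

    Barcode names are placeholders for illustration only.
    """
    layout = {}
    for row_index, row in enumerate(ROWS, start=1):
        for column in COLUMNS:
            layout[f"{row}{column}"] = (f"F{row_index:02d}", f"R{column:02d}")
    return layout


layout = plate_layout()
oligos = {barcode for pair in layout.values() for barcode in pair}
print(len(layout), "wells,", len(set(layout.values())), "unique pairs,", len(oligos), "oligos")
```

Using one forward barcode per plate row and one reverse barcode per column also makes plate set-up easy with a multichannel pipette.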
The BED file: Included in the zip file is a BED file of my panel. This was very simple to upload to UCSC and visualise coverage across the genes targeted. I have zoomed into TP53 for the example below.
|UCSC TP53 BED file|
Coverage of my targets: Only KIT achieved 100% coverage of all 5358 target bases in the design. FGFR3 and AKT1 were the worst, with only 75% of their target bases covered.
It will be very important for amplicon-sequencing users to understand what bases they are likely to see in the final sequence data before they run an experiment on their samples. It would probably help if the community thought a bit more about what the important metrics to report are for this kind of panel. Obviously most users will want to get 100% of all target bases covered. But this is unlikely to be possible with most panels as they increase in size, and we should consider how far the stringency of coverage can be relaxed.
Coverage is reported for target regions as a percentage. This is the percentage of target bases that fall within the designed AmpliSeq amplicons, so it is a design figure only. It will be important to monitor the coverage actually achieved in case particular amplicons drop out in the final sequence data.
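For anyone wanting to check a design-coverage figure themselves, here is a minimal sketch of the calculation: the percentage of target bases overlapped by at least one amplicon. I am assuming BED-style half-open intervals on a single chromosome, and the function names and example coordinates are mine. I do not know exactly how the AmpliSeq Designer computes its own number (for instance whether primer bases are excluded), so treat this as an illustration only.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def percent_coverage(targets: List[Interval], amplicons: List[Interval]) -> float:
    """Percentage of target bases overlapped by at least one amplicon."""
    merged_targets = merge(targets)
    merged_amplicons = merge(amplicons)
    target_bases = sum(end - start for start, end in merged_targets)
    if target_bases == 0:
        return 0.0
    covered = 0
    for t_start, t_end in merged_targets:
        for a_start, a_end in merged_amplicons:
            overlap = min(t_end, a_end) - max(t_start, a_start)
            if overlap > 0:
                covered += overlap
    return 100.0 * covered / target_bases


# Made-up example: two 100bp targets, three amplicons (two of them overlapping).
targets = [(100, 200), (300, 400)]
amplicons = [(90, 180), (170, 210), (350, 380)]
print(percent_coverage(targets, amplicons))
```

The same function could be run a second time on the regions that actually reach a chosen read depth in the sequence data, which would show any amplicon dropout relative to the design.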
One thing that might help in design tools like the AmpliSeq one, Agilent's HaloPlex Design Wizard, Fluidigm's Access Array designer and Illumina's TruSeq Custom Amplicon Design Studio is the ability for users to specify bases that must be covered. TP53, for instance, has several hot-spots a user would not want to miss, whereas other cancer genes are mutated more or less randomly along their sequence. It may be important to specify particular bases for TP53, but 85-95% coverage for non-hot-spot genes may be OK if large numbers of patients are being screened and coverage drop-out is random. If coverage drop-out is not random then careful review of the literature before finalising a design might be warranted.
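Until the design tools offer a "must cover" option, a check like the one below could be run on the returned design. It is a sketch under my own assumptions: the function names, the 85% threshold (taken from the lower end of the range suggested above) and the positions are hypothetical, and the coordinates are invented rather than real TP53 hot-spots.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def missed_positions(must_cover: Iterable[int], amplicons: List[Interval]) -> List[int]:
    """Return the must-cover positions that no amplicon overlaps."""
    return sorted(
        position
        for position in must_cover
        if not any(start <= position < end for start, end in amplicons)
    )


def design_acceptable(percent_covered: float, missed: List[int], min_percent: float = 85.0) -> bool:
    """Accept a gene only if every hot-spot is covered and overall coverage is high enough."""
    return not missed and percent_covered >= min_percent


# Invented coordinates purely for illustration.
hot_spots = {150, 175, 420}
amplicons = [(100, 160), (170, 200)]
missed = missed_positions(hot_spots, amplicons)
print(missed, design_acceptable(92.0, missed))
```

A gene with good overall coverage would still be flagged here if a single specified base fell outside every amplicon.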
Sequencing results: Unfortunately I can't report back on the quality of the final panel, as my lab does not have an Ion Torrent. Of course, if LifeTech want to leave one in the lab for a few days with the reagents, I'd happily give you all an update.
Other considerations: I'd like to see tools like this develop and ideally see an open source variant, perhaps in Galaxy? I am sure a lot of people would use such a tool, so whoever puts it out can expect a good citation record! The things we are considering are flexibility in amplicon length rather than specific cutoffs, an ability to add tags for any sequencing platform, higher degrees of multiplexing, an ability to target SNPs by very short read sequencing (>10bp), etc.
Please feel free to comment on what you would like to see from amplicon-sequencing tools.