Last week I posted on the press releases from Illumina and Life Technologies. I wanted to follow up in this post with some more details after talking to Illumina. I have to say that the details on the new flowcell are not directly from Illumina. I'll try to get a better image than my drawing below and some confirmation, then I'll follow up.
I’ll kick off with a little HiStory (excuse the pun).
HiSeq Jan 2010-Jan 2012: The release of HiSeq in January 2010 made a big splash in labs like mine: it meant we could start to run whole genome sequencing projects and not just leave these to the big boys (Sanger, Broad, WashU, etc.). Instrument capacity had become something of a hindrance to our aspirations.
The leap from the GAIIx and 25-30GB to HiSeq's 100GB per flowcell, coupled with a decrease in run time (14 down to 10 days), effectively meant we could aim to sequence 80 genomes a year at 30-fold coverage. The cost per genome also dropped precipitously, from around $35,000 to $9,000. (See the bottom of this page for my calculations.)
The v3 chemistry released last summer, which increased data volumes to 300GB per flowcell, brought us another 3x drop in costs, and we can now sequence almost 250 genomes per year at 30x coverage on a single HiSeq, with a single genome costing around $3000 in reagents.
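The throughput figures above can be roughly reproduced with a few lines of arithmetic. The actual calculations behind them are not shown here, so this is only a sketch that lands close to the quoted numbers under assumptions of its own: a 3Gb human genome (so 90GB per 30x genome), two flowcells per run, a 10-day run for both chemistries, and an instrument that runs 365 days a year with no downtime. The function name `genomes_per_year` is hypothetical.

```python
def genomes_per_year(gb_per_flowcell, run_days, flowcells_per_run=2,
                     genome_gb=3.0, coverage=30, days_per_year=365):
    """Estimate whole genomes per instrument per year.

    Assumes (not confirmed by the post): two flowcells per run,
    a 3 Gb genome, and no instrument downtime.
    """
    runs_per_year = days_per_year / run_days
    gb_per_year = runs_per_year * flowcells_per_run * gb_per_flowcell
    gb_per_genome = genome_gb * coverage  # 90 GB for a 30x human genome
    return gb_per_year / gb_per_genome


print(round(genomes_per_year(100, 10)))  # 100GB per flowcell -> 81
print(round(genomes_per_year(300, 10)))  # 300GB per flowcell (v3) -> 243
```

Under these assumptions the estimates come out at about 81 and 243 genomes per year, in line with the "80" and "almost 250" quoted above. Changing `run_days` or `flowcells_per_run` shows how sensitive the yearly total is to run time and instrument configuration.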
Illumina do appear to be holding back on releasing improvements to allow 1TB runs. These were presented at AGBT '11 using v3 chemistry and PE150bp runs. I also saw a slide late last year at a MiSeq road show where 1TB had been completed in 10 days using PE100 and more clusters. 1TB runs would bring the cost down to around $1500 per 30x genome.
HiSeq 2500, what’s new: The HiSeq 2500 is available as either a new instrument or an upgrade to an existing HiSeq 2000 (more on that later). It will support both the current v3 chemistry, delivering 600Gb per run, and the new “MiSeq on steroids” reagent format, which uses a new flowcell that is clustered onboard, producing a 120GB output and the much-vaunted “genome in a day”.
The “genome in a day” runs will most likely cost around 3-5 times more per genome than a 600Gb run, which currently works out at about $30 per GB. The information released so far suggests a reagent cartridge for the “MiSeq on steroids” runs, which would include cluster generation, sequencing and cluster regeneration reagents. Running the current v3 chemistry on the HiSeq 2500 will still require the use of a cBot.
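To put rough numbers on that cost difference, here is a minimal sketch. It takes the $30 per GB figure and the 3-5x multiplier from the paragraph above and assumes 90GB of data per 30x genome (a 3Gb genome); the function name `cost_per_genome` is hypothetical, and the result is an estimate rather than a price list.

```python
def cost_per_genome(cost_per_gb, genome_gb=3.0, coverage=30):
    """Reagent cost of one genome, assuming a 3 Gb genome (an assumption)."""
    return cost_per_gb * genome_gb * coverage


standard = cost_per_genome(30)   # 600Gb run at about $30 per GB
rapid_low = 3 * standard         # lower bound of the 3-5x estimate
rapid_high = 5 * standard        # upper bound of the 3-5x estimate

print(f"standard run: ${standard:,.0f} per 30x genome")
print(f"genome in a day: ${rapid_low:,.0f} to ${rapid_high:,.0f} per 30x genome")
```

On these assumptions a standard-run genome comes to about $2,700 in reagents, and a "genome in a day" to somewhere between roughly $8,100 and $13,500.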
Instrument changes include alterations to the flowcell stage, fluidics, software and other minor modifications. Illumina have said that HiSeq 2500 and recently purchased, upgraded HiSeq 2000 systems will be able to run more quickly than older upgraded HiSeq 2000 systems. Those of us who were early into the switch from GA to HiSeq will lose out in this, but the additional run time is likely to be less than half a day, so I can’t see a major impact on my lab. Of course a “genome in a day and a half” is not nearly so nice a marketing message.
The new 2500 flowcell: I did get a little information on the new flowcell. Apparently it has two lanes, coupled to four ports at either end. This allows more reagent and possibly faster sequencing chemistry, keeping quality high while reducing cycle times. The two-lane format also keeps imaging time low. This does mean that users who want to do high-coverage amplicon sequencing projects on the new format will need to multiplex to 96+ samples per lane to get the best performance out of the run. If you want whole human genomes quickly, though, it should deliver nicely.
However, the HiSeq 2000-2500 upgrade is substantially cheaper than even a single Proton sequencer and should keep HiSeq competitive as far as output in 24 hours goes. For labs that already have HiSeq, the excitement of trying a new technology will be balanced in many core directors' heads by the difficulty of running two very different systems side-by-side. I expect many HiSeq labs to upgrade rather than buy Proton. It will be interesting to see what percentage of HiSeq instruments in the big genome centres get upgraded, or whether smaller labs with only two or three HiSeqs upgrade all of them.
An issue many facilities may face is funding in our current economic climate. $50,000 spent on an upgrade would otherwise buy a couple of very nice sequencing projects and possibly some publications. Things appear to be very tight for almost everyone when it comes to capital expenditure. I don’t expect this upgrade to be a simple “I’ll take one, when can you deliver?”. And I’m certainly expecting to compete for the funds with other projects.
Do people need lots of genomes so rapidly? This is a question still to be answered in most labs. Whilst I am sure many users would immediately say yes to the decrease in turnaround, they probably would not choose to take advantage of it too often. In a research setting, the 9-day difference between a 120GB run and a 600GB run will almost always lose out to the very significant cost differential. Most users can simply wait a little longer for their data.
I am sure many other core facility managers have been asked to rush samples, only to find out the data sat around for a couple of weeks waiting to be analysed. I even offered a “platinum” service for Affymetrix array processing in 2003/4 that promised data in 3 days from delivery of RNA samples. It cost twice as much as normal and no one was willing to pay to jump the queue.
Clinical amplicons, exomes and/or genomes may well require a quick turnaround, but there don’t seem to be many centres that will make use of the very fast but more expensive whole genomes. Fast amplicons and exomes are likely to be transformative in the next couple of years, but is an additional 9 days too long to wait for genomes? This will certainly be interesting to watch.
One email I received after my last post said I may have to eat my words: “who realistically needs their genome back in 24 hours, not a lot of users!”. If I do have to eat them, I prefer them battered, with chips and lots of ketchup!
PS: All calculations assume perfect cluster density on all runs and no instrument down time.