Thursday, 31 December 2015

Christmas in the Core and a New Year's perspective on NGS genomics

Christmas came a little early to my lab with the delivery of two new HiSeq 4000 instruments. It would have been nice if these had come gift wrapped as it is the Christmas season, but alas no. Perhaps next year Illumina can get a little more into the Christmas spirit? I recently contributed to a 2015 review by GenomeWeb and thought I'd give a more in-depth view of 2015, and what might be coming in 2016, from my perspective.

Looking precarious, but ready to move into the lab!
In their new home - until 2018?

We've bought two to cope with the very large amount of sequencing coming from a larger and larger collaboration. Over the next six months we'll migrate most of our work over to HiSeq 4000 and keep the 2500s in reserve for the trickier library types. We'll probably also keep one for rapid runs, unless the rapid 4000 chemistry gets released in January?

We've just completed our largest RNA-seq project to date: 528 TruSeq mRNA-seq libraries (we've stopped using index D703 due to some problems), 60 lanes of SE50bp sequencing for a total of 12,918,018,345 PF reads for this project (24M reads per sample on average), all in 16 weeks. With around 300M reads per lane (our install runs achieved 350M reads) versus the 200-250M on HiSeq 2500 we'd reduce the number of lanes by as much as 40% - big savings on RNA-seq. We've also been running larger and larger exome projects and the extra reads are going to make a similarly big difference to the cost of exome-seq.
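As a rough check on those numbers, here is a minimal Python sketch of the lane arithmetic. The per-lane yields are the approximate figures quoted above rather than guaranteed specs, and the function name `lanes_needed` is just an illustrative choice.

```python
# Back-of-envelope lane arithmetic for the RNA-seq project described above.
# Per-lane yields are approximate figures from the post, not instrument specs.

TOTAL_PF_READS = 12_918_018_345   # PF reads for the whole project
N_LIBRARIES = 528                 # TruSeq mRNA-seq libraries
LANES_USED = 60                   # lanes actually run for this project


def lanes_needed(total_reads: int, reads_per_lane: int) -> int:
    """Whole lanes required to reach total_reads (ceiling division)."""
    return -(-total_reads // reads_per_lane)


reads_per_sample = TOTAL_PF_READS / N_LIBRARIES
print(f"Average reads per sample: {reads_per_sample / 1e6:.1f}M")

for label, yield_per_lane in [("300M/lane", 300_000_000),
                              ("350M/lane", 350_000_000)]:
    lanes = lanes_needed(TOTAL_PF_READS, yield_per_lane)
    saving = 1 - lanes / LANES_USED
    print(f"{label}: {lanes} lanes, {saving:.0%} fewer than the {LANES_USED} we ran")
```

On these figures the same project would need 44 lanes at 300M reads per lane or 37 lanes at 350M, so the saving depends heavily on which yield you actually get.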

2015's Core Genomics NGS review: There was a drop in price per base with the introduction of HiSeq 4000; however, I'm not sure the PE150 is quite good enough at the end of the reads*. This is most likely to affect users wanting to sequence through 200-300bp regions, but for genomes and exomes we'll probably stick to PE125 for now. What will happen if Illumina's economic performance continues to hit speedbumps as it did in Q3? Will HiSeq 4000 get rebadged as the X-One, or will X-Ten get turned on for RNA-seq and exomes? I'm hoping for the former as the latter would push the democratised sequencers back into the hands of a few larger labs. Will MiSeq X be launched in 2016? Will NeoPrep finally make a splash?

In September Ion Torrent launched their new S5 system and I wrote about whether it would be a hit. It was one of The Scientist's Top Ten Innovations of 2015; read their article to see the others. I've not heard any more about the new system since the launch, and I did not get a trade-in on our old HiSeq 2000. In the GenomeWeb 2015 round-up Stephane Budel (DeciBio) was quoted saying Thermo had lost market share due to the PII chip delays, but is aiming for 17% growth in 2016. How S5 will compete against Illumina is perhaps more about who is unwilling to use Illumina; the release of Qiagen's GeneReader platform might also be strong competition in this space (although my initial thoughts were not so great, I think Qiagen could make GeneReader a successful platform).

There has been lots of hand-wringing over the demise of Complete Genomics. I have never looked properly at the data myself; I've heard some good things and lots more that was negative, but CGI never looked to be something that could be made into a machine that would compete with Illumina's HiSeqs. Certainly not as an instrument that could be put into a lab other than CGI's facilities. I'm not so surprised that Revolocity has been killed off, but I was very surprised some people had gone public with a purchase and can only assume they got an amazing price per genome. I guess CGI's loss is Illumina's gain - someone has to fill all those X-Tens!

X-Ten: I'm not sure how many X-Ten instruments are out there but AllSeq has 16 systems listed on their website (although they are missing Edinburgh Genomics' 15 machines, and the Illumina GEL lab). At around 18,000 genomes per year per system the world capacity is probably over 360,000 genomes - that's a lot in anyone's book. About half of the systems on AllSeq are "currently accepting samples", but the others are "unconfirmed". Most X-Ten labs do not appear to be running at capacity and getting samples through the door seems to be the challenge. Can people afford $1000 genomes if they need to buy 18,000? Are funding agencies still willing to stump up for such large hypothesis-free data generation projects? We're starting a non-human genome project and I'm pretty confident that "other" genomes are likely to make use of the spare capacity. But will there be enough to fill the void? Illumina have invested heavily in the $1000 genome, but in 2015 X-Ten has looked a bit like the 300G upgrade to HiSeq - difficult for users to get used to. I'm pretty confident that the human population studies will start to pump out data in 2016 and the X-Ten will settle into its stride.
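To put the capacity question in numbers, here is a small Python sketch using only the figures quoted above. The 18,000 genomes per system per year and the $1000 price are the approximate headline values, and the helper names are hypothetical.

```python
# Rough X-Ten capacity and cost arithmetic, using the figures quoted above.
# Both constants are approximate headline values, not measured throughput.

GENOMES_PER_SYSTEM_PER_YEAR = 18_000   # per X-Ten system
COST_PER_GENOME_USD = 1_000            # the headline "$1000 genome"


def yearly_capacity(n_systems: int) -> int:
    """Genomes per year if n_systems all ran at full capacity."""
    return n_systems * GENOMES_PER_SYSTEM_PER_YEAR


def systems_for(target_genomes: int) -> int:
    """Whole systems needed to deliver target_genomes in a year."""
    return -(-target_genomes // GENOMES_PER_SYSTEM_PER_YEAR)


allseq_listed = 16
print(f"{allseq_listed} listed systems: {yearly_capacity(allseq_listed):,} genomes/year")
print(f"Systems implied by 360,000 genomes/year: {systems_for(360_000)}")
print(f"Cost of filling one system for a year: ${yearly_capacity(1) * COST_PER_GENOME_USD:,}")
```

The 16 listed systems alone account for 288,000 genomes a year, so "over 360,000" assumes at least 20 systems worldwide, and keeping a single system full costs its users around $18M a year at the headline price.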

Long reads from Pacific Biosciences and 10X in 2016: Personally I'd like to see a SEQUEL in the lab by the end of the year but we'll have to put a strong proposal together to get it funded. I described the main features in October just after the launch and am still mostly interested in RNA-seq and structural variation; however, pairing the SEQUEL with Fluidigm's Juno AccessArray and a long-range PCR might allow us to build gene sequencing panels that also give us phasing - particularly important for tumour-suppressor and DNA-repair genes. I hope they can get the active loading to work; this uses a bead to pull down DNA to the wells for sequencing and was described in some detail at a meeting organised by Mike Quail and me at the Sanger about two or three years ago. Toby Ost (formerly PacBio, now at Shankar Balasubramanian's new company CEGX) went through the details and it sounded like this was just around the corner. However, it is not an out-of-the-box feature of SEQUEL and is something we'll have to wait for.

We'll be running our first 10X genomes in the New Year, on our shiny new HiSeq 4000s. We're starting with two small projects and I expect we'll increase the numbers through the year. Ideally we'll bring a machine into the lab, but we'll look carefully at how the long-reads and phasing information can add to our experiments. The GemCode platform really is very exciting and looks to be able to give us additional information, particularly around structural variation, which is so important in cancer.

Single-cell sequencing: I expect single-cell sequencing to be our big drive in 2016. Lots of Fluidigm C1 experiments are planned, and we'll be evaluating other technologies too, such as Drop-Seq and CEL-seq, the DR-seq and/or G&T-seq DNA & RNA sequencing methods, and others. The technologies we'll ultimately use are still very much under review: whilst the C1 is easy to use, flow sorting and low-volume pipetting offer us real flexibility in cell-type and methodology. I suspect we'll end up running both.

ONT and Illumina's Nanopore: 2015 has been the year ONT started to walk: the MinION is working in many labs across the world, publications are coming that show how useful the system is, and new methods are being developed that would be impossible on anything else. I recently wrote an article for BiteSizeBio about ONT's updates, including minoTour from Matt Loose - a real-time analysis for MinION data.

In the recent GenomeWeb article Mick Watson was asked whether ONT will overtake PacBio; he said "PacBio has a customer base and a proven technology... but Oxford's technology has the potential to blow everything out of the water." I'd certainly agree with his enthusiasm and excitement; 2016 is likely to be the year ONT start to run rather than walk. AGBT 2016 is likely to be awash with MinION posters and talks, so expect another update immediately after the meeting.

2016's Core Genomics NGS predictions: here are my predictions for 2016...
  • More focus on the methods - things that enable experiments not just more sequencing
  • A MinION human genome
  • Rapid-runs on HiSeq 4000
  • MiSeq with patterned flowcells or 2-colour SBS
  • A working PII chip (sorry Ion Torrent, couldn't resist)
  • Core Genomics passing the 1 million readers mark in April!
Enjoy your New Year celebrations, see you in 2016.

*Illumina's spec is based on %Q30 across the whole run, but this can still mean >10% error at the end of a read. More on this, and what it might mean for different experiments, in a later post.
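To make the footnote concrete, here is a short Python sketch of the standard Phred relationship, Q = -10 log10(P), applied to a made-up 150bp quality profile. The profile is invented purely for illustration and is not real HiSeq 4000 data; the function names are my own.

```python
# Phred quality scores versus error probability, plus a %Q30 calculation.
# The read profile below is hypothetical and chosen only to illustrate the point.
import math


def phred_to_error(q: float) -> float:
    """Per-base error probability for Phred score q."""
    return 10 ** (-q / 10)


def error_to_phred(p: float) -> float:
    """Phred score for per-base error probability p."""
    return -10 * math.log10(p)


def fraction_q30(quals: list[int]) -> float:
    """Fraction of bases with quality at or above Q30."""
    return sum(q >= 30 for q in quals) / len(quals)


# Invented 150bp read: good for 120 cycles, then falling away at the end.
quals = [36] * 120 + [20] * 20 + [8] * 10

print(f"%Q30 for the read: {fraction_q30(quals):.0%}")
print(f"Error rate at Q30: {phred_to_error(30):.3%}")
print(f"Error rate in the last 10 cycles (Q8): {phred_to_error(8):.1%}")
```

In this invented example 80% of bases are Q30 or better, yet the last ten cycles sit below Q10, which is the >10% error regime.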

1 comment:

  1. In answer to your HiSeq X Ten question, Illumina has stated that they've got an install base of 300 HiSeq X's. We have many (but not all) X owners in our newly launched list of sequencing providers. You can find the HiSeq X providers here: