Our MiSeq has just been replaced after some performance issues and we are now getting ready to start running the upgraded version with the newest chemistry. Hopefully we'll be getting the 10Gb+ that Illumina reported some users are achieving.
In a GenomeWeb article yesterday, Illumina responded to Life Technologies' recent Proton updates (P1 80M, P2 4580M, P3 1.2B sensors). MiSeq read lengths are increasing to 2x300bp and the number of reads will jump from 15-17M to 25M or more, making 22Gb possible.
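As a rough sanity check on those numbers, here is a minimal Python sketch of the yield arithmetic. It assumes the read counts quoted above are clusters that are each sequenced from both ends in a paired run. The function names are mine and purely illustrative.

```python
def run_yield_gb(clusters_millions, read_length_bp, paired=True):
    """Nominal run yield in gigabases, ignoring filtering and quality trimming."""
    reads_per_cluster = 2 if paired else 1
    total_bases = clusters_millions * 1e6 * read_length_bp * reads_per_cluster
    return total_bases / 1e9


def clusters_needed_millions(target_gb, read_length_bp, paired=True):
    """Clusters (in millions) needed to reach a target yield in gigabases."""
    reads_per_cluster = 2 if paired else 1
    bases_per_cluster = read_length_bp * reads_per_cluster
    return target_gb * 1e9 / bases_per_cluster / 1e6


if __name__ == "__main__":
    # 25M clusters at 2x300bp
    print(f"{run_yield_gb(25, 300):.1f} Gb")
    # Clusters needed for 22Gb at 2x300bp
    print(f"{clusters_needed_millions(22, 300):.1f} M clusters")
```

Under these assumptions 25M clusters at 2x300bp comes to about 15Gb, so 22Gb would need nearer 37M clusters. That fits the "or more" in the quoted figure, but the 37M is my back-calculation and not a number from the article.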
Way back in the summer of 2011 I speculated that MiSeq would get to 25Gb by imaging more area and generating more reads (see MiSeq 1). I have continued to follow the system's developments (see here, here and here) and am watching to see what else we might expect in the future. The MiSeq was sold as a box with a lot of potential locked down for easy upgrades and an 'Apple-esque' development path. Much of what has been made possible recently was probably available from launch, or at least considered by the designers.
Currently only one of the two "lanes" in the MiSeq is being imaged (imagine the return path as a separate lane). Further increases could come from bigger lanes or an upgrade to a movable camera stage (unless there is one inside the box already). All pretty simple.
But how far does Illumina need to push MiSeq? With HiSeq 2500 rapid runs allowing the same degree of flexibility (on-instrument clustering and 24-hour run times), any further increases to MiSeq leave the GA dead in the water. Perhaps Illumina will be driven by announcements from Life Technologies? Could MiSeq ever deliver really fast run times like PGM or Proton? I'd say it does that already: a single-end 120bp run is completed in under 2 hours if you ignore clustering. Combined with Nextera genome prep, you can get from sample to genome in a working day.
So what should Illumina focus on next? My bet is that users will want the longest reads possible, and we will learn to do new things with Sanger read lengths but tens of millions of them. Any de novo sequencing will be made much easier, structural variation analysis is simplified, and splicing analysis becomes as trivial as differential gene expression by read counting (almost). And all of these could possibly be done with lower quality reads. Imagine a 1000bp read with Q40 in the first and last 250bp, Q30 from 250-450 & 550-750 and Q10-20 in the middle?
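To put numbers on that imagined read, here is a small Python sketch that converts Phred scores to error probabilities using the standard definition Q = -10 log10(p) and sums the expected errors per segment. The segments described above add up to a 1000bp read. The profile and function names are hypothetical illustrations, not anything Illumina has published.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def expected_errors(segments):
    """Sum expected errors over (length_bp, phred_q) segments."""
    return sum(length * phred_to_error_prob(q) for length, q in segments)


def imagined_profile(middle_q):
    """Hypothetical 1000bp read: Q40 ends, Q30 shoulders, weak middle."""
    return [
        (250, 40),        # first 250bp
        (200, 30),        # 250-450
        (100, middle_q),  # 450-550, the low-quality middle
        (200, 30),        # 550-750
        (250, 40),        # last 250bp
    ]


if __name__ == "__main__":
    for middle_q in (10, 20):
        profile = imagined_profile(middle_q)
        total_bp = sum(length for length, _ in profile)
        errors = expected_errors(profile)
        print(f"middle Q{middle_q}: {total_bp}bp, ~{errors:.2f} expected errors")
```

With this profile almost all the expected errors sit in the middle 100bp: roughly ten errors if it is Q10, about one if it is Q20, against less than half an error across the other 900bp.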