Monday, 25 July 2016

RNA-seq advice from Illumina

This article was commissioned by Illumina Inc.

The most common NGS method we discuss in our weekly experimental design meeting is RNA-seq. Nearly all projects will use it at some point to delve deeply into hypothesis-driven questions, or simply as a tool to go fishing for new biological insights. It is amazing how far a project can progress in just 30 minutes of discussion: methodology, replication, controls, analysis, and all sorts of bias get covered as we try to come up with an optimal design. However, many users don't have the luxury of in-house Bioinformatics and/or Genomics core facilities, so they have to work out the right sort of experiment for themselves. Fortunately people have been hard at work creating resources that can really help, and most recently Illumina released an RNA-seq "Buyer’s Guide" with lots of helpful information, including how to keep costs down.

Illumina's "Buyer’s Guide": the guide offers advice on common RNA-Sequencing methods and should help new users in evaluating the many options available for next-generation sequencing of RNA. Anyone considering a differential gene expression analysis experiment should have RNA-seq as their platform of choice and the guide presents three simple steps for users to consider different aspects of their experiments.

1) First of all make sure you understand what your scientific question is! This sounds simple but all too often people want to get too much out of one experiment and end up in a bit of a mess. Better to answer one question well than two questions badly. Once you've thought about this it should be clear whether you want to analyse mRNAs for a simple differential gene expression experiment or are after something else, e.g. splicing, and also whether you'll need to look at more than just poly-adenylated mRNAs. If possible, try to determine ahead of time whether the genes you're interested in studying are highly expressed or very rare.

2) Next, consider what sort of samples you have: are they low quality and/or low quantity? You should also consider who's going to do the work in the lab and who's going to analyse the sequence data.

3) Now you can really think about the final experimental design: what type of library preparation kit to use, replicate numbers, proper controls, depth of sequencing, etc. Illumina's RNA-seq Buyer’s Guide describes some of the things you'll need to consider in choosing the read depth and run type, and also includes some tips for keeping the costs of your experiment down.

What do people mean when they say "RNA-seq": When people say "RNA-seq" most of them are talking about differential gene expression (DGE) by sequence analysis of reverse-transcribed poly-adenylated mRNAs, but by changing the depth or type of sequencing, and/or choosing a different library prep kit, you can investigate so much more. The guide includes three different scenarios for RNA-seq experiments: basic differential gene expression; DGE and allele-specific expression plus isoforms, SNVs and fusions; and finally whole transcriptome analysis. These show the breadth of experiments you can consider once you've mastered this method.

The first two scenarios showcase the power of RNA-seq and demonstrate how using a single library prep method, but varying the sequencing, allows very different questions to be asked of your samples. The guide recommends Illumina's TruSeq Stranded mRNA-seq kits (these are the ones we use most in my lab, and we have done so ever since beta-testing the original RNA-seq kit many years ago). Scenario #1 is a simple DGE experiment, for which Illumina recommends you generate ≥ 10 million reads per sample using single-end 50 bp reads (SE50). Scenario #2 allows a full mRNA analysis by simply changing read depth to ≥ 25 million reads per sample and using paired-end 75 bp reads (PE75).

If you are interested in more than poly-adenylated mRNAs then changing the RNA-seq library prep kit to Illumina's TruSeq Stranded Total RNA gets rid of ribosomal RNAs, letting you analyse both coding and non-coding RNA. Much greater read depth is needed and Illumina recommends ≥ 50 million PE75 reads per sample. Completing the RNA-seq line-up are the TruSeq small RNA kits, which allow you to analyse microRNAs and other smaller transcripts; this usually requires only ≥ 1-2 million SE50 reads per sample.
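The per-sample figures quoted above can be collected into a small lookup table, which makes it easy to see how quickly the read total grows with sample number. This is only a sketch: the scenario keys, field names and the `total_reads_m` helper are my own labels rather than anything from the guide, and the small RNA entry uses the lower end of the quoted 1-2 million range.

```python
# Per-sample minimums as quoted in this post from Illumina's guide.
# Scenario keys and field names are illustrative labels, not Illumina's.
RECOMMENDATIONS = {
    "dge": {"kit": "TruSeq Stranded mRNA", "min_reads_m": 10, "run_type": "SE50"},
    "mrna_full": {"kit": "TruSeq Stranded mRNA", "min_reads_m": 25, "run_type": "PE75"},
    "total_rna": {"kit": "TruSeq Stranded Total RNA", "min_reads_m": 50, "run_type": "PE75"},
    # The guide quotes 1-2 million reads; the lower bound is used here.
    "small_rna": {"kit": "TruSeq small RNA", "min_reads_m": 1, "run_type": "SE50"},
}


def total_reads_m(scenario, n_samples):
    """Return the minimum reads (in millions) for a whole experiment."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return RECOMMENDATIONS[scenario]["min_reads_m"] * n_samples


if __name__ == "__main__":
    # Print the minimum total for a 24-sample experiment in each scenario.
    for name, rec in RECOMMENDATIONS.items():
        total = total_reads_m(name, 24)
        print(f"{name}: {rec['kit']}, {rec['run_type']}, >= {total}M reads for 24 samples")
```

Swap in your own per-sample target (I suggest 20 million for DGE below) if you disagree with the guide's minimums.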

How do Illumina's recommendations stack up: The guide is pretty good in the suggestions it makes for common RNA-seq methods. I'd aim a bit higher for DGE and suggest 20 million reads per sample to allow profiling of high, medium and lowly expressed genes. I'm really not keen on the suggestion that MiSeq or NextSeq mid-output are good tools for RNA-seq as, in my experience, most experiments with sufficient replication will be too large to fit into a single sequencing run. I'd argue that the cheapest way to get your RNA-seq data is going to be on HiSeq 4000, until of course we can run RNA-seq on X Ten. Of course not everyone should buy a HiSeq, and a MiniSeq, MiSeq or NextSeq may be a good fit for your own laboratory; but I'd encourage you to consider the benefits of using your local core lab first, especially if you are planning on doing experiments bigger than 12-24 samples. I'm not sure I'd argue quite as strongly for paired-end data and would prefer splicing, ASE and fusion detection to come from higher-depth sequencing instead (50M SE50 reads cost about the same as 25M paired-end 75bp reads).
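Whether an experiment fits into a single run is simple arithmetic, sketched below. The `runs_needed` function and its parameters are hypothetical names of my own, and the run yield in the example is a placeholder rather than the specification of any instrument; use the figure your own sequencer or core lab actually delivers.

```python
import math


def runs_needed(n_samples, reads_per_sample_m, run_yield_m):
    """Return how many whole runs (or lanes) an experiment needs.

    n_samples: total libraries, i.e. groups x biological replicates.
    reads_per_sample_m: target depth per sample, in millions of reads.
    run_yield_m: usable reads per run or lane, in millions (user-supplied).
    """
    if run_yield_m <= 0:
        raise ValueError("run_yield_m must be positive")
    total_reads_m = n_samples * reads_per_sample_m
    # A partly used run still has to be paid for, so round up.
    return math.ceil(total_reads_m / run_yield_m)


if __name__ == "__main__":
    # 4 groups x 6 biological replicates at 20M reads per sample.
    # The 100M yield is a placeholder value, not an instrument spec.
    print(runs_needed(n_samples=4 * 6, reads_per_sample_m=20, run_yield_m=100))
```

With the placeholder yield, 24 samples at 20 million reads each need 480 million reads, so the experiment spills over into a fifth run.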

Why does my lab focus on mRNA-seq DGE: My own choices for RNA-seq are primarily informed by the questions people say they want to answer in experimental design discussions - and nearly all of these are differential gene expression questions. As such my lab runs lots and lots of Illumina's stranded mRNA-seq kits. We only run some form of ribosomal reduction when the experiment warrants it, as these methods generally require deeper sequencing for the same differential gene expression analysis power. We have very few users who need to run FFPE RNA, so although we tested the RNA Access kit we've yet to really use it in a significant project. This is partly because the research groups coming to my lab understand the limitations of FFPE samples and work hard to procure fresh frozen material wherever possible.

A brief bit about informatics: This article is focussed on the wetlab, but without a good analysis pipeline you'll be stuck with some big but unusable Fastq files. The analysis requirements are heavily influenced by the biological questions being asked, by the samples available, and by the library preparation and sequencing performed. I'd always recommend that users make sure they know what analysis is likely to be performed before generating data.

Many others have weighed in on how to use and design RNA-seq experiments (see the list of my favourite references at the bottom of this post). Nearly everyone agrees that replication is key, with most people suggesting 4-6 biological replicates. Most papers agree on read depth being kept to under 20M reads per sample. The ENCODE RNA-seq guidelines are very different, recommending just two biological replicates and 30M paired-end reads per sample - I've never agreed with this, even when it was published in 2011, and have steered people to other resources. The blogosphere also offers lots of help; a 2013 post by GKNO (Marth lab, U. Utah) and the RNA-seqlopedia (U. Oregon) are two great reads for people who want to know more.

All Illumina products listed are for research use only. Not for use in diagnostic procedures (except as specifically noted).

Further reading:


  1. "This article was commissioned by Illumina Inc."

    No comments indeed.

  2. Nice summary.

    Just a quick comment. In our paper "How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use?" we recommend EdgeR and DESeq2 not so much DESeq. Please can you correct that? Thanks.

    1. Hi Chris, happy to make the revision. In the paper you say that with "higher replicate numbers, minimising false positives is more important and DESeq marginally outperforms the other tools" - which was my reason for pointing to both tools.

    2. Hi James. Thanks for the revision.
      DESeq is best with >12 reps, but at that level of replication there isn't a huge amount of difference. It's also an unreasonable no. of reps for a typical expt. Best to focus on 6 or fewer reps as you have done, which is where edgeR and DESeq2 outperform the others.

  3. Really nice article. On the bioinformatics side for simple DGE at least you won't go far wrong following the Rsubread/EdgeR or DESeq2 workflows described in the papers below:

  4. We've found the NextSeq to be the best option for DGE of mRNA. Sure, it's a bit more expensive than the HiSeq 4000, but there's no waiting to fill up 8 lanes. 12 or 24 samples can be run overnight. The 75 cycle kit is also paired-end capable for no extra cost. With the additional unused cycles that come from not doing dual indexing, you can squeeze out 42bp paired end (43|6|0|43) which we've found the best for the money. We've found paired gives a small improvement in mapping, and you lose less data if you deduplicate.

  5. I have a strong preference towards HiSeq but this is because I have three instruments in the lab and we don't need to wait for flowcells to fill up. I agree that where you are dealing with lower throughput NextSeq is a good choice. However I really don't think paired-reads bring any extra benefit, but if they are "free" then why not...did you consider going for 84bp reads to increase the number of spliced-reads?




