Friday, 22 June 2012

Improving small and miRNA NGS analysis or an introduction to HDsRNA-seq

Biases in small RNA sequencing have been thoroughly interrogated in a series of papers released in the last 12 months. RNA ligation has been shown to be the major source of bias, and the articles discussed in this post offer some simple fixes to current protocols which should allow even better detection, quantification and discovery in your experiments.

Small RNAs play an important regulatory role, and this has been revealed by almost every method that can be used to measure RNA abundance: northern blotting, real-time qPCR, microarrays and, more recently, next-generation sequencing. These methods do not agree particularly well with each other, and the most likely cause is the technical biases of the different platforms.

Even though it has its own biases, small RNA sequencing appears to be the best method available for several reasons: it does not rely on probe design and hybridization, you can discriminate amongst members of the same microRNA family, and you can detect, quantitate and discover in the same experiment (Linsen et al ref).

Improving small RNA sequencing: As NGS has been adopted for small RNA analysis, attention has rightly focused on the biases in library preparation. Nearly all library prep methods ligate RNA adapters to the 3’ and 5’ ends of small RNAs using T4 RNA ligases, before reverse transcription from the 3’ adapter and amplification by PCR. However, RNA ligase has strong sequence preferences and, unless addressed, these lead to bias in the final results of sequencing experiments.

All four of the papers below show major reductions in bias for small RNA-seq protocols.

I particularly like the experiments performed in the Silence paper using a degenerate 21-mer RNA oligonucleotide. Briefly, the theory is that a 21-mer degenerate oligo has trillions of possible sequence combinations (4^21, about 4.4 trillion) and that in a standard sequencing run each sequence should appear no more than once, as only a few million sequences are read. The results from a standard Illumina prep showed strong biases for some sequences and were significantly different from the expected Poisson distribution, with almost 60,000 sequences found more than 10 times instead of once as expected (the red line in figure A from their paper reproduced below). When they used adapters where four degenerate bases were added to the 5′ end of the 3′ adapter and to the 3′ end of the 5′ adapter, they achieved results much closer to those expected (blue line).
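To make the "no more than once" expectation concrete, here is a minimal Python sketch of the Poisson arithmetic. It is my own back-of-envelope calculation, not code from the paper: the function name is hypothetical, the read depth of 10 million is an assumed stand-in for "a few million", and it assumes every 21-mer is equally likely to be read.

```python
import math

def expected_sequences_seen_more_than(n_reads, threshold, oligo_length=21, max_terms=60):
    """Expected number of distinct sequences read more than `threshold` times.

    Assumes every possible sequence is equally likely (no ligation bias), so
    the count for any one sequence is approximately Poisson distributed with
    mean n_reads / 4**oligo_length.
    """
    n_sequences = 4 ** oligo_length      # 4 bases at each degenerate position
    lam = n_reads / n_sequences          # mean reads per distinct sequence
    # Sum the upper tail P(X > threshold) directly, in log space, because
    # computing 1 - P(X <= threshold) would round to zero for such a tiny mean.
    tail = sum(
        math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        for k in range(threshold + 1, threshold + 1 + max_terms)
    )
    return n_sequences * tail

n_reads = 10_000_000  # assumed read depth, for illustration only
print("possible 21-mers:", 4 ** 21)
print("expected sequences seen more than once:",
      expected_sequences_seen_more_than(n_reads, threshold=1))
print("expected sequences seen more than 10 times:",
      expected_sequences_seen_more_than(n_reads, threshold=10))
```

Under these assumptions only a handful of sequences should turn up twice and essentially none should turn up more than 10 times, which is why tens of thousands of sequences above that threshold point to ligation bias rather than chance.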

I don’t think we should be too worried about differential expression studies: as long as the comparisons used the same methods for both groups, the results we have are probably true. However, we may well have missed many small RNAs because of the bias, and our understanding of biology is likely to be enhanced by these improved protocols.

The recent papers:

Jayaprakash et al NAR 2011: showed that RNA ligases have “significant sequence specificity” and that “the profiles of small RNAs are strongly dependent on the adapters used for sample preparation”. They strongly suggest modifications to current protocols for smallRNA library prep using a mix of adapters, "the pooled-adapter strategy developed here provides a means to overcome issues of bias, and generate more accurate small RNA profiles."

Sun et al RNA 2011: "adaptor pooling could be an easy work-around solution to reveal the “true” small RNAome."

Zhuang et al NAR 2012: showed that the bias of T4 RNA ligases is not simply a sequence preference but is affected by structural features of RNAs and adapters. They suggested "using adapters with randomized regions results in higher ligation efficiency and reduced ligation bias".

Sorefan et al Silence 2012: demonstrated that secondary structure preferences of RNA ligase impact cloning and NGS library prep of small RNAs. They presented “a high definition (HD) protocol that reduces the RNA ligase-dependent cloning bias” and suggested that “previous small RNA profiling experiments should be re-evaluated” as “new microRNAs are likely to be found, which were selected against by existing adapters”, a powerful if worrying argument.


  1. Hi James, thanks for the information. So if you were a crazy scientist who wanted to know all the miRNAs being expressed in the salivary glands of flies, would it be better to use NGS on the miRNA prep from the gland to do it, rather than use a microarray of miRNAs (if there is such a thing?!)

    Also, speaking as a newbie, besides identifying the miRNAs present in a sample, can you also measure differences in expression levels of specific miRNAs?

  2. Hello there, I was searching for information about T4 ligase and came across your blog. It is very informative and entertaining, and it shows that you're an expert in your field.
    I will definitely be back for more. Keep it up!