Thursday, 11 June 2015

RNA-seq contextualised: what's possible in 2D and in situ RNA sequencing

When George Church talks about something it often turns out to be a good idea to listen. He's been talking about in situ sequencing for many years and the technology looks to be ready for take-off. It could be a niche method used by a few very skilled groups, but if companies like Spatial Transcriptomics get their way we'll be using it routinely. I think in situ sequencing could be massive, but we'll see over the next eighteen months or so if I'm right. A big question remains over how easily the technology can move from sequencing RNA in the cytoplasm to DNA in the nucleus. Being able to call mutations in cells from a tissue biopsy would be great, but the inaccessibility of DNA might mean we'll stick to expressed mutations for now.

A simplified workflow of the Spatial Transcriptomics technology
RNA-seq has probably become the most common NGS method in use, and has all but killed microarrays for differential gene expression (DGE). Many studies also report RNA fluorescent in situ hybridisation (RNA-FISH) as a validation tool. RNA-seq shows the differential gene expression in bulk cells from a tissue; RNA-FISH confirms whether this DGE is located in specific cells within that tissue. Most FISH is done on a single gene at a time, although some groups are doing very highly multiplexed FISH which could potentially compete with in situ RNA-seq.

A short history: A PubMed search for "In Situ sequencing" returns several very interesting papers, mostly coming out of Harvard in the USA and SciLifeLab in Sweden. The first in situ sequencing method was developed for Sanger sequencing in 2001. In "Picogram Cloning and Direct In Situ Sequencing of DNA from Gel Pieces", Meijerink et al demonstrated that BigDye reactions could be performed on DNA in gel slices. Perhaps this was the inspiration for polony-seq, where gel is added? In 2003 Rob Mitra et al published the first demonstration of FISSEQ: Fluorescent in situ sequencing on polymerase colonies. This describes polony (polymerase colony) amplification of nucleic acids in acrylamide gel and demonstrated that 8bp sequencing reads could be achieved (get the paper here).

Two methods have recently been published: FISSEQ and targeted RNA-seq, and a third was described at AGBT13. The technology allows sequencing from cells without dissociation (dissociation is often a precursor to single-cell sequencing). The ability to retain information about cellular architecture is likely to make interpretation of the results easier, and will probably highlight some interesting biology so far unseen (or at least misunderstood).

FISSEQ - in situ mRNA-seq: The FISSEQ paper builds on earlier work by Church et al where they demonstrated that polymerase trapping and nucleotide extension in polyacrylamide could produce 8bp sequencing reads. They discussed what applications might suit FISSEQ, as well as what the limiting factors might be: mispriming, misincorporation, and incomplete extension. They certainly seem to have made significant progress over the last 12 years, and the work was described, quite rightly, as a tour de force in a Nature News and Views feature.

The FISSEQ method works with fixed tissue and uses random hexamers tailed with a sequencing adaptor for in situ reverse transcription. The cDNA is then circularised and amplified via rolling circle amplification (RCA) to generate DNA nanoballs (each containing multiple cDNA copies of the RNA template, covalently attached to surrounding macromolecules to keep them in place within the cell during processing), before sequencing with a SOLiD-like protocol for up to 27bp reads with 99.4% accuracy. They demonstrated RNA-seq in multiple cell/tissue types, and in whole mount embryos. By reducing the number of molecules sequenced they could detect multiple transcripts per cell. Comparison to Illumina RNA-seq and microarrays was good, and correlation improved with increased read depth up to about 10 reads per transcript. FISSEQ generated 43% mRNA reads, 43% rRNA, 7% ncRNA, and 7% asRNA. They were able to demonstrate that nuclear RNA was twice as likely to be non-coding, and antisense mRNA was almost twice as likely to be nuclear. A functional analysis of fibroblasts during simulated wound healing generated over 100,000 RNA-seq reads and the top 100 genes were enriched for GO terms associated with wound healing.

In their discussion they propose that the poor correlation of FISSEQ to RNA-seq for genes involved in RNA and protein processing may be due to the inaccessibility of some cellular structures or classes of RNA. Several blogs have posted about FISSEQ: RNA-seq here, here, and here, greenfluorescentblog, Medgadget, CoreGenomics; check them out as well. The ability to investigate mRNA localisation is likely to shine a new light on where certain cellular functions take place, perhaps identifying transcripts associated with senescence-associated heterochromatin foci? Greenfluorescentblog makes the case for using FISSEQ in highly polarised cells like neurons to investigate somatic vs dendritic or axonal RNAs - Church et al were collaborating with the Allen Institute for Brain Science!
 
Padlock RCA-seq - targeted in situ RNA-seq: this is one of at least two methods being developed at SciLifeLab (the other being Spatial Transcriptomics, described below). It starts with cDNA being generated in situ using an LNA-modified RT primer. Padlock probes are used to target specific RNAs, which are then clonally amplified by RCA before sequencing by ligation to generate on average twenty-five 4bp reads per cell with 98.6% accuracy, achieving 97% specificity when discriminating between human and mouse sequences. It is the counting of padlock-probe RCA products that determines differential gene expression; the short reads can be used to discriminate sequence and may be useful in detection of expressed mutations. They also sequenced gene-specific barcodes added to the padlock probes to determine differential gene expression of 31 transcripts in a tissue section (see figure 3 reproduced below). I thought it interesting that the SciLifeLab group used the same sequencing method as developed by George Church and commercialised by Complete Genomics, but George chose to use SOLiD sequencing.
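To make the counting idea concrete, here is a minimal sketch of how short barcode reads from RCA products might be tallied per cell. It is my own illustration, not the analysis used in the paper: the 4-base barcodes, the gene assignments, the cell IDs and the function name count_rca_products are all hypothetical, and it assumes the imaging step has already assigned each RCA product to a cell.

```python
from collections import Counter

# Hypothetical 4-base barcodes carried by the padlock probes (illustration only).
BARCODE_TO_GENE = {"ACGT": "ERBB2", "TTAG": "ACTB", "GCAT": "MYC"}


def count_rca_products(reads_by_cell):
    """Return {cell_id: Counter(gene -> number of RCA products)}.

    Each read is the short sequence decoded from one RCA product. A read
    that matches no known barcode is counted under "unassigned".
    """
    return {
        cell: Counter(BARCODE_TO_GENE.get(read, "unassigned") for read in reads)
        for cell, reads in reads_by_cell.items()
    }


# Made-up reads for two cells; "ACGA" is not a known barcode.
reads_by_cell = {
    "cell_1": ["ACGT", "ACGT", "TTAG"],
    "cell_2": ["TTAG", "GCAT", "ACGA"],
}
counts = count_rca_products(reads_by_cell)
print(counts["cell_1"])
```

Exact matching is the simplest possible decoder. A real pipeline would presumably need to handle base-calling errors, which this sketch ignores.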

Figure 3 from the Padlock RCA-seq paper

Spatial Transcriptomics - in situ mRNA-seq: (see image at the top of this post) this is the second of at least two methods being developed at SciLifeLab (the other being Padlock RCA-seq, above). This in situ mRNA-seq method (developed by Joakim Lundeberg and Patrik Stahl) does not generate the sequence data in situ. Rather, it makes 1st strand cDNA in situ that, after pretty standard RNA-seq library prep, can be mapped back to an H&E image to return spatial information. This is achieved by using a histology slide that has been arrayed with oligo-dT, where each spot contains a unique barcode 5' to the dT. Cellular mRNA hybridises to the spots of barcoded oligo-dT directly below the cell. The density of spots is the limiting factor in resolution, which means that with dense enough arrays the method could generate very high-resolution data. And as the sequencing is done on standard Illumina instruments, read depth can be as high or low as you like, although reads are restricted to the 3' end of transcripts (not a big problem for most apps). The method is being commercialised by Spatial Transcriptomics, who are now offering early access, although only on mouse brain tissue, and at quite a high price per sample. The technology is only available as a service for now and from a limited number of tissues. Genomeweb had coverage in May.
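The barcode-to-position idea can be sketched in a few lines of Python. This is only my illustration of the principle, not the company's pipeline: the barcode sequences, the 6-base barcode length, the spot coordinates, the gene names and the function name assign_reads_to_spots are all assumptions, and gene assignment is assumed to have happened in a separate alignment step.

```python
from collections import Counter, defaultdict

# Hypothetical barcode-to-spot table: each arrayed spot carries a unique
# barcode 5' of the oligo-dT, so a barcode identifies an (x, y) position.
SPOT_BARCODES = {
    "ACGTAC": (0, 0),
    "TGCATG": (0, 1),
    "GGATCC": (1, 0),
}
BARCODE_LENGTH = 6  # assumed length, for illustration only


def assign_reads_to_spots(reads):
    """Count genes per spot from (barcode_read, gene) pairs.

    `barcode_read` is the read that starts with the spatial barcode; `gene`
    is the gene the paired cDNA read was assigned to by a separate alignment
    step. Reads whose barcode is not in the table are dropped.
    """
    counts = defaultdict(Counter)
    for barcode_read, gene in reads:
        spot = SPOT_BARCODES.get(barcode_read[:BARCODE_LENGTH])
        if spot is not None:
            counts[spot][gene] += 1
    return counts


# Made-up reads; the last one has an unknown barcode and is dropped.
example_reads = [
    ("ACGTACTTTT", "Actb"),
    ("ACGTACTTTT", "Actb"),
    ("TGCATGTTTT", "Gfap"),
    ("NNNNNNTTTT", "Actb"),
]
spot_counts = assign_reads_to_spots(example_reads)
for spot, genes in sorted(spot_counts.items()):
    print(spot, dict(genes))
```

The per-spot counts can then be overlaid on the H&E image using the spot coordinates, which is where the spatial information comes from.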

If Spatial Transcriptomics can make this a robust technique where they can send high-density slides to any lab that wants to use the technology, then they will be onto a winner. Spotted microarrays can be produced at very low cost and in high volume, and I could see this technology becoming the dominant one of the three mentioned here. However, if the technology can only be run in service labs, and remains orders of magnitude more expensive than traditional RNA-seq, it will stay a niche method - albeit an important one.

Other methods: It is possible to reconstruct spatial information computationally. Aviv Regev et al developed the computational tool Seurat to infer individual cells' spatial origin by combining RNA in situ hybridisation with single-cell sequencing, and Marioni et al used a similar approach in their 2014 Nat Biotech paper but used publicly available ISH gene expression atlases. It is also possible to perform serial sectioning and sequencing (Cryo-seq), where RNA is extracted from each section, allowing transcriptomes to be reassembled, as demonstrated in 25 µm slices from Drosophila embryos (see Michael Eisen's blog for comments on this paper) or with TOMO-seq in zebrafish. Of course these methods require many RNA-seq libraries to be made (the TOMO-seq method simplifies this last bit). Another "in situ sequencing" method that came up when I was reading around this subject was the Shendure lab's in situ library prep of high molecular weight DNA in a flowcell and contiguity sequencing, although I've not seen any other publications using this method.

Single-cell or in situ sequencing: Of course the many single-cell methods will compete with in situ sequencing in some applications. I don't think it's sensible to try and choose between the two methods as both are being used in different niches. I suspect we'll see convergence for some experiments and debates about the merits of one technology over another. For now we're just going to have to learn as much as possible about both and apply them as best seems to fit the experimental questions in hand.

The future for in situ sequencing: I think Spatial Transcriptomics could be the easiest method for labs to adopt, but only if it is available in kit form.

In a recent review of in situ sequencing Marco Mignardi and Mats Nilsson from SciLifeLab discuss the impact that this technology might have on the clinic. Having worked on ERBB2 for many years, and having previously developed an assay for diagnostic use, I think it would be great if the in situ seq for ERBB2 presented in the Nature paper could be used clinically. However, the ease, low cost and relatively high sensitivity of IHC/FISH make in situ ERBB2 DNA and/or RNA-seq an unlikely target. They make a strong case for using in situ sequencing to better understand the contribution of the tumor microenvironment to cancer growth and/or tumor clonal evolution.

One of the challenges discussed in the papers is the limit on the number of molecules that can be sequenced, simply as a factor of how well individual RCA products (analogous to clusters on an Illumina flow cell) can be resolved. But Church et al report up to 400 reads per cell, and the increasing density of sequencing reads on all platforms has been one of the major factors in decreasing sequencing costs.

Of course RNA is not the end game, just the start - and probably the easiest place to start. Mutation detection, CNV analysis, fusion detection (TMPRSS2-ERG in prostate cancer for instance) and in situ proteomics by mass spec are all on the way. Another paper by Mats Nilsson demonstrated how in situ sequencing can be used for mutation detection (albeit from cDNA) using mutation-specific padlock probes with mismatches at their 3′-end. DNA ligase does not tolerate single nucleotide differences and effectively discriminates between the wild-type and the specific mutation. Padlock probes also contain a site for fluorescent reporters, green or red, for wt or mut respectively. They applied multiplexed probes to tissue microarrays to demonstrate high-throughput mutation screening. In a recent Nature Reviews Genetics innovation article, Hubrecht Institute scientists Nicola Crosetto, Magda Bienko & Alexander van Oudenaarden discuss the realities of moving beyond transcriptomics. In the review they compare FISH, FISSEQ, ISS, LCM, Mass Cytometry, and other methods. They finish up by suggesting the overlay of a tissue section over a MinION flowcell such that DNA/RNA can be read directly from the cell of origin above the pore (see figure below). Sounds like science fiction - but we all know how that goes in the world of NGS!

