Tuesday, 3 May 2016

How many reads to sequence a genome?

Last year I posted about the Lander-Waterman equation used to calculate the number of reads needed to sequence a sample. I explained that this general equation (C = LN/G) can be rearranged to allow you to compute the number of reads (N) needed to sequence a genome of known size (G) to a specific coverage (C) using reads of a specified length (L). Today I finally finished my Calculoid NGS reads calculator to make the calculation easier to access. Feel free to use and reuse it as you see fit.
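If you'd rather script it than use a web calculator, the rearrangement N = CG/L is a one-liner. The snippet below is only a minimal sketch and is not the Calculoid calculator itself: the function name `reads_needed` and the example values (a roughly 3.2 Gb genome, 30x coverage, 150 bp reads) are my own assumptions for illustration.

```python
import math


def reads_needed(genome_size_bp, coverage, read_length_bp):
    """Return N from the rearranged Lander-Waterman equation N = C * G / L.

    The result is rounded up because you cannot sequence a fraction of a read.
    """
    if genome_size_bp <= 0 or coverage <= 0 or read_length_bp <= 0:
        raise ValueError("genome size, coverage and read length must be positive")
    return math.ceil(coverage * genome_size_bp / read_length_bp)


# Illustrative (assumed) values: ~3.2 Gb genome, 30x coverage, 150 bp reads.
n = reads_needed(genome_size_bp=3_200_000_000, coverage=30, read_length_bp=150)
print(f"{n:,} reads")
```

Note that N counts individual reads of length L, so for a paired-end run you would halve it to get the number of read pairs. The equation also assumes every read is usable, so in practice you would want some headroom for duplicates and low-quality reads.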

Friday, 29 April 2016

SPRI alternatives for NGS

SPRI beads, generally in the form of AMPureXP beads, are almost ubiquitous in genomics applications such as library prep for NGS. The most popular thing I've ever written was a post on this blog four years ago: "How do SPRI beads work?" with almost 100,000 readers - people obviously want to understand how this wonderful technology can be used. They'd also like it to be cheaper as the Agencourt AMPureXP product (now Beckman) is somewhat expensive - so I've taken a look at some of the alternatives on the market, including DIY SPRI!

Thursday, 28 April 2016

The junior doctors' strike...as explained to me

I met two dads last night: one a junior doctor picking his daughter up from the same Karate class as my son; the other, a consultant, at the pub after a long day (covering for junior doctors). Both of them spoke about the junior doctors' strike and some of what they said is mind-boggling. How have we come to such a sorry state in the NHS? Are doctors simply overpaid since contracts were changed by Blair and co? Are the managers of the NHS just after balancing a budget with no thought for the impact their decisions might have on other departments? And does Jeremy Hunt deserve to have his name used as Cockney rhyming slang?

This is how the situation was explained to me:
  • A BMJ paper in September 2015 showed that patients are 10-15% more likely to die if they are admitted on a weekend. [This work has been criticised as flawed due to the complexities of comparing patients admitted on different days - you may actually be sicker at the weekend. The paper also has an erratum, as it turned out that one of the authors, Bruce Keogh (medical director of NHS England), is "a long standing proponent of improving NHS services seven days a week". Why the BMJ should have had to "have requested that this be included in the authors’ conflict of interest statement" I don't know - it seems obvious to me that it should have been included.]
  • Because of this the government in England has decided that NHS services should be available seven days a week.
  • They have proposed a new contract for junior doctors that changes the hourly rate they get after 7pm and at the weekends. Junior doctors also won't get annual pay increments; instead pay progression will be linked to training.
  • Junior doctors already work weekends.
  • Junior doctors say the new contract will not add new doctors to the system, but it will make them cheaper to pay at the weekend.
  • Both parties cannot come to an agreement so junior doctors have gone on strike.
How will pay change: The Financial Times has a nice graphic showing how the current pay structure compares to those proposed by the government or the BMA. The BBC also has a graphic showing the junior doctors' rates for unsocial hours. I used their numbers to try and see how different the positions are, and it really did not seem to be worth striking over - I make the BMA's suggestion cheaper than the government's!?!

From FT.com

The aim appears to be to get more doctors working at weekends, and the new contract should make paying them at weekends cheaper than today, so money will be saved. But the consultant I spoke to used a single example to show how much waste is happening in the NHS - and argued that fixing this would have far more impact than changing junior doctors' rates of pay.

An example of NHS waste: The consultant described an elderly patient who needed a mental health referral before discharge. However, the person who would do this was on holiday for a week, so the consultant's patient had to stay in hospital. This blocks the bed for 7 days at around £400 per night (that's the equivalent of The Goring in London, or The Waldorf Astoria in New York). If 10 patients are in the same situation then the bill for this person's one week's holiday comes to £28,000, or the equivalent of a junior doctor's salary...and most people get six weeks holiday in the NHS.

The NHS does wonderful things for the UK but successive governments have seriously messed it up. My wife recently spent time in Addenbrooke's and whilst she received adequate care, the communication from doctors and nurses was abysmal, the food cannot be described using any positive adjectives, and the accommodation is pretty terrible given the cost of the bed for the night. I'd love to sing the praises of the NHS but it does feel like big changes are needed...but I am not suggesting I can come up with any useful ideas!

Thursday, 21 April 2016

Fluidigm's single cell issues appear to be fixed...sort of

Fluidigm hosted a webinar this afternoon to describe the performance of the redesigned IFCs for medium-sized cells (10-17 µm), and they've also released a whitepaper on their website describing their work. The redesigned medium-cell 96 IFCs show a >4-fold reduction in doublet rate, which now sits at around 7%.

This is a significant improvement (and I've written up my notes from the webinar below)...but is this level of doublets low enough for the kind of experiments single-cell users might be planning?

Monday, 11 April 2016

WGS versus Exome: what's best for cancer?

I've just started using Twitter polls and kicked things off with a couple asking followers whether they thought cancer research (or diagnostics) is best investigated using whole genome sequencing, higher depth exomes, much higher depth amplicomes, or even long-reads. Please do take the poll and join in the conversation here or over at Twitter.

Here's a link to the cancer research poll, and here's one to the cancer diagnostics poll. Pass them on to your colleagues. I'm going to follow up in a week with a post outlining my thoughts, particularly on where long-reads might have a real impact in understanding structural variation in cancer.

Thursday, 17 March 2016

How much is a 2nd hand HiSeq?

We recently bought two HiSeq 4000 instruments to run larger RNA-seq and exome projects on. Over the next few months we'll be transitioning other library types where we can, and we think perhaps 65-85% of the work we do can be ported over from the 2500. Look on eBay and you can find multiple HiSeq instruments for sale, and I've been contacted by increasing numbers of people wanting to sell 2500s - but buying a 2nd hand machine is not as simple as it looks.

Wednesday, 16 March 2016

How many single cells are needed in a single-cell sequencing experiment

The landscape for gene expression analysis (and many other analyses) is moving away from measurements made in bulk tissues to single-cell methods. Bulk measurements are an average of the cells in the sample and so cannot truly reveal the subtleties of the biology in the sample. To get closer to the truth we do need to adopt single-cell methods, and this means making a choice as to which system you might run in your lab. Of course bulk measurements are still very powerful in understanding biology and we should not stop using the methods we've worked with for decades - but we should be thinking carefully about whether the specific question being asked could be better answered with, or only answered with, single-cell methods.

I started writing this post because I'm getting my head around the different methods for single-cell analysis. I'm trying to keep my focus on two areas right now: single-cell mRNA-seq analysis and copy number variation. For both, a question that comes up all the time is "How many single cells are needed in my experiment?" and right now that is not a question I feel I can give a robust answer to!

In their Nature Reviews Genetics review article Shapiro, Biezuner and Linnarsson use some relatively simple back-of-the-envelope calculations that lead them to conclude "hundreds or thousands of single cells will need to be analysed to answer targeted questions in single tissues". The current systems on offer allow capture and sequencing of cells in this range, so in theory any will be usable for experiments. I've briefly summarised some of the main contenders today, including DROP-seq, Fluidigm, Wafergen and 10X Genomics.

The next few years are likely to see the continued development of single-cell systems. Which platform labs should invest in is going to be difficult to answer, and this feels very much like the early days of NGS when we were choosing between Illumina and SOLiD: expensive instruments, rapidly developing technologies and uncertainty about which will come out top dog.

I'll be adding to this table over the next few months as I look into other systems, feel free to suggest other technologies to add and do point out inaccuracies where you see them.

Thursday, 10 March 2016

Paper retracted after @Exome-seq alert on Twitter

I thought I'd post a brief update on a story from last year where I was alerted by @MattiasAine via one of my TwitterBots about a group ripping off his Karlsson 2014 paper. The evidence was pretty damning - almost identical data, analysis, and figures simply rotated in the Zhao 2015 version. I alerted the editors of Tumor Biology to the problems with the 2015 paper and it has now been retracted and taken down from the site.


An opportunity for you in the Genomics Core facility?

I've just posted an advert for a Principal Scientific Associate to join the Genomics Core as the Deputy Manager. This is going to be a key position in the team as we're building our single-cell genomics capabilities. We're looking for someone with a PhD in a relevant subject or with equivalent experience; you'll have extensive hands-on laboratory experience in molecular biology and NGS, and ideally you'll have experience with single-cell genomics technologies. This is a senior post in our team and you'll have the opportunity to make a significant contribution to the science in our institute.

The Genomics Core at the Cancer Research UK Cambridge Institute, a department of the University of Cambridge, is one of several core facilities in one of Europe's top cancer research institutes. We are situated on the Addenbrooke's Biomedical Campus, and are part of both the University of Cambridge School of Clinical Medicine, and the Cambridge Cancer Centre. The Institute focus is high-quality basic and translational cancer research and we have several research groups with an excellent track record in cancer genomics 1, 2, 3. The majority of data generated by the Genomics Core facility is Next Generation Sequencing, and we support researchers at the Cambridge Institute, as well as nine other University Institutes and Departments within our NGS collaboration.

You can get more information about the lab on our website. You can get more information about the role, and apply on the University of Cambridge website.

Monday, 7 March 2016

A personal journey in Genomics

Eric Minikel is lead author on a paper that came out in STM in January: Quantifying prion disease penetrance using large population control cohorts. The story is an incredibly personal one involving his wife Sonia, which Eric documents at the cureFFI.org blog. The paper reports on the penetrance of variants in the prion protein gene and the risk of actually getting prion disease. They analysed 16,000 case exomes and 60,706 population controls, and verified their findings in data provided by 23andMe (read their coverage as well). They found that missense variants in PRNP are 30x more common in the population than expected. Unfortunately Eric's wife Sonia Vallabh carries one of four of these variants that virtually guarantee the disease will develop, the same one that led to her mother's death from fatal familial insomnia, hence cureFFI.org.

Fig 1 from the STM paper: PRNP variant freq. is 30x higher in controls

Eric and Sonia were not scientists when they first heard about FFI or PRNP. But they retrained and are now both at the Broad Institute in Boston: Eric is in Daniel MacArthur's lab, and Sonia is in the lab of Stuart Schreiber. The STM paper is based on an analysis of the ExAC dataset.

The results reveal the penetrance of the many variants in PRNP and change the view many doctors may have had, that any variant was associated with a 100 percent risk of developing the disease. Eric et al identified benign missense variants and showed others spanned a spectrum of penetrance from 0.1 to ~100%. They provide quantitative estimates of lifetime risk and, importantly, in the paper they discuss the problems of assessing penetrance and risk. Even the ExAC dataset is only "approaching the size and quality required for such analyses", which limits the study, and the sequencing data generated today are imperfect, with many genomic loci presenting challenges for sequencing technologies and variant calling.

Sonia's working in Stuart Schreiber's lab trying to find treatments for human prion diseases. The STM discussion suggests that reducing prion protein expression may be an option. Within the ExAC data they found heterozygous loss-of-function variants in three healthy people showing no effect from a 50% reduction in gene dosage for PRNP, so reducing PRNP dosage in patients may be tolerated. Whether they can develop small molecules or other methods to achieve this will be the focus of a lot more research.

The work would have been impossible without the groundwork laid by the ExAC team; see their preprint on bioRxiv: Analysis of protein-coding genetic variation in 60,706 humans.