Monday, 6 February 2012

What did people think would be possible with a $1,000 genome five years ago?

In 2007 Nature Genetics asked "What would you do if it became possible to sequence the equivalent of a full human genome for only $1,000?" Forty-nine people responded, and the list contains some of the great and good in genome sciences, including Mike Stratton, James Lupski, Detlef Weigel, Leena Peltonen-Palotie and others. It is an interesting read and I'd encourage you to take a look. I'd also recommend James Watson's The Double Helix and Robert Weinberg's One Renegade Cell as good reads that give a point of view from the past.

The answers these people gave included a lot of speculation but covered some common themes: personalized medicine, cancer, epigenetics, exomes and the likely "death of arrays". The ethical challenges that such cheap sequencing creates came up several times. And a few answers focused on some of the more frivolous things we might be able to do with such cheap sequencing, such as understanding the genomics of beauty, musical ability and even the perfect golf swing!

It is not fair to comment negatively on any specific predictions; after all, I am sure very few people on the planet thought we would be able to produce 1Tb of data in ten days on a HiSeq just five years after Illumina purchased Solexa.

Some of my favourite bits:
Michael Rhodes (Sr Manager Sequencing Portfolio at Life Technologies) pointed out "if the 6-Gb diploid human genome costs a mere $1,000, a 4-Mb haploid genome would cost $0.67." Julian Parkhill (Wellcome Trust Sanger Institute) asked the question "what will you do with a $1 bacterial genome?" Both of these answers resonate with a previous post on this blog, "a $1000 genome requires a $1 sample prep". Perhaps they draw attention to the fact that a $1 genome might require an even cheaper prep.
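The arithmetic behind that $0.67 figure is easy to check. Here is a minimal sketch, assuming the cost scales linearly with the number of bases sequenced and ignoring sample prep and any fixed per-run costs. The function name and its parameters are my own, made up for illustration.

```python
def cost_for_genome(target_bases, reference_bases=6e9, reference_cost_usd=1000.0):
    """Scale a reference genome price linearly by genome size.

    Assumption (not from the Nature Genetics piece): the price is a flat
    cost per base, with no sample prep or fixed per-run overhead.
    """
    cost_per_base = reference_cost_usd / reference_bases
    return cost_per_base * target_bases


# A 4-Mb haploid bacterial genome priced against a $1,000, 6-Gb human genome.
bacterial_cost = cost_for_genome(4e6)
print(f"${bacterial_cost:.2f}")  # prints $0.67
```

On these assumptions the sequencing itself costs well under a dollar, so almost any sample prep would dominate the bill.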

Paul Nurse (Chief Exec of the Francis Crick Institute) dared to speculate that all this sequence data might finally squash "the creationists and the intelligent designers under a mountain of base pairs". I suspect no amount of data would bury their faith in what they believe.

Stephen Scherer (Hospital for Sick Children/University of Toronto) wanted "curiosity and imagination [to] trump deep pockets" in the field of genomics. Some of the new PIs I speak to certainly worry that deep pockets appear to be winning the field. However, some clever thinking is giving them the chance to stay a step ahead of the huge projects and publish some fantastic work.

Leena Peltonen-Palotie (University of Helsinki) suggested sequencing only one of a monozygotic twin pair "since their genomes are identical, and you get two phenotypes for the price of one genome sequence", and focusing on those twins that are "discordant for important diseases like schizophrenia, autism or Alzheimer disease."

James Lupski (Baylor College of Medicine) wanted to "sequence the haploid genome of 100 sperm from each of ten men in whom the diploid genomic sequence was determined." This might be almost possible today using the Nextera technology from Illumina. It is theoretically possible to sequence a genome with no amplification other than cluster generation on the Illumina system. The experiment would be 1000 single-cell transposome reactions, all individually bar-coded and pooled, then run on as many lanes as possible until the sample was exhausted. There would be 'holes' in the genome but it's perhaps as close as we can get today.

I wondered where the large-scale population studies are that Paul Nurse (Cichlid fish explosion) and Trudy Mackay (500 inbred strains of Drosophila melanogaster) mentioned. Generating the primary sequence data for these should be simple today, and a 1000 genome project is almost in the realms of a PhD thesis or even a graduate dissertation.

Lastly I wonder if Samir Brahmachari (IGIB, India) actually did "invest in stocks of computer hardware companies involved in the data storage business"!

4 comments:

  1. Trudy Mackay's DGRP project is alive and well (http://en.wikipedia.org/wiki/Drosophila_Genetic_Reference_Panel), with ~200 complete Drosophila genomes of data in NCBI (http://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP000694)

  2. Flies from a farmers market, thanks for pointing it out to me.

  3. The Drosophila Genetic Reference Panel publication was covered on GenomeWeb today!

    http://www.genomeweb.com//node/1027101?hq_e=el&hq_m=1190348&hq_l=2&hq_v=eaeb291c42

  4. Really very huge post!! This might be almost possible today using the Next-era technology from Illumina. It is theoretically possible to sequence a genome with no amplification other than cluster generation on the illumina system. Keep it up.
