I'm at the 6th UK Genome Science conference in beautiful and vibrant Birmingham. Attendance is good and the place is full, with a real buzz about new technologies (more about that in a sec), the work people are doing and the freebies on the stands!
I'm not going to attempt to round up each and every day but I'll probably post tomorrow and Wednesday on my favourite bits. Here's what happened yesterday...
Daniel MacArthur (Mass Gen, Broad, Harvard): Dan spoke about the lessons from analysis of 60,000 human exomes. He was making the point that "making sense of one genome requires tens of thousands of genomes." In ExAC (the Exome Aggregation Consortium) they've gathered almost 100,000 exomes, which have then been consistently analysed as a single batch to produce a single VCF. Dan showed that by doing this they were able to remove noise from rare disease analysis - and although I'm not sure about the impact on cancer exomes, it is likely to make it easier to filter variants from those datasets too.
Dan focused his talk on a subset of 60,000 unrelated high-quality exomes, which were consented for public data sharing, with no known severe paediatric disease. Nearly all of this data was generated on the Broad genomics platform, and pulling the collection together was surprisingly smooth: "the science was harder than the politics" according to Dan. The analysis found around 1 variant every 6bp! Most of these are rare and novel: 50% are singletons and 30% are seen in 2-10 individuals, but over 10% have 0.01% allele freq; also, due to the size of the dataset, almost 10% of variants are multi-allelic. The data show that both PolyPhen and CADD predict deleterious states pretty well. ExAC is likely to become more important for variant filtering: have a look at the ExAC Browser-beta.
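To make the frequency classes above a bit more concrete, here is a minimal Python sketch of how allele counts translate into allele frequencies and how a frequency cut-off might be used for variant filtering. It is not ExAC's own code: the function names, the 1e-4 threshold and the use of allele count as a stand-in for "number of individuals" are all my assumptions for illustration.

```python
def allele_frequency(allele_count, allele_number):
    """Fraction of called chromosomes that carry the alternate allele."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def frequency_class(allele_count):
    """Bin a variant by allele count (assumed proxy for carrier count)."""
    if allele_count < 1:
        raise ValueError("allele_count must be at least 1")
    if allele_count == 1:
        return "singleton"
    if allele_count <= 10:
        return "2-10"
    return ">10"


def is_rare(allele_count, allele_number, max_af=1e-4):
    """True if the variant is at or below a hypothetical frequency cut-off."""
    return allele_frequency(allele_count, allele_number) <= max_af


# 60,000 diploid exomes give up to 120,000 called chromosomes per site.
CHROMOSOMES = 2 * 60_000
singleton_af = allele_frequency(1, CHROMOSOMES)  # roughly 8.3e-6
```

With a reference panel this size, even a singleton has a frequency below one in 100,000, which is why a large panel is so useful for throwing out variants that are too common to explain a rare disease.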
Bill Hanage: Bill's talk was provocatively titled "Is genomics biology?". He started with some quotes from biologists: "It's nice when genomics tells you something about biology", "You're only interested in ACGTs why not go do some biology", and "I'm not a genomicist I'm a biologist"! The upshot is that genomics can be biology, but it's up to you to make it so. Bill recommends biologists read J Maynard Smith's Mathematical Ideas in Biology.
Mark Akeson (UCSC): "Using nanopores for protein sensing". Mark started his talk by describing himself as a self-confessed ONT fanboy. He tried to convince us that his excitement about the technology is driven by a long-time association with nanopore sequencing, and the fact that ONT is working. The MinION Analysis and Reference Consortium (MARC) labs show very high consistency, with accuracy >90% and 99% of 2D reads mapped. Other labs are publishing data and showing that the MinION can be used to generate fully assembled genomes e.g. E. coli, V. cholerae, B. pertussis (67% GC).
He showed two biological results. First was the resolution of CT47 gene copy number across a 50kb gap, using a series of long reads which aligned and resolved 8 copies in the clone investigated. They normalised coverage to single-copy regions of the genome for verification. Second was a lovely base modification sequencing example of tRNA modifications in yeast. They are using DNA adapters ligated to the ends of tRNA to pull the tRNA apart and through the pore; ideally they want to optimise the motor protein. Lastly Mark talked about some protein sequencing, which was very exciting. ONT are moving quickly beyond genomes.
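The coverage check for the CT47 copy number can be sketched in a few lines of Python. This is only my illustration of the general idea (read depth over the repeat divided by read depth over single-copy sequence); the function name, the use of medians and the toy depths are assumptions, not what Mark's group actually ran.

```python
from statistics import median


def estimate_copy_number(repeat_unit_depths, single_copy_depths):
    """Estimate copies of a repeat unit from read depth.

    repeat_unit_depths: per-base depths when all reads are piled onto
        ONE copy of the repeat unit.
    single_copy_depths: per-base depths over regions known to occur once.
    """
    baseline = median(single_copy_depths)
    if baseline <= 0:
        raise ValueError("single-copy baseline depth must be positive")
    return median(repeat_unit_depths) / baseline


# Toy numbers: the repeat unit piles up 8x deeper than single-copy sequence.
copies = estimate_copy_number([80, 82, 78], [10, 10, 10])
print(round(copies))
```

If the depth-based estimate agrees with the number of copies seen directly in the long reads, that is reassuring evidence the assembly across the gap is right.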
Mark described the world as being occupied by tinkerers and out-of-the-boxers; tinkerers are likely to benefit from the MAP, others may need to wait. His top tip was to be very careful about sample prep! Any corners cut are visible in your results. Phenol chloroform is best! Contaminants such as polysaccharides, glycogens etc. might be problematic.
Dan Turner (Director of applications @ONT): Dan gave a brief overview of where ONT are and described work in his group on microbial genomics and complex metagenomics. He described library prep: spoke briefly about differential lysis of mammalian/bacterial cells, DNA extraction with Qiagen 500 tips, fragmentation, end repair, A-tailing, adapter ligation, and a tether that increases the efficiency of pore loading. Mentioned lrPCR of mitochondrial genomes. Described direct RNA-seq: work in progress but looking good; ligate adapters to RNA and sequence directly, or make 1st strand cDNA and add the adapter by RNA:DNA ligation.
For the microbial applications Dan described the "What's in my pot" (WIMP) workflow, available through Metrichor, which uses Kraken to place read k-mers in a database of genome k-mers. They used barcoded libraries to confirm WIMP was quantitative and were able to predict the sensitivity and decide on run length accordingly (Run Until). It would be wise to think about quantitative spikes using multiple genes at different concentrations, and make sure you use controls.
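For anyone who hasn't met k-mer classification before, here is a toy Python sketch of the idea of placing read k-mers in a database of genome k-mers. It is deliberately much simpler than Kraken, which maps each k-mer to the lowest common ancestor in a taxonomy and scores paths through the tree; this version just counts k-mers unique to one genome and takes a majority vote. All names and the toy sequences are made up for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """All overlapping k-mers of a sequence, upper-cased."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(genomes, k):
    """Map each k-mer to the set of genome names that contain it."""
    index = {}
    for name, seq in genomes.items():
        for kmer in set(kmers(seq, k)):
            index.setdefault(kmer, set()).add(name)
    return index


def classify_read(read, index, k):
    """Assign a read to the genome with most unique k-mer hits, or None.

    k-mers shared between genomes are ignored (a simplification; Kraken
    would assign them to a common ancestor instead). Ties go to the
    genome that received its first vote earliest.
    """
    votes = Counter()
    for kmer in kmers(read, k):
        owners = index.get(kmer)
        if owners is not None and len(owners) == 1:
            votes[next(iter(owners))] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy database of two made-up "genomes".
toy_genomes = {"bug_A": "ACGTACGTTGCA", "bug_B": "GGGCCCTTTAAA"}
toy_index = build_index(toy_genomes, k=4)
```

Counting how many reads land on each genome gives a rough abundance profile, which is the kind of number you would want to sanity-check against spike-ins and controls.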
Dan described the VolTRAX for automated library prep and an experiment using iChips to grow single cells in situ from river water.
Michael Schnall-Levin (10X Genomics Vice President of Computational Biology and Applications): Spoke about their phased sequencing approach and the recent NIST data release.
Cameron Frayling (Base4): Cameron gave an overview of the Base4 technology and an update on where the company is with development. I've previously written about their approach; this still looks like one to watch.
All in all a great start to the conference. Roll on day 2 and the street food!