I don’t usually read Proteomics papers, but I have been thinking about how we might combine single-cell genome and transcriptome sequencing with Fluidigm’s Helios (CyTOF), and have been trying to get better acquainted with Proteomics methods. In doing so I found this excellent paper: Proteomic Biomarker Discovery in 1000 Human Plasma Samples with Mass Spectrometry. The paper is probably a tour-de-force of Proteomics, even if the published results were not stunning, but not being a Proteomecist I’m not sure I’m qualified to say that. It is obvious that the group working on this put a large amount of effort into experimental design ahead of the mass-spec work.
Figure from Cominetti et al. 2016
Any large project needs to consider design very carefully, from working out which factors need to be controlled for to deciding which controls to use. The experimental design for the plasma proteomics paper is illustrated above; they used a controlled-randomised plate layout to remove plate confounding effects for sample origin, gender, age, ethnicity, BMI, blood pressure, glycemic indices, and clinical biochemistry.
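To make the idea of a controlled-randomised layout concrete, here is a minimal sketch of one way to do it: shuffle samples within each stratum, then deal them round-robin across plates so that every plate gets a near-equal share of every stratum. This is not the authors' procedure (they balanced many more variables, including continuous ones such as BMI); the function name, the `centre` and `sex` fields, and the sample counts are all made up for illustration.

```python
import random
from collections import Counter, defaultdict


def balanced_plate_assignment(samples, n_plates, key, seed=0):
    """Spread each stratum evenly across plates.

    samples : list of dicts describing samples (hypothetical structure)
    key     : function returning the stratum of a sample, e.g. (centre, sex)
    Returns a dict mapping plate index -> list of samples.
    """
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    strata = defaultdict(list)
    for sample in samples:
        strata[key(sample)].append(sample)

    plates = defaultdict(list)
    offset = 0
    for stratum in sorted(strata):
        members = strata[stratum]
        rng.shuffle(members)  # randomise order within the stratum
        for i, sample in enumerate(members):
            plates[(offset + i) % n_plates].append(sample)
        offset += len(members)  # keep dealing from where the last stratum stopped
    return dict(plates)


# Illustrative data: 96 samples from 4 centres, two sexes (all values invented).
samples = [
    {"id": f"S{i}", "centre": f"C{i % 4}", "sex": "F" if (i // 4) % 2 else "M"}
    for i in range(96)
]
stratum_of = lambda s: (s["centre"], s["sex"])
plates = balanced_plate_assignment(samples, n_plates=4, key=stratum_of)

for plate, members in sorted(plates.items()):
    print(plate, len(members), Counter(stratum_of(s) for s in members))
```

Because no plate ends up dominated by one centre or one sex, a plate effect cannot masquerade as a centre or sex effect. A real design would also need to handle continuous covariates, for example by binning them before stratifying.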
Reducing mass-spec variability with tandem-mass-tags: The key to making the data comparable across what was over 300 mass-spec runs was the use of tandem-mass tags (TMTs) purchased from Thermo Scientific (Rockford, IL, USA). These chemically label each sample with a distinct tag, allowing multiplexing of up to 10 samples per run. With a carefully designed experiment it is possible to reduce the impact of run-to-run variability. Much in the same way as we designed projects using multi-sample microarrays, the experimental groups are balanced across mass-spec runs. I’ve learnt a lot more about tandem-mass tagging in Proteomics over the last 18 months after hearing about the tech in an internal seminar. It seems that this approach is going to allow Proteomics researchers to take advantage of the statistical tools developed for gene expression array analysis.
The group used a pair of control samples in each run, further reducing the impact of technical variability. 304 TMT 6-plex mass-spec runs were performed, with each 6-plex containing two standards and four samples. 1000 patient plasma samples were processed in nineteen 96-well plates over a period of just 15 weeks. All sample handling was tracked, although they did not describe their tracking or say whether they used a LIMS. The paper is a great example of careful experimental design and one I thought was well worth sharing outside the Proteomics community.
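As a rough illustration of why standards in every run help, the sketch below divides each sample channel by the mean of the two standard channels from the same run. I am assuming this simple ratio-to-reference scheme purely for illustration; the paper's actual normalisation may well differ, and the channel positions and intensity values are invented.

```python
from statistics import mean


def normalise_run(intensities, standard_channels=(0, 1)):
    """Express each sample channel relative to the standards in the same run.

    intensities       : reporter intensities for one protein in one 6-plex run
    standard_channels : positions of the two standards (assumed to be 0 and 1)
    Returns the four sample values as ratios to the mean standard intensity.
    """
    reference = mean(intensities[c] for c in standard_channels)
    return [
        value / reference
        for channel, value in enumerate(intensities)
        if channel not in standard_channels
    ]


# Same hypothetical samples measured in two runs; run B reads 1.5x higher overall.
run_a = [100.0, 100.0, 50.0, 200.0, 100.0, 150.0]
run_b = [value * 1.5 for value in run_a]

ratios_a = normalise_run(run_a)
ratios_b = normalise_run(run_b)
print(ratios_a)
print(ratios_b)
```

A run-wide gain affects the standards and the samples alike, so it cancels in the ratio and the two runs become directly comparable. As an aside, the numbers quoted above are self-consistent: 304 runs × 6 channels = 1824 = 19 plates × 96 wells.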
Between 150 and 200 proteins were identified, and the authors argue strongly that this was only possible because of the use of TMTs. Label-free mass-spec approaches would have introduced more variability and taken significantly longer (38 weeks by their estimation). However, after crunching the numbers, only two proteins in human plasma had a significant correlation with BMI. Both were shown to be associated with obesity.
NGS experimental design: We're lucky to have such a large number of sample barcodes available for NGS experiments. We can usually fit the whole experiment into one library prep plate and a single sequencing pool, and so remove almost all the confounding technical issues. However, this does not mean we should skip careful design of NGS experiments. Taking a little time to discuss the major question(s) being asked, the samples available, and the methods we'll use in both wet and dry labs is time very well spent.