Ben Raphael's lab at Brown University have just published THetA for tumour heterogeneity analysis; it's a great paper so download it and have a look yourselves.
Solid tumours have been shown to be highly heterogeneous, and this heterogeneity can underlie drug resistance and eventual relapse in patients. Individual tumours are admixtures of different tumour-cell sub-populations, and NGS data come from this mix. The THetA paper introduces a method to infer the starting population of distinct tumour cells.
The authors used several datasets, including that from "The life history of 21 breast cancers" from Peter Campbell's group at Sanger, where they reported kataegis: the "downpour" of mutations in distinct regions of cancer genomes. They used this paper as it is a great demonstration of cancer evolution: as breast cancers develop, individual cells accumulate mutations, and at diagnosis one (or a very few) populations have usually become dominant. As the mutational process proceeds it leaves behind an archaeological record that can be deciphered into a history or family tree of tumour development. THetA aims to identify the sub-populations of tumour cells present in a sample.
THetA determines distinct cell populations based on copy-number differences between those populations, and requires only 40x genome coverage to do so. Figure 3A from the paper shows the copy number of the sample as determined by read depth (gray), alongside the copy number for the normal cells (black), the dominant clone (blue) and the sub-clone (red). The software also builds a family tree of the different cell populations.
Figure 3: from the THetA paper
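To make the idea of reading populations out of read depth a little more concrete, here is a toy sketch in Python. It is not the THetA implementation: it simply assumes that the read depth in each genomic interval is a weighted average of the copy number in each cell population, that all intervals are the same length, and that depth is expressed relative to a diploid normal. The function name, the copy-number profiles and the mixing fractions are all made up for illustration.

```python
def expected_depth_ratio(copy_numbers, fractions):
    """Expected read-depth ratio per interval for a mixed sample.

    copy_numbers: one list per cell population, giving the integer copy
                  number of each interval in that population.
    fractions:    fraction of cells in the sample from each population.
    Assumes equal-length intervals and a diploid (copy number 2) baseline.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    n_intervals = len(copy_numbers[0])
    ratios = []
    for j in range(n_intervals):
        # Average copy number of interval j across the whole mixture.
        mixed = sum(f * pop[j] for f, pop in zip(fractions, copy_numbers))
        ratios.append(mixed / 2.0)
    return ratios


# Hypothetical profiles over four intervals (illustrative values only).
normal = [2, 2, 2, 2]
clone = [2, 3, 1, 2]      # dominant clone: one gain, one loss
subclone = [2, 3, 1, 4]   # sub-clone: same events plus a further gain
fractions = [0.4, 0.4, 0.2]

ratios = expected_depth_ratio([normal, clone, subclone], fractions)
purity = 1.0 - fractions[0]  # tumour purity = fraction of non-normal cells

print([round(r, 2) for r in ratios])
print(round(purity, 2))
```

The inference problem runs the other way round: given only the observed ratios, find the copy-number profiles and fractions that best explain them. This sketch only shows the forward direction, and a real analysis would also have to deal with normalisation and sequencing noise.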
They compared the THetA estimates of tumour purity with those from CNAnorm, ASCAT and ABSOLUTE and got very similar results. Interestingly, the authors noted that all the tumour purity estimates were slightly lower than the 70% value reported in the original kataegis paper.
Some limitations of THetA are discussed in the paper, including its reliance on copy-number aberrations (although most solid tumours have some CNAs) and the requirement that each cell population carry at least one unique CNA for it to be inferred.
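The unique-CNA requirement is easy to see with a toy example. The Python sketch below uses the same simplifying assumption of a linear mixture of copy numbers and is not taken from the THetA code; the function name, profiles and fractions are invented for illustration. If two tumour populations share exactly the same copy-number profile, any split of the tumour fraction between them gives the same mixed signal, so read depth alone cannot separate them.

```python
def mixed_copy_number(copy_numbers, fractions):
    """Average copy number per interval across a mixture of populations."""
    n_intervals = len(copy_numbers[0])
    return [
        sum(f * pop[j] for f, pop in zip(fractions, copy_numbers))
        for j in range(n_intervals)
    ]


def same_signal(a, b, tol=1e-9):
    """True if two mixed profiles are numerically indistinguishable."""
    return all(abs(x - y) < tol for x, y in zip(a, b))


# Hypothetical profiles over three intervals (illustrative values only).
normal = [2, 2, 2]
clone = [2, 3, 1]
twin = [2, 3, 1]      # second population with no CNA of its own
subclone = [2, 3, 3]  # second population with a unique CNA in interval 3

# Two different splits of the 60% tumour fraction.
split_a = [0.4, 0.5, 0.1]
split_b = [0.4, 0.2, 0.4]

# Without a unique CNA the two splits look identical.
hidden = same_signal(
    mixed_copy_number([normal, clone, twin], split_a),
    mixed_copy_number([normal, clone, twin], split_b),
)

# With a unique CNA the two splits give different signals.
visible = not same_signal(
    mixed_copy_number([normal, clone, subclone], split_a),
    mixed_copy_number([normal, clone, subclone], split_b),
)

print(hidden, visible)
```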