PropSeg: determining copy number alterations
11 Sep 2012
Copy number alterations (CNAs) occur frequently in cancer, so tools that detect (recurrent) CNAs are of high interest to cancer researchers. Although originally developed to detect single nucleotide variants (SNVs), capture sequencing has also proven its worth for deriving gene copy numbers. To assess copy number changes from capture sequencing data, most approaches compute log-ratios between test and control samples, but these methods are not well suited to the large variation in coverage inherent to sequence data, which results in information loss. Ratio-based approaches are also sensitive to outliers and cannot handle homozygous deletions.
Guillem Rigaill (Netherlands Cancer Institute, Amsterdam) and colleagues in the group of NBIC Faculty Lodewyk Wessels propose a proportionality model, in which the test sample coverage is modelled as a linear function of the control sample coverage. A major advantage of this statistical approach is that completely deleted regions are no longer ignored in the analysis. The authors tested the approach by determining the copy numbers of a set of 600 genes in nine breast cancer cell lines. The new method, called PropSeg, outperformed log-ratio-based methods and showed high concordance with SNP array results.
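To illustrate why a proportionality model can keep deleted regions in the analysis, the minimal sketch below contrasts a mean log-ratio with a slope fitted through the origin (test = slope * control). It is not the PropSeg implementation: the ordinary least-squares estimator, the function names and the coverage values are all assumptions chosen for illustration, and no segmentation step is included.

```python
import math


def proportionality_slope(control, test):
    """Least-squares slope of test = slope * control (no intercept).

    Illustrative estimator only; the published method may differ.
    """
    numerator = sum(c * t for c, t in zip(control, test))
    denominator = sum(c * c for c in control)
    if denominator == 0:
        raise ValueError("control coverage is zero at every position")
    return numerator / denominator


def mean_log_ratio(control, test):
    """Mean of log2(test / control) over usable positions.

    Positions with zero coverage have no defined log-ratio and are
    skipped; returns None when no position is usable.
    """
    ratios = [
        math.log2(t / c) for c, t in zip(control, test) if c > 0 and t > 0
    ]
    return sum(ratios) / len(ratios) if ratios else None


# Hypothetical per-position coverage for one gene (made-up numbers).
control = [100, 250, 40, 180]
half_depth = [50, 125, 20, 90]   # test coverage at half the control depth
deleted = [0, 0, 0, 0]           # homozygous deletion: no reads in the test

slope_half = proportionality_slope(control, half_depth)
slope_deleted = proportionality_slope(control, deleted)
log_ratio_half = mean_log_ratio(control, half_depth)
log_ratio_deleted = mean_log_ratio(control, deleted)

print(slope_half, slope_deleted)          # 0.5 0.0
print(log_ratio_half, log_ratio_deleted)  # -1.0 None
```

In this toy example the deleted gene yields a well-defined slope of zero, whereas the log-ratio has no usable positions and the gene would drop out of a ratio-based analysis.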
The code for the new algorithm is available from http://bioinformatics.nki.nl/ocs
Rigaill GJ, Cadot S, Kluin RJC, Xue Z, Bernards R, Majewski IJ and Wessels LFA
A regression model for estimating DNA copy number data applied to capture sequencing data
Bioinformatics 2012, Jul 13