Statistical integration of multiple-source high-throughput genomics and phenotypic data

General info

Date from - to
01 May 2010 - 30 Apr 2014
Project leader(s)
Dr. Renee de Menezes

Abstract

Aim of the project:
Develop novel statistical methods to identify candidate genes whose expression is associated with variation in DNA copy number, sequence and methylation, and which may underlie a clinical trait. We will focus on both cancer genomics and complex diseases.

Key objectives:

  • Develop a global test-based model that uses subgroups of markers in both dimensions, e.g. SNPs and gene expression levels, to identify regions of association between genotype and phenotype (expression). Also develop a method to compare patterns of association between two independent datasets, in order to identify phenotype-specific association signatures.
  • Extend the integration model from two to three data sources, since mechanisms other than copy number, including DNA methylation and loss of heterozygosity, are likely to play a role in aberrant expression.
  • Develop methods for classification and prediction using two types of high-throughput data simultaneously, e.g. both expression and copy number.

Approach:
The use of high-density platforms to measure SNPs, copy number and methylation on the same samples is increasing. We have developed a high-throughput, generic approach to test associations between genetic variation and gene expression, which represents a direct cellular phenotype, by modelling two types of microarray data in a single regression model. By taking into account multiple genes in the altered region through a random-effects model, we are able to detect subtle effects of mild copy number alterations. In this project we will extend the current integration model.
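
As an illustration of the kind of region-level association test described above, the following is a minimal sketch and not the project's actual model. It computes a simplified score-type statistic in the spirit of a global test, pooling the covariances between one response and a group of markers, and calibrates it by permutation. The function names, the choice of expression as the response and copy number as the marker group, and the simulated data are all assumptions made for the example.

```python
import numpy as np


def global_test_stat(response, markers):
    """Pooled score-type statistic for a group of markers (illustrative).

    response: 1-D array of length n (here: expression of one gene).
    markers:  n x m array (here: copy number at m probes in a region).
    Returns the sum of squared covariance scores, scaled by n.
    """
    r = response - response.mean()          # centre the response
    mc = markers - markers.mean(axis=0)     # centre each marker column
    scores = mc.T @ r                       # one score per marker
    return float(scores @ scores) / len(response)


def permutation_pvalue(response, markers, n_perm=999, seed=0):
    """Permutation p-value: shuffle the response to break any association."""
    rng = np.random.default_rng(seed)
    observed = global_test_stat(response, markers)
    exceed = sum(
        global_test_stat(rng.permutation(response), markers) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Simulated example (hypothetical data): a mild copy number shift shared by
# 8 probes in one region has a small effect on expression.
rng = np.random.default_rng(42)
n_samples, n_probes = 60, 8
region_shift = rng.normal(scale=0.3, size=(n_samples, 1))
copy_number = region_shift + rng.normal(scale=0.2, size=(n_samples, n_probes))
expression = 1.5 * region_shift[:, 0] + rng.normal(size=n_samples)

p_value = permutation_pvalue(expression, copy_number)
print(f"region-level permutation p-value: {p_value:.3f}")
```

Pooling the per-probe scores into one statistic is what lets a weak but consistent shift across the whole region contribute to a single test, rather than requiring any individual probe to reach significance on its own.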
