Merging Two Gene Expression Studies via
Cross Platform Normalization

Andrey A. Shabalin¹, Håkon Tjelmeland²,
Cheng Fan³, Charles M. Perou³,⁴,⁵, and Andrew B. Nobel¹

¹Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599
²Department of Mathematical Sciences, Norwegian University of Science and Technology
³Lineberger Comprehensive Cancer Center, UNC-CH
⁴Department of Pathology and Laboratory Medicine, UNC-CH
⁵Department of Genetics, UNC-CH

Published in Oxford Bioinformatics on March 5, 2008.

Abstract:

Gene expression microarrays are currently being applied in a variety of biomedical applications. This paper considers the problem of how to merge data sets arising from different gene-expression studies of a common organism and phenotype. Of particular interest is how to merge data from different technological platforms. The paper makes two contributions to the problem. The first is a simple cross-study normalization method, which is based on linked gene/sample clustering of the given data sets. The second is the introduction and description of several general validation measures that can be used to assess and compare cross-study normalization methods. The proposed normalization method is applied to three existing breast cancer data sets, and is compared to several competing normalization methods using the proposed validation measures.

Download:

Manuscript.

Matlab code for XPN

Supplementary materials.
The supplementary materials contain a heatmap illustration of the idea behind the XPN model, based on real data, and validation results on the intrinsic gene set.
