Elsevier

Genomics

Volume 107, Issues 2–3, March 2016, Pages 51-58

COPD subtypes identified by network-based clustering of blood gene expression

https://doi.org/10.1016/j.ygeno.2016.01.004
Under an Elsevier user license
open archive

Highlights

  • Gene interaction networks can improve the performance of clustering algorithms on gene expression data.

  • Network-informed clustering identifies clinically distinct subgroups of smokers based on blood gene expression.

  • Subtype-specific blood gene expression signatures include genes that are smoke-responsive in independent experiments.

Abstract

One of the most common smoking-related diseases, chronic obstructive pulmonary disease (COPD), results from a dysregulated, multi-tissue inflammatory response to cigarette smoke. We hypothesized that systemic inflammatory signals in genome-wide blood gene expression can identify clinically important COPD-related disease subtypes, and we leveraged pre-existing gene interaction networks to guide unsupervised clustering of blood microarray expression data. Using network-informed non-negative matrix factorization, we analyzed genome-wide blood gene expression from 229 former smokers in the ECLIPSE Study, and we identified novel, clinically relevant molecular subtypes of COPD. These network-informed clusters were more stable and more strongly associated with measures of lung structure and function than clusters derived from a network-naïve approach, and they were associated with subtype-specific enrichment for inflammatory and protein catabolic pathways. These clusters were successfully reproduced in an independent sample of 135 smokers from the COPDGene Study.
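The abstract names network-informed non-negative matrix factorization but does not give the algorithm, so the following is only a minimal sketch of one common formulation, graph-regularized NMF with multiplicative updates, and not necessarily the procedure used in the study. The function name `network_nmf`, the penalty weight `lam`, the toy expression matrix `X`, and the toy gene network `A` are all assumptions introduced for illustration.

```python
import numpy as np

def network_nmf(X, A, k, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Graph-regularized NMF sketch (hypothetical helper, not the authors' code).

    X : (genes x samples) non-negative expression matrix
    A : (genes x genes) symmetric, non-negative gene network adjacency
    Minimizes ||X - W H||_F^2 + lam * trace(W^T L W), with L = D - A,
    so that genes linked in the network get similar loadings in W.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    W = rng.random((n_genes, k)) + eps   # gene loadings per factor
    H = rng.random((k, n_samples)) + eps # factor weights per sample
    D = np.diag(A.sum(axis=1))           # degree matrix of the network
    L = D - A                            # graph Laplacian
    history = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        W *= (X @ H.T + lam * (A @ W)) / (W @ (H @ H.T) + lam * (D @ W) + eps)
        H *= (W.T @ X) / ((W.T @ W) @ H + eps)
        objective = np.linalg.norm(X - W @ H) ** 2 + lam * np.trace(W.T @ L @ W)
        history.append(objective)
    return W, H, history

# Toy data (assumed): 6 genes in two modules, 8 samples in two groups.
toy_rng = np.random.default_rng(1)
X = 0.1 * toy_rng.random((6, 8))
X[:3, :4] += 1.0   # module 1 is expressed in samples 0-3
X[3:, 4:] += 1.0   # module 2 is expressed in samples 4-7

# Toy network (assumed): genes are connected only within their module.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
np.fill_diagonal(A, 0.0)

W, H, history = network_nmf(X, A, k=2, lam=0.5)
labels = H.argmax(axis=0)  # assign each sample to its dominant factor
```

In this sketch each sample is assigned to the factor with the largest weight in `H`. Cluster stability of the kind reported in the abstract would typically be assessed by repeating the factorization over resampled data or different seeds, which is not shown here.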

Keywords

Gene expression
Smoking
Chronic obstructive pulmonary disease
Disease subtypes
Network analysis