Skip to main content
eScholarship
Open Access Publications from the University of California

UCSF

UC San Francisco Previously Published Works bannerUCSF

Enumerating the gene sets in breast cancer, a "direct" alternative to hierarchical clustering

Abstract

Abstract Background Two-way hierarchical clustering, with results visualized as heatmaps, has served as the method of choice for exploring structure in large matrices of expression data since the advent of microarrays. While it has delivered important insights, including a typology of breast cancer subtypes, it suffers from instability in the face of gene or sample selection, and an inability to detect small sets that may be dominated by larger sets such as the estrogen-related genes in breast cancer. The rank-based partitioning algorithm introduced in this paper addresses several of these limitations. It delivers results comparable to two-way hierarchical clustering, and much more. Applied systematically across a range of parameter settings, it enumerates all the partition-inducing gene sets in a matrix of expression values. Results Applied to four large breast cancer datasets, this alternative exploratory method detects more than thirty sets of co-regulated genes, many of which are conserved across experiments and across platforms. Many of these sets are readily identified in biological terms, e.g., "estrogen", "erbb2", and 8p11-12, and several are clinically significant as prognostic of either increased survival ("adipose", "stromal"...) or diminished survival ("proliferation", "immune/interferon", "histone",...). Of special interest are the sets that effectively factor "immune response" and "stromal signalling". Conclusion The gene sets induced by the enumeration include many of the sets reported in the literature. In this regard these inventories confirm and consolidate findings from microarray-based work on breast cancer over the last decade. But, the enumerations also identify gene sets that have not been studied as of yet, some of which are prognostic of survival. The sets induced are robust, biologically meaningful, and serve to reveal a finer structure in existing breast cancer microarrays.

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View