New resampling method for evaluating stability of clusters


A measure of cluster stability is needed to distinguish real clusters from random ones, which arise from random variation in gene expression measurements due to both technical and biological sources.

In this study, Gana Dresen et al. propose a new “continuous weights” method to measure cluster stability that uses resampling in a manner similar to bootstrapping. Instead of drawing observations with replacement, as in conventional bootstrapping, the new method draws a random positive floating-point number for each observation and uses it as that observation's weight. As a result, every observation is represented in the resampled dataset.
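To make the contrast between the two resampling schemes concrete, here is a minimal Python sketch. The function names are hypothetical, and the choice of exponentially distributed weights rescaled to sum to the number of observations is an assumption made for illustration only; this post does not say which distribution the authors actually draw from.

```python
import numpy as np


def bootstrap_counts(n, rng):
    """Conventional bootstrap: draw n indices with replacement and count
    how often each observation was drawn (a count of 0 means left out)."""
    idx = rng.integers(0, n, size=n)
    return np.bincount(idx, minlength=n)


def continuous_weights(n, rng):
    """Continuous-weights resampling (illustrative assumption):
    one strictly positive random weight per observation,
    rescaled so that the weights sum to n."""
    w = rng.exponential(scale=1.0, size=n)
    # Guard against an exact zero so every weight stays strictly positive.
    w = np.maximum(w, np.finfo(float).tiny)
    return w * n / w.sum()


rng = np.random.default_rng(0)
n = 20

counts = bootstrap_counts(n, rng)
weights = continuous_weights(n, rng)

# Some observations typically receive a count of 0 in the bootstrap sample,
# whereas every observation keeps a positive continuous weight.
print("observations left out by bootstrap:", int((counts == 0).sum()))
print("smallest continuous weight:", float(weights.min()))
```

Either vector can then be handed to a clustering routine that accepts per-observation weights, and the clustering repeated over many draws to see how often the same clusters reappear.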

In general, the “continuous weights” method performs at least as well as conventional bootstrapping. Small datasets with few genes in particular benefit greatly from it. The authors recommend the “continuous weights” method over conventional bootstrapping for real microarray gene expression data, where measurements are dependent and highly correlated.

Gana Dresen IM, Boes T, Huesing J, Neuhaeuser M, Joeckel KH. 2008. New resampling method for evaluating stability of clusters. BMC Bioinformatics 9:42.

So, what do you think?