
Frontiers in Bioscience-Elite (FBE) is published by IMR Press from Volume 13 Issue 2 (2021). Previous articles were published by another publisher on a subscription basis, and they are hosted by IMR Press on imrpress.com as a courtesy and upon agreement with Frontiers in Bioscience.

Open Access Article

Comparison and evaluation of network clustering algorithms applied to genetic interaction networks

1 LMAM, School of Mathematical Sciences, Peking University, Beijing 100871, China
2 Center for Theoretical Biology, Peking University, Beijing 100871, China
3 State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
4 Center for Statistical Genetics, Pennsylvania State University, Hershey, Pennsylvania, USA
5 School of Physics, Peking University, Beijing 100871, China
6 Center for Statistical Science, Peking University, Beijing 100871, China

Academic Editor: Rongling Wu

Front. Biosci. (Elite Ed) 2012, 4(6), 2150–2161; https://doi.org/10.2741/e532
Published: 1 January 2012
(This article belongs to the Special Issue Dynamic genetics and genomics)
Abstract

The goal of network clustering algorithms is to detect dense clusters in a network, providing a first step towards understanding large-scale biological networks. With numerous recent advances in biotechnology, large-scale genetic interaction data are widely available, but there is limited understanding of which clustering algorithms are most effective. To address this problem, we conducted a systematic study to compare and evaluate six clustering algorithms for analyzing genetic interaction networks, and investigated the factors that influence the choice of algorithm. The algorithms considered in this comparison are hierarchical clustering, topological overlap matrix, bi-clustering, Markov clustering, Bayesian discriminant analysis based community detection, and the variational Bayes approach to modularity. Both experimentally identified and synthetically constructed networks were used in this comparison. The accuracy of the algorithms was measured by the Jaccard index, comparing predicted gene modules with benchmark gene sets. The results suggest that the best choice of algorithm depends on the network topology and the evaluation criteria. Hierarchical clustering proved best at predicting protein complexes, Bayesian discriminant analysis based community detection proved best on epistatic miniarray profile (EMAP) datasets, and the variational Bayes approach to modularity was noticeably better than the other algorithms on the genome-scale networks.
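As an illustration of the accuracy measure named above, the Jaccard index of a predicted gene module A and a benchmark gene set B is |A ∩ B| / |A ∪ B|. The following is a minimal sketch, not the authors' implementation: the gene names are hypothetical, and scoring each predicted module by its best-matching benchmark set is an assumption made here for illustration, since the abstract does not state how per-module scores are aggregated.

```python
def jaccard_index(predicted, benchmark):
    """Jaccard index |A ∩ B| / |A ∪ B| between two gene sets."""
    a, b = set(predicted), set(benchmark)
    union = a | b
    if not union:
        # Assumed convention: two empty sets score 0.0.
        return 0.0
    return len(a & b) / len(union)


def best_match_scores(modules, benchmarks):
    """For each predicted module, the highest Jaccard index against any benchmark set.

    Best-match scoring is an illustrative assumption, not taken from the article.
    """
    return [
        max((jaccard_index(m, b) for b in benchmarks), default=0.0)
        for m in modules
    ]


# Hypothetical gene identifiers, for illustration only.
predicted_modules = [{"g1", "g2", "g3"}, {"g4", "g5"}]
benchmark_sets = [{"g2", "g3", "g4"}, {"g5", "g6"}]

scores = best_match_scores(predicted_modules, benchmark_sets)
for module, score in zip(predicted_modules, scores):
    print(sorted(module), round(score, 3))
```

A score of 1 means a predicted module coincides exactly with a benchmark set, and a score of 0 means the two share no genes.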

Keywords
Genetic interaction
Network
Clustering algorithm
Jaccard index
Comparison
Epistatic miniarray profiles