Graph-Based Protein Function Prediction

Participants: Hon Nian Chua, Zhihui Li, Guimei Liu, Wing-Kin Sung, Limsoon Wong


Background

Although sequence similarity search has proven useful in many cases, it has fundamental limitations. First, only a fraction of newly discovered sequences have identifiable homologous genes in the current databases. Second, the most prominent vertebrate organisms in GenBank have only a fraction of their genomes present in finished sequences. New bioinformatics methods allow inference of protein function using "associative analysis" of functional properties to complement the traditional sequence homology-based methods. Associative properties that have been used to infer function not evident from sequence homology include co-occurrence of proteins in operons or genome context, proteins sharing common domains in fusion proteins, proteins in the same pathway, and proteins with correlated gene expression patterns.

In this project, we investigate and develop graph-based methods for inferring protein functions without sequence homology. Most approaches to predicting protein function from protein-protein interaction data rely on the observation that a protein often shares functions with the proteins that interact with it (its level-1 neighbors). However, proteins that interact with the same proteins (i.e., level-2 neighbors) may also have a greater likelihood of sharing similar physical or biochemical characteristics. We want to find out how significant the functional association between level-2 neighbors is and how it can be exploited for protein function prediction. We will also investigate how to integrate protein interaction information with other types of information to improve the sensitivity and specificity of protein function prediction, especially in the absence of sequence homology.
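
To make the neighbor terminology concrete, the following is a minimal Python sketch of how level-1 and level-2 neighbors could be read off an undirected protein interaction graph. It is an illustration only, not the project's actual implementation: the function names (build_graph, level1_neighbors, level2_neighbors) and the toy interaction list are hypothetical. Following the definition above, a level-2 neighbor is taken to be any other protein that shares at least one interaction partner with the query protein, so a protein may be both a level-1 and a level-2 neighbor.

```python
from collections import defaultdict


def build_graph(interactions):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions in this sketch
        graph[a].add(b)
        graph[b].add(a)
    return graph


def level1_neighbors(graph, protein):
    """Proteins that interact directly with `protein`."""
    return set(graph.get(protein, ()))


def level2_neighbors(graph, protein):
    """Proteins that share at least one interaction partner with `protein`."""
    shared = set()
    for partner in graph.get(protein, ()):
        shared |= graph.get(partner, set())
    shared.discard(protein)  # a protein is not its own neighbor
    return shared


# Hypothetical toy network, for illustration only.
interactions = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("C", "E")]
graph = build_graph(interactions)

print(sorted(level1_neighbors(graph, "A")))  # ['B', 'D']
print(sorted(level2_neighbors(graph, "A")))  # ['C']
```

In this toy network, A and C never interact directly but share the partners B and D, which is the kind of indirect relationship the level-2 analysis is meant to capture.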

Objectives

In this project, we investigate and develop graph-based methods for inferring protein functions without sequence homology. In particular,

At the end of the project, we expect to have developed a robust and powerful system to predict protein functions, even in the absence of sequence homology.

Selected Publications

Dissertations

Selected Presentations

Acknowledgements

This project is supported in part by an A*STAR AGS scholarship (Chua: 8/03 - 7/07), the I2R-SOC Joint Lab on Knowledge Discovery from Clinical Data (Liu, Sung, Wong: 7/03 - 6/07), and a URC grant R-252-000-274-112 (Liu, Sung, Wong: 10/06 - 9/09).


Last updated: 9/6/09, Limsoon Wong.