Most current methods for identifying modules in protein interaction networks rely solely on topological features of the networks. In contrast, the main idea underpinning the planned thesis is that combining topological information with knowledge about protein function will yield more biologically plausible modules than approaches based on topology alone. We propose approaches that combine domain-specific knowledge, derived from Gene Ontology (GO), with topological properties to generate functional modules from protein interaction networks. Using yeast two-hybrid (Y2H) interactions from /S. cerevisiae/ and knowledge in the form of GO annotations, we have elucidated functional modules of interacting proteins.
This report summarises the proposed approaches. Three methods with the same rationale but slightly different designs have been implemented, tested and evaluated. The first approach treats the two aspects (functional knowledge and topology) separately: it clusters proteins by their mutual-neighbour profiles and, independently, by their GO semantic similarity profiles, and thereafter merges the corresponding clusters into a single structure. In contrast, the other two approaches integrate both aspects from the beginning. They are two versions of a method named SWEMODE (Semantic WEights for MODule Elucidation), which uses a knowledge-based clustering coefficient to identify network modules. The first version uses the original protein interaction graph, and the second is a recently designed extension of SWEMODE in which the /k/-cores of the graph are emphasised. We demonstrate that all three methods are able to identify the key functional modules in protein interaction networks.
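As a rough illustration of the two graph notions mentioned above, the following Python sketch computes a /k/-core and a weighted clustering coefficient on a toy graph. It is not the SWEMODE implementation: the function names, the toy graph, the edge weights (standing in for GO semantic similarity scores) and the particular weighting formula are all assumptions made for this example. Only the /k/-core definition (the maximal subgraph in which every node has at least /k/ neighbours) is standard.

```python
from itertools import combinations

# Toy undirected interaction graph as adjacency sets (hypothetical proteins).
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

# Hypothetical semantic-similarity weights per edge, in [0, 1].
weight = {
    frozenset(("A", "B")): 0.8,
    frozenset(("A", "C")): 0.6,
    frozenset(("B", "C")): 0.5,
    frozenset(("C", "D")): 0.1,
}

def k_core(adj, k):
    """Return the nodes of the k-core: repeatedly drop every node that has
    fewer than k neighbours among the nodes still remaining."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            if sum(1 for n in adj[node] if n in alive) < k:
                alive.discard(node)
                changed = True
    return alive

def weighted_clustering(adj, weight, node):
    """Illustrative weighted clustering coefficient (an assumption, not
    SWEMODE's definition): the summed weight of edges that exist between
    neighbours of `node`, divided by the number of neighbour pairs."""
    pairs = list(combinations(sorted(adj[node]), 2))
    if not pairs:
        return 0.0
    total = sum(
        weight.get(frozenset((u, v)), 0.0)
        for u, v in pairs
        if v in adj[u]  # only count pairs that are actually connected
    )
    return total / len(pairs)

core2 = k_core(adj, 2)
cc_c = weighted_clustering(adj, weight, "C")
```

In this toy graph the 2-core drops the pendant node D, and the coefficient for C counts only the A-B edge among its three neighbour pairs. With all weights set to 1, the formula reduces to the ordinary unweighted clustering coefficient.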
The first method was applied to smaller, well-studied networks that are known to contain modules of signalling pathways, while SWEMODE was applied to a large network containing 2 231 proteins and 6 379 interactions. The methods were also used to study intermodule connections, which is a step towards revealing a higher-order hierarchy between modules.
In this report, we describe and discuss the proposed approaches, along with their strengths and weaknesses. We also propose further extensions and improvements of the methods, some of which may be attempted as the final steps in the implementation phase of the dissertation.
Skövde: Institutionen för kommunikation och information, 2006, p. 8