About Authors

Bishnu Sarker

Bishnu is a second-year PhD student in the CAPSID team at the INRIA Grand-Est Center.


David W. Ritchie

Dave is a senior researcher and director of the CAPSID team at the INRIA Grand-Est Center.


Sabeur Aridhi

Sabeur is an associate professor at Telecom-Nancy, University of Lorraine, Nancy.


About GrAPFI

Huge numbers of protein sequences are now available in public databases. To exploit this valuable biological data more fully, these sequences need to be annotated with functional properties such as Enzyme Commission (EC) numbers and Gene Ontology terms. The UniProt Knowledgebase (UniProtKB) is currently the largest and most comprehensive resource for protein sequence and annotation data. In the March 2018 release of UniProtKB, some 556,000 sequences had been manually curated, but over 111 million sequences still lacked functional annotations. The ability to annotate these sequences automatically would represent a major advance for the field of bioinformatics. Here, we present a novel network-based approach called GrAPFI for the automatic functional annotation of protein sequences.
The underlying assumption of GrAPFI is that proteins may be related to each other by the protein domains, families, and super-families that they share.
Several protein domain databases exist, such as InterPro, Pfam, SMART, CDD, Gene3D, and Prosite.
Our approach uses InterPro domains, because the InterPro database contains information from several other major protein family and domain databases. Our results show that GrAPFI achieves better EC number annotation performance than several other previously described approaches.
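
To make the idea of relating proteins through shared domains concrete, here is a minimal sketch of a domain-based protein network. It is an illustration only and not necessarily how GrAPFI builds or weights its network: the choice of Jaccard similarity as the edge weight, the function names, and the protein and domain identifiers are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy data: protein ID -> set of domain identifiers.
# The IDs are placeholders, not real UniProtKB or InterPro accessions.
protein_domains = {
    "P1": {"DOM_A", "DOM_B"},
    "P2": {"DOM_B", "DOM_C"},
    "P3": {"DOM_D"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_domain_network(domains):
    """Return {(p, q): weight} for each protein pair sharing at least one domain.

    The edge weight is an assumed similarity measure (Jaccard overlap of
    the two domain sets), chosen here purely for illustration.
    """
    edges = {}
    for p, q in combinations(sorted(domains), 2):
        weight = jaccard(domains[p], domains[q])
        if weight > 0:
            edges[(p, q)] = weight
    return edges


edges = build_domain_network(protein_domains)
for (p, q), weight in edges.items():
    print(p, q, round(weight, 3))
```

In this toy network only P1 and P2 are linked, because they share one of three distinct domains; P3 shares no domain with the others and stays isolated. In a network of this kind, annotations such as EC numbers could then be transferred from annotated neighbours to unannotated proteins.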

If you use GrAPFI in your research, please cite it as:
Bishnu Sarker, David W. Ritchie, and Sabeur Aridhi, Exploiting Complex Protein Domain Networks for Protein Function Annotation, In Proc. of International Conference on Complex Networks and Their Applications, University of Cambridge, UK, 11-13 December, 2018.
The preprint is available here: https://hal.inria.fr/hal-01920595