ISMB'99
Automatic extraction of biological information from scientific text:
protein-protein interactions
Christian Blaschke+, Miguel A. Andrade*, Christos Ouzounis* and Alfonso Valencia+
+ Protein Design Group, CNB-CSIC. Cantoblanco, E-28049 Madrid, Spain
* European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
Abstract
We describe the basic design of a system for automatic detection of protein-protein
interactions extracted from scientific abstracts. By restricting the
problem domain and imposing a number of strong assumptions, including
pre-specified protein names and a limited set of verbs that represent actions,
we show that it is possible to perform accurate information extraction. The
performance of the system is evaluated on several real-world
interaction networks, including Drosophila cell cycle control. The results
obtained computationally are in good agreement with current biological
knowledge and demonstrate the feasibility of developing a fully automated
system able to describe networks of protein interactions with sufficient
accuracy.
Contact: valencia@cnb.uam.es
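The two assumptions named in the abstract (a pre-specified list of protein names and a limited set of action verbs) can be illustrated with a minimal sketch. This is not the system described in the paper. The function names, the sentence-splitting rule, the "nearest protein on each side of the verb" heuristic, and the example names and verbs are all assumptions chosen for illustration.

```python
import re

def find_mentions(sentence, terms):
    """Return sorted (position, term) pairs for whole-word, case-insensitive matches."""
    hits = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, sentence, re.IGNORECASE):
            hits.append((match.start(), term))
    return sorted(hits)

def extract_interactions(text, proteins, verbs):
    """Hypothetical heuristic: for each action verb in a sentence, pair the
    nearest listed protein before it with the nearest listed protein after it."""
    interactions = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        protein_hits = find_mentions(sentence, proteins)
        for verb_pos, verb in find_mentions(sentence, verbs):
            before = [name for pos, name in protein_hits if pos < verb_pos]
            after = [name for pos, name in protein_hits if pos > verb_pos]
            if before and after:
                interactions.append((before[-1], verb, after[0]))
    return interactions

# Illustrative inputs only; not the name or verb lists used in the paper.
proteins = ["cdc2", "cyclin B", "string"]
verbs = ["activates", "binds"]
text = ("Cdc2 binds cyclin B in early embryos. String activates cdc2. "
        "The cell cycle is tightly regulated.")

result = extract_interactions(text, proteins, verbs)
print(result)
```

Each extracted triple (protein, verb, protein) can be read as a directed edge, so a set of abstracts yields a candidate interaction network. Sentences without two listed proteins around a listed verb produce nothing, which reflects how the restricted domain trades coverage for precision.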