The Protein Interaction Extraction System
Participants: See-Kiong Ng, Limsoon Wong
Background
A large part of the information required for biology research can
only be found in free-text form, as in MEDLINE abstracts, or in
comment fields of relevant reports, as in GenBank feature table annotations.
This information is important for many types of analysis, such as
classification of proteins into functional groups, discovery of new
functional relationships, maintenance of information on material
and methods, extraction of protein interaction information, and so on.
However, information in free-text form is very difficult for
automated systems to use.
This project investigates natural language processing techniques and
their application to extracting biological information from free text.
Achievements
-
We initiated the first-ever track dedicated to bio-literature mining
in the Pacific Symposium on Biocomputing (PSB) series of bioinformatics
conferences. We chaired the track for three years, until the field
developed into a major trend in bioinformatics research.
-
We developed one of the first bio-literature mining systems, called PIES,
for protein interaction information extraction from literature.
-
We also developed detailed and rigorous workflows for curating information
from biological and medical literature.
-
PIES and our curation workflows were commercialized in 2001 through a spin-out
company, Molecular Connections.
The company has since grown profitably, reaching about 700 engineers and
curators by 2010. Its customers include major pharmaceutical companies
such as GlaxoSmithKline, academic institutions such as UCLA, and
publishers such as Chemical Abstracts Service.
- Awards conferred on Molecular Connections:
- The 2008 ICICI-CNBC Emerging India Small Enterprise of the Year Award,
- The 2009 Indian Institute of Economic Studies' Award for Excellence,
- The 2009 Indian Institute of Economic Studies' Udyog Rattan Award
(for Jignesh Bhatee, CEO of Molecular Connections)
Selected Publications
-
See-Kiong Ng and Marie Wong.
Toward Routine Automatic Pathway Discovery from On-Line Scientific
Text Abstracts.
Proceedings of 10th International Conference on Genome Informatics,
Tokyo, Japan, December 14-15, 1999, pages 104--112.
PDF
-
Tatsuhiko Tsunoda, Limsoon Wong.
Natural Language Processing for Biology.
Proceedings of Pacific Symposium on Biocomputing 2000,
Hawaii, January 2000, pages 491--492.
-
Limsoon Wong.
A Protein Interaction Extraction System.
Proceedings of Pacific Symposium on Biocomputing 2001,
Hawaii, January 2001, pages 520--530.
PS,
PPT
-
Jun-ichi Tsujii, Limsoon Wong.
Natural Language Processing and Information Extraction in Biology.
Proceedings of Pacific Symposium on Biocomputing 2001,
Hawaii, January 2001, pages 372--373.
-
Lynette Hirschman, Jong C. Park, Junichi Tsujii, Cathy Wu, Limsoon Wong.
Literature Data Mining for Biology.
Proceedings of Pacific Symposium on Biocomputing 2002,
Hawaii, January 2002, pages 323--325.
PDF
-
Limsoon Wong.
Gaps in Text-based Knowledge Discovery for Biology.
Drug Discovery Today. 7(17):897--898, 2002. (Reviewed invited comments.)
PDF
-
Lynette Hirschman, Jong C. Park, Junichi Tsujii, Limsoon Wong, Cathy H. Wu.
Accomplishments and Challenges in Literature Data Mining for Biology.
Bioinformatics, 18:1553--1561, December 2002.
PS
Selected Presentations
-
Limsoon Wong.
Practical Knowledge Management for Drug Discovery.
Invited talk at Biotechnology in Asia 2002, Singapore, April 2002.
-
Limsoon Wong.
Building Gene Networks by Information Extraction,
Cleansing, and Integration.
Invited plenary lecture at 3rd International Symposium on
e-Biology Initiative: Towards New Frontiers of Biology,
Takeda Hall, University of Tokyo, Tokyo, Japan, 11 March 2005.
PPT
-
Limsoon Wong.
Some Interesting Issues in Constructing Gene/Protein Networks.
Invited round-table presentation at 3rd International Symposium
on e-Biology Initiative: Towards New Frontiers of Biology,
National Institute of Informatics, Tokyo, Japan, 10 March 2005.
PPT
-
Limsoon Wong.
Building Gene Networks by Information Extraction,
Cleansing, and Integration.
Invited talk at Tamkang University, Taiwan, 31 May 2005.
PPT
-
Limsoon Wong.
Building Gene Networks by Information Extraction,
Cleansing, and Integration.
Invited talk at National Taiwan University, Taipei, Taiwan, 1 June 2006.
PPT
-
Limsoon Wong.
Building Gene Networks by Information Extraction,
Cleansing, and Integration.
Invited talk at Bioinformatics Minisymposium---from Sequences,
Structures, to Systems,
National Chiao Tung University, Hsinchu, Taiwan, 2 June 2006.
PPT
-
Limsoon Wong.
Building Gene Networks by Information Extraction,
Cleansing, and Integration.
Invited talk at Yang Ming National University, Taiwan, 6 June 2006.
PPT
Acknowledgements
This project is supported in part by
an EDB grant to establish the NUS Bioinformatics Centre (1996-1998) and
an NSTB grant LS/99/001/B (10/1999-9/2002).
Last updated: 11/7/10, Limsoon Wong.