|
If you know of additional (open access) biomedical article databases, please contact me at martink@cnb.uam.es; this helps keep this list of text mining resources for biology and biomedicine complete. The most important resource for text mining, information retrieval (IR) and information extraction (IE) tools in the biology domain is the PubMed (Medline) database, which provides a collection of article titles and abstracts suitable for text mining tools. PubMed currently holds over 15 million entries, and over 500,000 new entries are added every year. It is also the primary reference database that experimental researchers consult for life science articles. Because few articles are open access and full text articles are hard to parse (PDF-to-text conversion is difficult and journals use their own SGML/XML formats), only a small number of full text parsing applications have been developed for this domain so far. The NCBI provides a collection of tools for interacting with PubMed, as well as the possibility of obtaining a copy of the whole PubMed database. Refer to my links section for other useful resources for Bio-NLP. |
||
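As a rough illustration of what interacting with PubMed programmatically can look like, here is a minimal Python sketch. It assumes that the NCBI tools mentioned above include the E-utilities web service and its `esearch.fcgi` endpoint; the text does not name a specific tool, so treat that choice as an assumption. The helper names `build_esearch_url` and `parse_pmids` are hypothetical and exist only for this example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumption: the NCBI E-utilities base URL; the text above does not name a specific tool.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=20):
    """Build an ESearch URL that asks PubMed for record IDs matching `term`."""
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # free-text query
        "retmax": retmax,    # maximum number of IDs to return
        "retmode": "xml",    # request an XML response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def parse_pmids(esearch_xml):
    """Extract PubMed IDs (as strings) from an ESearch XML response."""
    root = ET.fromstring(esearch_xml)
    # IDs are listed as <Id> elements inside <IdList> under the root element.
    return [id_element.text for id_element in root.findall("./IdList/Id")]
```

To run a real query, the URL could be fetched with `urllib.request.urlopen(build_esearch_url("text mining")).read()` and the result passed to `parse_pmids`. The sketch leaves the network call out so that it stays self-contained; anyone sending more than occasional requests should check NCBI's current usage guidelines first.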