AutoIndexer: Research and Development of Terminological Resources to Support the Automatic Indexing of Medical Documents

The main goal of this project is to establish a set of applications and web services for the automatic indexing of clinical documents. These algorithms must be able to meet the current needs for indexing complex semi-structured documents in support of the electronic health record.

The project aims to replace the manual indexing process, currently performed mentally by health professionals, with an automated process that is more efficient, productive and ecological.

This project is being developed in close collaboration with Indizen Technologies, a company that has previously developed a set of solutions designed to improve the productivity, quality and quantity of information encoded in healthcare centers and hospitals. Their research has focused on finding the best methodologies and semantic resources for natural language analysis of clinical reports, in order to organize and classify this information according to international coding systems such as ICD-9-CM and ICD-10 and medical terminology systems such as SNOMED.

Summary of research lines:

- Identification of relevant medical elements:

  • main diagnosis, secondary diagnoses, procedures, medical history, other information

- Word sense disambiguation and acronym processing

- Orthographic (spelling) correction

- Chronological classification

- Lemmatizers

- Negation detection
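
As an illustration of the last research line, negation detection can be sketched with a simple trigger-and-window rule. This is only a minimal sketch and not the method used in the project: the trigger phrases, the terminator words, the window size and the function name `negated_terms` are all assumptions made for the example, and a real system would rely on a curated clinical lexicon.

```python
import re

# Hypothetical trigger phrases and scope terminators (assumptions for this sketch).
NEGATION_TRIGGERS = ("no evidence of", "denies", "without", "no")
TERMINATORS = {"but", "however", "although"}
WINDOW = 5  # assumed scope: number of tokens after a trigger that may be negated


def negated_terms(sentence, terms):
    """Return the subset of `terms` that appear inside a negation scope."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    negated = set()
    for trigger in NEGATION_TRIGGERS:
        trig_tokens = trigger.split()
        n = len(trig_tokens)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] != trig_tokens:
                continue
            # The scope is the window after the trigger, cut at the first terminator.
            scope = []
            for tok in tokens[i + n:i + n + WINDOW]:
                if tok in TERMINATORS:
                    break
                scope.append(tok)
            for term in terms:
                term_tokens = term.lower().split()
                m = len(term_tokens)
                if any(scope[j:j + m] == term_tokens
                       for j in range(len(scope) - m + 1)):
                    negated.add(term)
    return negated


if __name__ == "__main__":
    print(negated_terms("Patient denies chest pain but reports fever.",
                        ["chest pain", "fever"]))
```

In the usage line, "chest pain" falls inside the scope opened by "denies", while "fever" follows the terminator "but" and is therefore left as affirmed.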


Funded by Programa AVANZA I+D, Ministerio de Industria, Comercio y Turismo.

June 2009 - December 2010