Referring Expressions and Rhetorical Figures for Entity Distinction and Description in Automatically Generated Discourses (extended summary in English)

Title: Referring Expressions and Rhetorical Figures for Entity Distinction and Description in Automatically Generated Discourses (extended summary in English)
Publication Type: Thesis
Year of Publication: 2009
Authors: Hervás, R
Academic Department: Departamento de Ingeniería del Software e Inteligencia Artificial
Degree: PhD Thesis

The field of human-computer interaction has evolved rapidly in recent years, becoming a key element of any computer system. If a system can communicate with a human being through interactions that feel natural and friendly to the user (voice, images, etc.), the user will be much more receptive to the transmitted information and will have more trust in the application and its results.

In this regard, a key area within the human-computer interaction field is Natural Language Generation (NLG), a subfield of Artificial Intelligence and Computational Linguistics. The field of Natural Language Generation is responsible for the design and implementation of systems that produce understandable texts in human languages from an initial non-linguistic representation of information. Within this field, one of the problems to be solved in order to generate satisfactory results is to decide how to refer to entities or elements that appear in the text.

The task of Referring Expression Generation deals with this specific problem. Each mention of an element in a text must be realized as a specific way of referring to it, that is, as a reference. The process of referring expression generation should take two objectives into account. First, a reference to an element in the discourse should allow the reader or listener to distinguish it from any other element in the context with which it could be confused. In addition, references may sometimes contain additional information intended to describe the corresponding entities beyond the function of distinguishing them.
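The distinguishing function can be illustrated with a minimal sketch. It is not the method proposed in the thesis; it only follows the general idea, common in the referring expression generation literature, of adding attributes one at a time until no other entity in the context matches the description. The function name, the dictionary representation of entities, and the attribute preference order are all assumptions made for this illustration.

```python
from typing import Dict, List, Optional

# Hypothetical representation: an entity is a mapping from attribute to value.
Entity = Dict[str, str]


def distinguishing_description(
    target: Entity,
    distractors: List[Entity],
    preferred_attributes: List[str],
) -> Optional[Dict[str, str]]:
    """Select attributes of `target` until no distractor matches all of them.

    Attributes are tried in the given preference order (an assumption of
    this sketch). Returns None if the distractors cannot all be ruled out.
    """
    description: Dict[str, str] = {}
    remaining = list(distractors)
    for attribute in preferred_attributes:
        if not remaining:
            break
        if attribute not in target:
            continue
        value = target[attribute]
        # Distractors that share this value are still confusable with the target.
        still_confusable = [d for d in remaining if d.get(attribute) == value]
        # Keep the attribute only if it rules out at least one distractor.
        if len(still_confusable) < len(remaining):
            description[attribute] = value
            remaining = still_confusable
    return description if not remaining else None


# Toy context, invented for the example.
context = [
    {"type": "dog", "colour": "black", "size": "small"},
    {"type": "dog", "colour": "white", "size": "small"},
    {"type": "cat", "colour": "black", "size": "small"},
]
target, distractors = context[0], context[1:]
result = distinguishing_description(target, distractors, ["type", "colour", "size"])
print(result)  # {'type': 'dog', 'colour': 'black'}
```

In this toy context, "size" is never selected because the type and colour already exclude both distractors. A descriptive reference, by contrast, might still mention the size even though it is not needed to identify the entity.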

Of these two functions (distinctive and descriptive), only the former has been widely studied in the literature. Numerous works deal with the problem of distinguishing references, addressing issues such as the minimality of an expression, the similarity of an expression to those used by human beings, and the absence of ambiguity in the generated reference.

However, although there is some work related to the generation of natural language descriptions, there are fewer works focused on enhancing a discourse with certain expressions that highlight descriptive information considered important, or on its relationship with the generation of distinguishing references.

This work addresses the complete problem of reference planning in two different ways. Firstly, several solutions and improvements to classical referring expression generation are proposed for references that attempt to distinguish the referents from other entities in context. The problem is addressed on three fronts: how to adjust the level of abstraction employed to name the reference according to the situation, which strategy to use for choosing the attributes that distinguish a concept, and which words or expressions are most appropriate to express a reference in natural language. For each of these points we present solutions based on classical techniques and methodologies of Artificial Intelligence, such as evolutionary algorithms, case-based reasoning, or ontologies. The results obtained from the different solutions are also evaluated using classical metrics from this field.

Secondly, this work explores the enhancement of a given discourse by providing descriptive information using figures of speech based on similarities between domains, such as comparison and analogy. In order to use such figures in a natural language generation system, it is necessary to address issues related to managing sources of knowledge, determining the appropriate figures, and defining an architecture to implement such systems. This work studies these issues and proposes a general framework to generate this kind of reference.

The results obtained by the solutions proposed in this work lead to a discussion on the shortcomings of each approach, identifying aspects that could be improved in future work. The relationship between the generation of referring expressions (both distinctive and descriptive) and the complete process of natural language generation is also discussed.

Finally, the conclusions derived from these lines of research are presented, along with the identification of possible lines for future work and areas of application for the solutions and results presented in this work.

Full Text
HervasPhDTesis.pdf (1007.87 KB)