Document Type



Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence


Computer Sciences

Publication Details

International Journal of Computational Linguistics and Applications, Vol. 2, No. 1-2, Jan-Dec 2011, pp. 175-190.


To apply machine learning techniques to the production and interpretation of natural language, we need large amounts of annotated language data. Manual annotation, however, is an expensive and time-consuming process, since it involves human annotators looking at the data and, based on their judgment, explicitly adding information that is only implicitly contained in the data. This work presents an approach to automatically annotating referring expressions in situated dialogues by exploiting the interpretation of language by the participants in the dialogue. We associate instructions concerning objects in the environment with automatically detected events involving these objects, and predict the referents of referring expressions in the instructions on the basis of the objects affected by the events. We judge the reliability of these predictions based on the temporal and textual distance between instruction and event. We apply our approach to an annotated corpus and evaluate the results against human annotation. The evaluation shows that the approach can be used to accurately annotate a large proportion of the utterances in the corpus dialogues and to highlight those utterances for which human annotation is still required, thus reducing the overall amount of manual annotation needed.
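
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data classes, the function name `predict_referent`, the choice of the nearest following event, the reading of "textual distance" as a count of intervening utterances, and both threshold values are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instruction:
    utterance_index: int  # position of the utterance in the dialogue
    time: float           # seconds since the start of the dialogue
    text: str


@dataclass
class Event:
    utterance_index: int  # index of the last utterance spoken before the event
    time: float           # seconds since the start of the dialogue
    object_id: str        # object affected by the event


@dataclass
class Prediction:
    referent: Optional[str]  # predicted referent, or None if no event follows
    reliable: bool           # False means: hand over to a human annotator


def predict_referent(
    instruction: Instruction,
    events: List[Event],
    max_seconds: float = 10.0,  # hypothetical temporal threshold
    max_utterances: int = 2,    # hypothetical textual threshold
) -> Prediction:
    # Only events that happen after the instruction can be a reaction to it.
    later = [e for e in events if e.time >= instruction.time]
    if not later:
        return Prediction(None, False)

    # Assumption: the closest following event is the one the instruction caused.
    nearest = min(later, key=lambda e: e.time - instruction.time)

    temporal_distance = nearest.time - instruction.time
    textual_distance = nearest.utterance_index - instruction.utterance_index

    # The prediction is trusted only if the event is close on both measures.
    reliable = (
        temporal_distance <= max_seconds
        and textual_distance <= max_utterances
    )
    return Prediction(nearest.object_id, reliable)


events = [Event(3, 14.5, "button_blue"), Event(7, 40.0, "door_1")]
close = predict_referent(Instruction(3, 12.0, "press the blue button"), events)
distant = predict_referent(Instruction(5, 20.0, "open it"), events)
unmatched = predict_referent(Instruction(9, 50.0, "go left"), events)
```

In this sketch, predictions flagged as unreliable (or with no referent at all) are the utterances that would be passed on for human annotation, while the reliable ones would be annotated automatically.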