Document Type



This item is available under a Creative Commons License for non-commercial use only


1.2 COMPUTER AND INFORMATION SCIENCE, Computer Sciences, Information Science, Bioinformatics

Publication Details

A dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computer Science (Data Analytics)


Approaching the task of coherence assessment of a conversation from its negative perspective ‘confusion’ rather than coherence itself, has been attempted by very few research works. Influencing Embeddings to learn from similarity/dissimilarity measures such as distance, cosine similarity between two utterances will equip them with the semantics to differentiate a coherent and an incoherent conversation through the detection of negative entity, ‘confusion’. This research attempts to measure coherence of conversation between a human and a conversational agent by means of such semantic embeddings trained from scratch by an architecture centralising the learning from the distance between the embeddings. State of the art performance of general BERT’s embeddings and state of the art performance of ConveRT’s conversation specific embeddings in addition to the GLOVE embeddings are also tested upon the laid architecture. Confusion, being a more sensible entity, real human labelling performance is set as the baseline to evaluate the models. The base design resulted in not such a good performance against the human score but the pre-trained embeddings when plugged into the base architecture had performance boosts in a particular order from lowest to highest, through BERT, GLOVE and ConveRT. The intuition and the efficiency of the base conceptual design is proved of its success when the variant having the ConveRT embeddings plugged into the base design, outperformed the original ConveRT’s state of art performance on generating similarity scores. Though a performance comparable to real human performance was not achieved by the models, there witnessed a considerable overlapping between the ConveRT variant and the human scores which is really a great positive inference to be enjoyed as achieving human performance is always the state of art in any research domain. Also, from the results, this research joins the group of works claiming BERT to be unsuitable for conversation specific modelling and embedding works.