Incorporating Semantic Information to FastText Word Vectors for Improved Sentiment Analysis

Document Type

Theses, Masters


This item is available under a Creative Commons License for non-commercial use only


Computer Sciences

Publication Details

A dissertation submitted in partial fulfillment of the requirements of Technological University Dublin for the degree of M.Sc. in Computing (Data Analytics)


Developments in natural language processing (NLP) have led to words being represented as dense, low-dimensional vectors that capture semantic and syntactic relations. These vectors are learned from the distributional statistics of large corpora. Recently, researchers have released large pre-trained word vector models to be used in further research. This gives others the opportunity to use these high-quality vectors, which have led to state-of-the-art results in a number of different NLP tasks such as sentiment analysis, machine translation and natural language generation. There are drawbacks to using pre-trained vectors. One problem encountered is the issue of out-of-vocabulary words, where the model's vocabulary contains no vectors for words that were not seen during training.
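The out-of-vocabulary problem described above can be illustrated with a minimal Python sketch. It is not the method of this dissertation, whose details are not given on this page. It only contrasts a plain table lookup, which has nothing to return for an unseen word, with the general subword idea used by FastText, where a word vector is composed from the vectors of its character n-grams. The toy vocabulary, the random (untrained) vectors, the dimension of 4, and the helper names char_ngrams, lookup and subword_vector are all assumptions made for illustration. Real FastText models hash n-grams into a fixed number of buckets, which this sketch replaces with a plain dictionary.

```python
import random

DIM = 4                      # toy dimensionality (assumption)
rng = random.Random(0)       # fixed seed so the sketch is repeatable


def random_vector():
    """Stand-in for a trained vector; real vectors are learned from corpora."""
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(wrapped) - n + 1)]


# Toy "pre-trained" word table with a tiny, fixed vocabulary.
vocab = ["good", "bad", "movie"]
word_vectors = {w: random_vector() for w in vocab}

# Toy subword table holding one vector per n-gram seen in the vocabulary.
ngram_vectors = {}
for w in vocab:
    for g in char_ngrams(w):
        ngram_vectors.setdefault(g, random_vector())


def lookup(word):
    """Plain table lookup: returns None for an out-of-vocabulary word."""
    return word_vectors.get(word)


def subword_vector(word):
    """Average the vectors of the word's known n-grams, or None if none match."""
    known = [ngram_vectors[g] for g in char_ngrams(word) if g in ngram_vectors]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]


if __name__ == "__main__":
    print(lookup("goodness"))          # None: the word was never seen
    print(subword_vector("goodness"))  # a vector built from shared n-grams
```

In this sketch "goodness" has no entry in the word table, but it shares n-grams such as "<go" and "good" with the in-vocabulary word "good", so a vector can still be composed for it. A word that shares no n-grams with the vocabulary still yields nothing in this simplified version.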

This document is currently not available here.