Incorporating Semantic Information to FastText Word Vectors for Improved Sentiment Analysis
This item is available under a Creative Commons License for non-commercial use only
Developments in natural language processing (NLP) have led to words being represented as dense, low-dimensional vectors that capture semantic and syntactic relations. These vectors are learned from the distributional statistics of large corpora. Recently, researchers have released large pre-trained word vector models for use in further research. This gives others the opportunity to use these high-quality vectors, which have led to state-of-the-art results in a number of NLP tasks such as sentiment analysis, machine translation and natural language generation. There are, however, drawbacks to using pre-trained vectors. One problem is out-of-vocabulary words: the model's vocabulary contains no vectors for words that were not seen during training.
Hayden, C. (2019) Incorporating Semantic Information to FastText Word Vectors for Improved Sentiment Analysis, Masters Thesis, Technological University Dublin.