Document Type
Dissertation
Rights
This item is available under a Creative Commons License for non-commercial use only
Disciplines
Computer Sciences
Abstract
In a world where anybody can share their views, opinions and make it sound like these are facts about the current situation of the world, Fake News poses a huge threat especially to the reputation of people with high stature and to organizations. In the political world, this could lead to opposition parties making use of this opportunity to gain popularity in their elections. In the medical world, a fake scandalous message about a medicine giving side effects, hospital treatment gone wrong or even a false message against a practicing doctor could become a big menace to everyone involved in that news. In the world of business, one false news becoming a trending topic could definitely disrupt their future business earnings. The detection of such false news becomes very important in today’s world, where almost everyone has an access to use a mobile phone and can cause enough disruption by creating one false statement and making it a viral hit. Generation of fake news articles gathered more attention during the US Presidential Elections in 2016, leading to a high number of scientists and researchers to explore this NLP problem with deep interest and a sense of urgency too. This research intends to develop and compare a Fake News classifier using Linear Support Vector Machine Classifier built on traditional text feature representation technique Term Frequency Inverse Document Frequency (Ahmed, Traore & Saad, 2017), against a classifier built on the latest developments for text feature representations such as: word embeddings using ‘word2vec’ and sentence embeddings using ‘Universal Sentence Encoder’.
DOI
https://doi.org/10.21427/5519-h979
Recommended Citation
Sriram, S. (2020). An evaluation of text representation techniques for fake news detection using: TF_IDF, word embeddings, sentence embeddings with linear support vector machine. Masters Dissertation. Technological University Dublin. DOI:10.21427/5519-h979
Publication Details
A dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computer Science (Data Analytics)