This item is available under a Creative Commons License for non-commercial use only


Computer Sciences

Publication Details

A dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computer Science (Data Analytics)


Pharmaceutical drugs are usually rated by customers or patients (e.g. on a scale from 1 to 10). Often, they also write reviews or comments on the drug and its side effects. When ratings are absent, it is desirable to quantify these reviews to help analyze drug favorability in the market. Since the reviews are in the form of text, lexical methods are needed for the analysis. The intent of this study was two-fold. First, to understand how much performance improves when CNN-LSTM models are used to predict ratings or sentiment from reviews; such models are known to perform better than conventional machine learning models on textual data sequences. Second, to assess how effectively such information extraction models can be migrated across different drug review datasets and across different disease conditions. Three experiments were therefore designed. The first was an in-domain experiment, where train and test data come from the same dataset. Two further experiments examined the migration capability of the models: a cross-data-source experiment, where train and test data come from different sources, and a cross-disease-condition experiment, where train and test data belong to different disease conditions within the same dataset. The experiments were evaluated using popular metrics such as RMSE, MAE, R² and Pearson’s coefficient. The results showed that the proposed deep learning regression model performs less successfully than the machine learning sentiment extraction models in the literature that were evaluated on the same datasets. However, this study contributes to the existing literature in the quantity of research work done and in the quality of the model, and also offers suggestions to future researchers on how to improve it. This work also addressed the shortcomings in the literature by introducing