This item is available under a Creative Commons License for non-commercial use only


Computer Sciences

Publication Details

A dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computer Science (Data Analytics)


Identifying churned customers plays an essential role in the functioning and growth of any business. It helps a business understand the reasons for churn and plan its market strategies accordingly. This research aims to develop a machine learning model that can accurately predict churned customers among the total customers of a Credit Union (CU) financial institution. A quantitative, deductive research strategy is employed to build a supervised machine learning model that addresses the class imbalance problem, handles feature selection and efficiently predicts customer churn. The overall accuracy of the model, the Receiver Operating Characteristic (ROC) curve and the Area Under the ROC Curve (AUC) are used as the evaluation metrics to identify the best classifier. A comparative study of the most popular supervised machine learning methods (Logistic Regression, Random Forest, Support Vector Machine (SVM) and Neural Network) was carried out for customer churn prediction in a CU context. In the first phase of the experiments, various feature selection techniques were studied. In the second phase, all models were applied to the imbalanced dataset and the results were evaluated. The Synthetic Minority Over-sampling Technique (SMOTE) was then used to balance the data, and the same models were applied to the balanced dataset; the results were evaluated and compared. The best overall classifier was Random Forest, with an accuracy of almost 97%, a precision of 91% and a recall of 98%.
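
As a rough illustration of the balancing step mentioned above, the following is a minimal sketch of the core idea behind SMOTE: each synthetic minority sample is placed at a random position on the line segment between an existing minority sample and one of its nearest minority neighbours. It is not the dissertation's implementation. The function name `smote_sketch`, the toy feature values and the class sizes are all hypothetical, and a real study would more likely rely on a library implementation such as the one in imbalanced-learn.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolation.

    minority: list of equal-length numeric tuples (the minority class).
    k: number of nearest minority neighbours considered for each base point.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 4 churned customers versus 10 retained ones,
# each described by two made-up numeric features.
churned = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
retained_count = 10

new_points = smote_sketch(churned, retained_count - len(churned))
balanced_churned = churned + new_points
print(len(balanced_churned))  # minority class now matches the majority size
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay within the region already occupied by the minority class rather than duplicating existing rows. Oversampling of this kind is normally applied to the training split only, so that the evaluation data remain untouched.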