This item is available under a Creative Commons License for non-commercial use only


Computer Sciences

Publication Details

Dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computing (Stream), 2018.


This research project investigates the predictive capability of clickstream data for mortgage arrears prediction. As an ever-growing number of people switch to digital channels for their daily banking, financial institutions are accumulating a wealth of online usage data, otherwise known as clickstream data. If leveraged correctly, clickstream data can be a powerful data source for organisations, as it provides detailed information about how customers interact with their digital channels. Much of the current literature on clickstream data concerns organisations employing it within their customer relationship management mechanisms to build better relationships with their customers. There has been little investigation into its use in credit scoring or arrears prediction. Since the financial crisis of 2008, financial institutions have been obliged to have mechanisms in place to deal with mortgage accounts which are in arrears or at risk of entering arrears. A potentially crucial step in this process is the ability of an institution to accurately predict which of its mortgage accounts may enter arrears. This research determines the impact that clickstream data can have on an arrears prediction model when it is added to traditional demographic and transactional data. A range of binary classifiers was evaluated on this arrears prediction problem. Of these classifiers, ensemble models performed best, achieving reasonably high recall without the inclusion of clickstream data. Adding clickstream data to the models led to marginal increases in accuracy, which was a positive result.
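To make the evaluation metric concrete, the following is a minimal sketch of how recall might be computed for a binary arrears classifier and compared between a model without clickstream features and one with them. It is not the dissertation's code. The function name `recall`, the variable names, and all label values are hypothetical and made up purely for illustration; they are not results from this research.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Recall for the positive class (assumed here: 1 = account entered arrears)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # True positives: arrears accounts the model flagged.
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # False negatives: arrears accounts the model missed.
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    actual_pos = true_pos + false_neg
    if actual_pos == 0:
        raise ValueError("recall is undefined without any positive labels")
    return true_pos / actual_pos


# Toy labels invented for this example only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred_baseline = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]          # hypothetical model without clickstream features
pred_with_clickstream = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]  # hypothetical model with clickstream features

recall_baseline = recall(y_true, pred_baseline)             # 2 of 4 arrears accounts found
recall_clickstream = recall(y_true, pred_with_clickstream)  # 3 of 4 arrears accounts found
print(f"baseline recall: {recall_baseline:.2f}")
print(f"with clickstream recall: {recall_clickstream:.2f}")
```

Recall is a natural metric to emphasise in this setting because a missed arrears account (a false negative) is typically costlier to an institution than a false alarm, although the appropriate trade-off would depend on the institution's own costs.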