Predicting Process Completion Using Clickstream Data

Document Type

Theses, Masters


This item is available under a Creative Commons License for non-commercial use only


Computer Sciences

Publication Details

Successfully submitted in fulfilment of the requirements for the degree of MSc. in Computing (Data Analytics) in the School of Computing, College of Sciences and Health, October 2015.


This dissertation evaluates the potential use of clickstream data to identify customers in need of assistance while completing a loan application online. An ever-increasing number of people are migrating their day-to-day banking activity to online channels. Companies that pride themselves on customer service are faced with the challenge of sustaining those standards via their online presence. Typically, web mining and click pattern analysis are based largely upon a history of page requests to a server. This paper utilises a more granular data source captured from browser event logs. This method of web data capture and analysis allows for much more detailed exploration of the user experience. Methods are introduced for extracting meaningful attributes and aggregates from those logs, which are subsequently used as input for predictive models. Activity both prior to and during the application process is included for analysis, though it is shown that there is sufficient information in the application events alone to accurately detect issues for a subset of the sample using a variant of Markov Chain modelling. An ordinal representation of click patterns is also developed, based on the order in which the user encounters them. This proves to be a valuable descriptor of user progression. A suite of binary classifiers was reviewed with respect to the problem of predicting a user about to exit the process prematurely. Of these, ensemble models produced the best results by utilising membership functions to determine likelihood of class membership.

This document is currently not available here.