Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence



Publication Details

A dissertation submitted in partial fulfilment of the requirements of Dublin Institute of Technology for the degree of M.Sc. in Computing (Data Analytics), 2016.


The motivation for this dissertation is rooted in a real business need. The Property Registration Authority (PRA) is the state organisation tasked with maintaining a register of land ownership on the island of Ireland. The PRA currently faces a series of challenges: a high rate of staff retirement and the associated loss of knowledge, a lack of recruitment in recent years, and a large increase in the lodgement of applications for first registration as a result of legislation. The organisation therefore requires a reliable system for predicting future intake. Prior to this project, there was also little understanding of the factors that influence intake and that go much of the way towards explaining the peaks and troughs in intake levels seen over recent years. This dissertation therefore seeks to identify the factors that influence the intake of applications for first registration, and to ascertain whether these factors can be used to build models that predict future intake.

To answer these questions, an exercise in data analytics has been designed and implemented, following the industry-standard CRISP-DM methodology. As part of this process, a review of contemporary literature has been carried out on the Irish property market, the factors that influence the level of demand for registration, and the modelling approaches applied to variable selection and predictive modelling. Using the insights gleaned from this research, a varied dataset has been sourced, assembled, explored and prepared; it includes property registration data, house sale data and economic indicator data. The final dataset has been used to build a series of predictive models, and the evaluation results show the Random Forest model to be the most effective. A further finding is that, taken together, the models indicate that the number of houses sold is the single most important factor in predicting the volume of applications for registration.
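The idea of fitting a Random Forest and ranking features by importance can be illustrated with a minimal, from-scratch sketch. It is not the dissertation's implementation: the data are synthetic, the feature names (houses_sold, unemployment_rate, interest_rate) and all parameter values are assumptions chosen for illustration, and importance is measured here simply as each feature's share of the total reduction in squared error across all splits.

```python
import random

def sse(values):
    """Sum of squared errors around the mean; 0.0 for an empty list."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, gains, depth=0, max_depth=4, min_leaf=3, n_try=2):
    """Grow one regression tree, adding each split's SSE reduction to gains."""
    leaf = sum(y) / len(y)
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return leaf
    parent, best = sse(y), None
    # Only a random subset of the features is considered at each split.
    for f in rng.sample(range(len(X[0])), min(n_try, len(X[0]))):
        for t in sorted(set(row[f] for row in X)):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    if best is None:
        return leaf
    gain, f, t = best
    gains[f] += gain
    lo = [i for i, row in enumerate(X) if row[f] <= t]
    hi = [i for i, row in enumerate(X) if row[f] > t]
    args = (rng, gains, depth + 1, max_depth, min_leaf, n_try)
    return (f, t,
            build_tree([X[i] for i in lo], [y[i] for i in lo], *args),
            build_tree([X[i] for i in hi], [y[i] for i in hi], *args))

def fit_forest(X, y, n_trees=20, seed=0):
    """Fit trees on bootstrap samples; return the trees and importances."""
    rng = random.Random(seed)
    gains = [0.0] * len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng, gains))
    total = sum(gains)
    return trees, [g / total for g in gains]

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's mean target."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def predict_forest(trees, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(t, row) for t in trees) / len(trees)

# Synthetic monthly data (assumption): intake is driven by houses sold plus noise.
FEATURES = ["houses_sold", "unemployment_rate", "interest_rate"]
data_rng = random.Random(42)
X = [[data_rng.uniform(1000, 5000), data_rng.uniform(4, 15), data_rng.uniform(0, 5)]
     for _ in range(120)]
y = [0.5 * row[0] + data_rng.gauss(0, 100) for row in X]

trees, importance = fit_forest(X, y)
low = predict_forest(trees, [1500, 8.0, 2.0])
high = predict_forest(trees, [4500, 8.0, 2.0])

for name, score in sorted(zip(FEATURES, importance), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
print(f"predicted intake at 1500 sales: {low:.0f}, at 4500 sales: {high:.0f}")
```

In practice an established library implementation would normally be preferred over hand-written trees; the sketch only makes the bootstrap sampling, random feature selection and importance ranking explicit.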

The series of experiments conducted and the body of research analysed have presented several valuable insights into the housing market and the factors that influence it, the modelling techniques that can be applied to intake prediction, and the key factors that influence intake. The overall conclusion of the study is that the null hypothesis has been rejected and that intake of ‘first registration’ applications in the Property Registration Authority can be predicted through analysis of historical intake data and external factors. However, it is acknowledged that further work will be required to develop a data gathering and analysis process that can be operationalised in the PRA context.