Application of Cluster Analysis to Segment the Home Heating Customers of an Oil Distribution Business in Ireland

Niall Ennis, Dublin Institute of Technology

Document Type: Theses, Masters

Successfully submitted in partial fulfilment of the requirements of Dublin Institute of Technology for the degree of MSc. in Computing (Data Analytics) 2016.


Oil is the largest component of the home heating market in Ireland. The oil market is highly competitive, with little opportunity to differentiate on price or product. Company A is a large importer of oil into Ireland and supplies the home heating market. To gain a competitive advantage, Company A wants to develop a targeted marketing strategy. The major challenge in implementing this strategy is the lack of specific knowledge about the customers in the market. An experiment was designed to identify the key customer segments. Three datasets were sourced: transactional data, demographic data and weather data. The datasets were integrated and cleansed, the attributes were reduced through principal component analysis (PCA), and finally the data was aggregated. K-means clustering was then applied to the dataset. The cluster solution was validated and the clusters found were profiled. The cluster centroids were decomposed back to their original attributes, in terms of both labels and scale, because the original information provided a deeper understanding of the segments. The validity of the cluster solution was further supported by the successful implementation of a decision tree experiment. The algorithm identified observations that lay on the border of two clusters and had been assigned to one of them only because k-means takes a hard clustering approach. Fourteen usable segments were produced. The domain experts in Company A reviewed the attributes in each segment as well as the overall segmentation solution. They found that the segments provide them with knowledge and an opportunity to develop targeted strategies. There is now “science” in the development of marketing strategies, and knowledge has been discovered in the databases of Company A.
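The clustering and profiling steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: it uses only the Python standard library, skips the PCA and aggregation steps, and runs on six made-up customers. The attribute names (litres_per_order, orders_per_year), the helper functions and the choice of k = 2 are all assumptions made for illustration. The sketch standardises the attributes, runs a plain k-means, and then converts the centroids back to the original attribute names and units so that each segment can be read in business terms.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Scale each column to zero mean and unit variance; also return the means and std devs."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    scaled = [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]
    return scaled, mus, sds

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Plain k-means with hard assignment: each point goes to its nearest centroid."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable, so the centroids are final
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [mean(c) for c in zip(*members)]
    return centroids, labels

def to_original_units(centroids, mus, sds, names):
    """Undo the standardisation so centroids carry original labels and scale."""
    return [{n: c * s + m for n, c, m, s in zip(names, cen, mus, sds)}
            for cen in centroids]

# Hypothetical attributes and toy values, invented purely for illustration.
names = ["litres_per_order", "orders_per_year"]
customers = [
    [480, 2], [500, 2], [520, 3],     # small, infrequent orders
    [980, 6], [1000, 7], [1020, 6],   # large, frequent orders
]

scaled, mus, sds = standardize(customers)
centroids, labels = kmeans(scaled, k=2)
profiles = to_original_units(centroids, mus, sds, names)

for cluster_id, profile in enumerate(profiles):
    print(cluster_id, {name: round(value, 1) for name, value in profile.items()})
```

Because k-means assigns every observation to exactly one cluster, a customer lying between two centroids still receives a single label. That is the hard clustering behaviour the abstract refers to when it mentions observations on the border of two clusters.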