Document Type

Theses, Masters


This item is available under a Creative Commons License for non-commercial use only



Publication Details

A dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computing (Data Analytics), 16 June 2019


Basketball teams at all levels of the game invest considerable time and effort in collecting, segmenting, and analysing footage from their upcoming opponents' previous games. This analysis helps teams identify and exploit the potential weaknesses of their opponents and is commonly cited as one of the key elements required to achieve success in the modern game. The growing importance of this type of analysis has prompted research into the application of computer vision and audio classification techniques to help teams classify scoring sequences and key events using game footage. However, this research tends to focus on classifying scenes based on information from a single sensory source (visual or audio) and fails to analyse the wealth of multi-sensory information available within the footage. This dissertation aims to demonstrate that analysing the full range of audio and visual features contained in broadcast game footage through a multi-sensory deep learning architecture produces a more effective key scene classification system than a single-sense model. Additionally, this dissertation explores the performance impact of training the audio component of a multi-sensory architecture using different representations of the audio features.