Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence



Publication Details

IMVIP 2019: Irish Machine Vision & Image Processing, Technological University Dublin, Dublin, Ireland, August 28-30.


Many systems, such as video surveillance and health monitoring systems, depend on detecting the actions and activities performed by humans. However, published labelled human action datasets for training supervised machine learning models are limited in number and expensive to produce. Transfer learning can help to address this issue by re-using the knowledge of existing trained models in combination with minimal training data from the new target domain. This paper investigates video feature representations and machine learning algorithms for transfer learning in multi-class action recognition from videos. Using four labelled datasets from the human action domain, we apply two SVM-based transfer learning algorithms: adaptive support vector machine (A-SVM) and projective model transfer SVM (PMT-SVM). We compare two widely used video feature representations, space-time interest points (STIP) with Histograms of Oriented Gradients (HOG) and Histograms of Optical Flow (HOF), and improved dense trajectories (iDT), to explore which is more suitable for action recognition from videos using transfer learning. Our results show that (i) A-SVM and PMT-SVM can help transfer action knowledge across multiple datasets with limited labelled training data; (ii) A-SVM outperforms PMT-SVM when the target dataset is derived from realistic non-lab environments; and (iii) iDT is the more suitable of the two representations for transfer learning in action recognition.
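
As an illustration of the kind of model transfer the abstract refers to, the following is a minimal sketch of a linear, binary A-SVM. It assumes the commonly cited formulation in which the target weights w are regularised toward the source weights w_src by minimising 0.5 * ||w - gamma * w_src||^2 + C * sum of hinge losses on the target samples. The function name, the parameters gamma, C, lr and epochs, and the plain subgradient-descent solver are assumptions made for this example; the paper's own multi-class setup, features and solver are not reproduced here.

```python
import numpy as np


def train_linear_asvm(X, y, w_src, gamma=1.0, C=1.0, lr=0.01, epochs=500):
    """Hypothetical linear A-SVM trained by subgradient descent.

    Minimises 0.5 * ||w - gamma * w_src||^2 + C * sum_i max(0, 1 - y_i * w.x_i),
    where X holds target-domain samples (one per row), y holds labels in
    {-1, +1}, and w_src is the weight vector of an already trained source model.
    """
    w_src = np.asarray(w_src, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Start from the (scaled) source model, so that with no target
    # evidence against it the source classifier is returned unchanged.
    w = gamma * w_src.copy()

    for _ in range(epochs):
        margins = y * (X @ w)
        violated = margins < 1  # samples that currently incur hinge loss

        # Subgradient: pull toward the source weights, push on violators.
        hinge_grad = -(y[violated][:, None] * X[violated]).sum(axis=0)
        grad = (w - gamma * w_src) + C * hinge_grad

        w -= lr * grad

    return w


def predict(X, w):
    """Return labels in {-1, +1} for each row of X."""
    return np.where(np.asarray(X, dtype=float) @ w >= 0, 1, -1)
```

When the target samples already satisfy the margin under the source model, the returned weights equal gamma * w_src; when they do not, the weights move only as far from the source as the hinge loss requires. This is the sense in which a small labelled target set can adapt an existing classifier rather than train one from scratch.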