Document Type
Dissertation
Rights
This item is available under a Creative Commons License for non-commercial use only
Disciplines
Computer Sciences
Abstract
Vast amounts of sound data are transmitted every second over digital networks. VoIP services and cellular networks transmit speech data in ever greater volumes. Objective sound quality models provide an essential function: measuring the quality of this data in real time. However, these models can lose accuracy under the various degradations that occur over networks. This research uses machine learning techniques to create one support vector regression and three neural network mapping models for use with ViSQOLAudio. Each of the mapping models (including ViSQOL and ViSQOLAudio) is tested against two separate speech datasets in order to compare accuracy results. Despite a slight cost in positive linear correlation and a slight increase in error rate, the study finds that a neural network mapping model with ViSQOLAudio provides the highest levels of accuracy in objective speech quality measurement. In some cases, the accuracy levels can be more than double those of ViSQOL. The research demonstrates that ViSQOLAudio can be altered to provide an objective speech quality metric that is more accurate than ViSQOL.
Recommended Citation
McNally, J. (2017) Towards improving ViSQOL (Virtual Speech Quality Objective Listener) Using Machine Learning Techniques. Dissertation, M.Sc. in Computing (Advanced Software Development), DIT, 2017.
Publication Details
A dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computing (Advanced Software Development) July 2017.