Document Type

Theses, Masters

Master Thesis



Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence


Computer Sciences

Publication Details

Thesis submitted for the degree of Master of Philosophy, School of Enterprise Computing and Digital Transformation, Technological University Dublin, January 2023.


Programming has become an important skill in today’s world and is taught widely, both in traditional settings and online. Instructors need to assess increasing amounts of student work. Unit testing can contribute to the automation of the grading process; however, it cannot assess code structure or style, give credit for partially correct source code, or differentiate between levels of achievement. This thesis investigates the use of machine learning methods for assessing the correctness and quality of code, with the ultimate goal of assisting instructors in the grading process. In this research, we applied nine different machine learning algorithms to three distinct types of feature sets, created from over five hundred thousand student code submissions. Prediction scores for some of the models show that the content of the submissions can be assessed in an automated manner. Used alongside unit testing, this approach has the potential to give instructors an automated, source-code-based way of assigning more finely differentiated grades than is possible with unit testing alone. This dissertation reports several findings that support the validity of using machine learning, with features derived from source code tokens, to evaluate the correctness of computer programs. Further, it shows how this approach could contribute to automated assessment with multi-valued measures of quality (grading schemes), in contrast to the binary pass/fail measure associated with unit testing.



