1.2 Computer and Information Science, Computer Sciences
Programming has become an important skill in today’s world and is taught widely in both traditional and online settings, so instructors need to grade increasing amounts of student work. Unit testing can contribute to the automation of the grading process, but it cannot assess the structure or partial correctness of code, which finely differentiated grading requires. This paper builds on previous research that investigated machine learning models for determining the correctness of programs from token-based features of source code and found that some such models can successfully classify source code with respect to whether it passes unit tests. This paper makes two further contributions. First, those results are scrutinized under conditions of varying similarity between the code instances used for model training and testing, for a better understanding of how well the models generalize. It was found that the models do not generalize beyond groups of code instances performing very similar tasks (corresponding to similar coding assignments). Second, selected binary classification models are used as a base for multi-class prediction with two different methods. Both methods exhibit prediction success well above the random baseline, with the potential to contribute to automated assessment with multi-valued measures of quality (grading schemes), in contrast to the binary pass/fail measure associated with unit testing.
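To make the abstract's pipeline concrete, the following is a minimal sketch of the general idea of classifying code from token-based features: extract lexical token counts from each submission and assign a label by similarity to per-class centroids. All names and modelling choices here (the `tokenize`-based feature extractor, the nearest-centroid classifier, the toy snippets) are illustrative assumptions, not the models or features used in the paper.

```python
import io
import math
import tokenize
from collections import Counter


def token_features(src):
    """Count lexical token types in a Python snippet.

    A toy stand-in for token-based features of source code; the
    paper's actual feature set is not reproduced here.
    """
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        # exact_type distinguishes individual operators (PLUS vs MINUS).
        counts[tokenize.tok_name[tok.exact_type]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class NearestCentroid:
    """Minimal classifier: one token-count centroid per class label.

    Binary labels ("pass"/"fail") give a binary classifier; adding
    more labels turns the same mechanism into multi-class prediction.
    """

    def fit(self, snippets, labels):
        self.centroids = {}
        for src, y in zip(snippets, labels):
            self.centroids.setdefault(y, Counter()).update(token_features(src))
        return self

    def predict(self, src):
        f = token_features(src)
        return max(self.centroids, key=lambda y: cosine(f, self.centroids[y]))


# Hypothetical training data: a correct and an incorrect submission.
clf = NearestCentroid().fit(
    ["def add(a, b):\n    return a + b\n",
     "def add(a, b):\n    return a - b\n"],
    ["pass", "fail"],
)
```

A new submission that uses the same operator as the correct example, e.g. `clf.predict("def plus(p, q):\n    return p + q\n")`, is assigned the `"pass"` label, because its token-count vector is closer to the pass centroid.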
Tarcsay, Botond; Perez-Tellez, Fernando; and Vasic, Jelena, "Using Machine Learning to Identify Patterns in Learner-Submitted Code for the Purpose of Assessment" (2023). Conference papers. 401.
Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.