Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence
Within English second language acquisition there is an enthusiasm for using authentic texts as learning materials in classroom and online settings. This enthusiasm, however, is tempered by the difficulty of finding authentic texts at suitable levels of comprehension difficulty for specific groups of learners. An automated way to rate the comprehension difficulty of a text would make finding suitable texts a much more manageable task. While readability metrics have been in use for over 50 years, they capture only a small part of what constitutes comprehension difficulty. In this paper we examine other features of texts that are related to comprehension difficulty and assess their usefulness in building automated prediction models. We investigate readability metrics, vocabulary-based features, and syntax-based features, and show that the best prediction accuracies are achieved with a combination of all three.
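As a rough illustration of the first two kinds of feature named above, the following minimal sketch computes one classic readability metric (Flesch Reading Ease) and one simple vocabulary-based feature (type-token ratio). Both choices are assumptions made for illustration only: the abstract does not say which metrics or features the paper uses, and the function names and the vowel-group syllable heuristic are hypothetical.

```python
import re

WORD_PATTERN = re.compile(r"[A-Za-z']+")


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD_PATTERN.findall(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def type_token_ratio(text: str) -> float:
    """Vocabulary diversity: distinct word forms divided by total words."""
    words = [w.lower() for w in WORD_PATTERN.findall(text)]
    if not words:
        raise ValueError("text must contain at least one word")
    return len(set(words)) / len(words)


def extract_features(text: str) -> dict:
    """Collect the illustrative features into one record for a prediction model."""
    return {
        "flesch_reading_ease": flesch_reading_ease(text),
        "type_token_ratio": type_token_ratio(text),
    }


if __name__ == "__main__":
    print(extract_features("The cat sat. The dog ran."))
```

Syntax-based features, such as parse tree depth, would need a parser and are left out of this sketch. Feature records like the one returned by `extract_features` could then be passed to any standard classifier or regressor.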
Mac Namee, B., Kelleher, J. & Fitzpatrick, N. (2017). Assessing the usefulness of different feature sets for predicting the comprehension difficulty of text. Proceedings of the 25th Irish Conference on Artificial Intelligence and Cognitive Science (AICS 2017), Dublin, Ireland, December 7 - 8. Published on CEUR-WS: 13-Apr-2018, vol. 2086.