Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence
Statistics, Computer Sciences
Labels representing value judgements are commonly elicited using an interval scale of absolute values. Data collected in this manner is not always reliable. Psychologists have long recognized a number of biases to which many human raters are prone, and which result in disagreement among raters as to the true gold-standard rating of any particular object. We hypothesize that the issues arising from rater bias may be mitigated by treating the data received as an ordered set of preferences rather than a collection of absolute values. We experiment on real-world and artificially generated data, finding that treating label ratings as ordinal rather than interval data results in increased inter-rater reliability. This finding has the potential to improve the efficiency of data collection for applications such as Top-N recommender systems, where we are primarily interested in the ranked order of items rather than the absolute scores they have been assigned.
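The intuition behind the hypothesis can be illustrated with a minimal sketch. It is not the authors' experimental method: the two raters and their scores are invented for illustration, the helper names `to_ranks` and `pearson` are hypothetical, and Pearson correlation is used only as a simple stand-in for an inter-rater reliability statistic, since the abstract does not say which measure was used. The second rater scores more harshly and on a compressed scale, yet orders the items identically.

```python
from statistics import mean


def to_ranks(scores):
    """Return average ranks (1 = lowest score); tied scores share a rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j over the run of items tied with the item at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ratings of the same five items on a 1-10 scale.
rater_a = [2, 4, 6, 8, 10]
rater_b = [1, 2, 3, 5, 10]  # harsher, non-linear use of the scale, same order

# Treated as interval data, the raters disagree on absolute values.
mean_abs_gap = mean([abs(a - b) for a, b in zip(rater_a, rater_b)])
interval_agreement = pearson(rater_a, rater_b)

# Treated as ordinal data, only the ordering matters.
ordinal_agreement = pearson(to_ranks(rater_a), to_ranks(rater_b))

print("mean absolute gap:", mean_abs_gap)
print("agreement on raw scores:", interval_agreement)  # below 1
print("agreement on ranks:", ordinal_agreement)        # identical orderings give 1.0
```

In this toy case the raw scores differ on four of the five items, while the rankings coincide exactly, which is the property a Top-N recommender cares about.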
O'Neill, J., Delaney, S.J. & MacNamee, B. (2017). Rating by ranking: An improved scale for judgement-based labels. RecSys '17: Proceedings of the Eleventh ACM Conference on Recommender Systems, pp. 384-385, Como, Italy, August 27-31. doi:10.1145/3109859.3109961