Conference papers

Is it worth it? Budget-related evaluation metrics for model selection

Filip Klubicka, Technological University Dublin
Giancarlo Salton, Technological University Dublin
John D. Kelleher, Technological University Dublin

Document Type

Conference Paper

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Disciplines

Computer Sciences, Information Science, Linguistics

Publication Details

Published in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Abstract

Projects that set out to create a linguistic resource often do so by using a machine learning model that pre-annotates or filters the content that goes through to a human annotator, before going into the final version of the resource. However, available budgets are often limited, and the amount of data that is available exceeds the amount of annotation that can be done. Thus, in order to optimize the benefit from the invested human work, we argue that the decision on which predictive model one should employ depends not only on generalized evaluation metrics, such as accuracy and F-score, but also on the gain metric. The rationale is that the model with the highest F-score may not necessarily have the best separation and sequencing of predicted classes, thus leading to the investment of more time and/or money on annotating false positives, yielding zero improvement of the linguistic resource. We exemplify our point with a case study, using real data from a task of building a verb-noun idiom dictionary. We show that in our scenario, given the choice of three systems with varying F-scores, the system with the highest F-score does not yield the highest profits. In other words, we show that the cost-benefit trade-off can be more favorable if a system with a lower F-score is employed.
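
The abstract's central claim, that a higher F-score need not mean a higher return on a fixed annotation budget, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy labels and scores, the two hypothetical systems, the function names, and the simplified profit formula (a fixed value per true positive annotated minus a fixed cost per annotation). The paper's own gain metric is not defined in the abstract and may differ.

```python
def f1_score(scores, labels, threshold=0.5):
    """F-score of the positive class when scores >= threshold count as positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def profit_at_budget(scores, labels, budget, value, cost):
    """Hypothetical profit: annotate the `budget` highest-scored items,
    earn `value` per true positive found, pay `cost` per item annotated."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    annotated = ranked[:budget]
    hits = sum(y for _, y in annotated)
    return value * hits - cost * len(annotated)


# Toy gold labels: 4 positives among 10 candidates (illustrative only).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical system A: finds every positive, but ranks two false positives first.
scores_a = [0.8, 0.7, 0.6, 0.55, 0.95, 0.9, 0.1, 0.1, 0.1, 0.1]
# Hypothetical system B: misses one positive, but ranks true positives first.
scores_b = [0.9, 0.85, 0.8, 0.4, 0.55, 0.3, 0.2, 0.1, 0.1, 0.1]

for name, scores in [("A", scores_a), ("B", scores_b)]:
    f1 = f1_score(scores, labels)
    profit = profit_at_budget(scores, labels, budget=3, value=5, cost=1)
    print(f"system {name}: F1={f1:.2f}, profit at budget 3 = {profit}")
```

With these made-up numbers, system A has the higher F-score (0.80 against 0.75), yet with a budget of three annotations it yields a profit of 2 while system B yields 12, because B places its true positives at the top of the ranking.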

DOI

https://doi.org/10.21427/8yb2-0f25

Recommended Citation

Klubicka, F., Salton, G. & Kelleher, J. D. (2018). Is it worth it? Budget-related evaluation metrics for model selection, Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018. doi:10.21427/8yb2-0f25

Funder

SFI

Included in

Computational Engineering Commons, Digital Humanities Commons, Other Computer Engineering Commons
