Conference papers

Sweetening the Dataset : Using Active Learning to Label Unlabelled Datasets

Rong Hu, Technological University Dublin
Brian Mac Namee, Technological University Dublin
Sarah Jane Delany, Technological University Dublin

Document Type

Article

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Publication Details

In Proceedings of the 19th Irish Conference on Artificial Intelligence and Cognitive Science (AICS '08), 2008.

Abstract

Supervised machine learning approaches assume the existence of a large collection of manually labelled examples of the problem under consideration. However, in many cases such a collection does not exist, and creating one is time-consuming and expensive. This can be a barrier to the use of supervised learning in certain situations, particularly when doubt as to whether the system will work makes the cost of creating a dataset unjustifiable. Active learning is a machine learning technique that has been used widely to create classification systems in the absence of large numbers of labelled examples, but it can also be used to create such collections. This paper describes a system that uses active learning to label large collections of unlabelled data. We show that the system can create an accurately labelled dataset approximately 10 times the size of the set of examples manually labelled by an expert. The experiments described are based on recipe data from the 1st Computer Cooking Contest, to be held at ECCBR'08, and focus on identifying those recipes in the set that are desserts.

DOI

https://doi.org/10.21427/9w8z-hc83

Recommended Citation

Hu, R., Mac Namee, B. & Delany, S.J. (2008) Sweetening the Dataset : Using Active Learning to Label Unlabelled Datasets. Proceedings of the 19th Irish Conference on Artificial Intelligence and Cognitive Science (AICS '08), UCC, Cork. doi:10.21427/9w8z-hc83

Included in

Engineering Commons
