Document Type
Conference Paper
Rights
Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence
Abstract
The problem of concept drift has recently received considerable attention in machine learning research. One important practical problem where concept drift needs to be addressed is spam filtering. The literature on concept drift shows that among the most promising approaches are ensembles, and a variety of techniques for ensemble construction has been proposed. In this paper we compare the ensemble approach to an alternative lazy learning approach to concept drift whereby a single case-based classifier for spam filtering keeps itself up-to-date through a case-base maintenance protocol. We present an evaluation that shows that the case-base maintenance approach is more effective than a selection of ensemble techniques. The evaluation is complicated by the overriding importance of False Positives (FPs) in spam filtering. The ensemble approaches can have very good performance on FPs because it is possible to bias an ensemble more strongly away from FPs than it is to bias the single classifier. However, this comes at considerable cost to the overall accuracy.
Recommended Citation
Delany, S.J., Cunningham, P. & Tsymbal, A. (2006) A comparison of Ensemble and Case-base Maintenance Techniques for Handling Concept Drift in Spam Filtering, In: G. Sutcliffe and R. Goebel (eds.), Proc. 19th Int. Conf. on Artificial Intelligence FLAIRS'2006, AAAI Press, pp. 340-345.
Funder
Enterprise Ireland
Publication Details
In: G. Sutcliffe and R. Goebel (eds.), Proc. 19th Int. Conf. on Artificial Intelligence FLAIRS'2006