Document Type
Conference Paper
Rights
Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence
Disciplines
Computer Sciences, Information Science
Abstract
Silhouette is one of the most popular and effective internal measures for evaluating clustering validity. Simplified Silhouette is a computationally cheaper variant of Silhouette. To date, however, Simplified Silhouette has not been systematically analysed in the context of a specific clustering algorithm. This paper analyses the application of Simplified Silhouette to the evaluation of k-means clustering validity and compares it with the k-means Cost Function and the original Silhouette from both theoretical and empirical perspectives. The theoretical analysis shows that Simplified Silhouette is mathematically related to both the k-means Cost Function and the original Silhouette; the empirical analysis shows that its performance is comparable to that of the original Silhouette while being much faster to compute. Based on our analysis, we conclude that, for a given dataset, the k-means Cost Function remains the most valid and efficient measure for evaluating k-means clusterings that share the same k value, but that Simplified Silhouette is more suitable than the original Silhouette for selecting the best result among k-means clusterings with different k values.
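To make the two cheaper measures concrete, the following is a minimal sketch and not the paper's own code. It assumes the commonly used definitions, which the abstract itself does not spell out: the k-means Cost Function is taken as the sum of squared Euclidean distances from each point to its own cluster centroid, and Simplified Silhouette as the mean of (b - a) / max(a, b), where a is a point's distance to its own centroid and b its distance to the nearest other centroid. All function and variable names are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def centroids_of(points, labels):
    """Mean point of each cluster, keyed by cluster label."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {
        lab: tuple(sum(coord) / len(members) for coord in zip(*members))
        for lab, members in groups.items()
    }


def kmeans_cost(points, labels, centroids):
    """Assumed cost: sum of squared distances to the assigned centroid."""
    return sum(euclidean(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def simplified_silhouette(points, labels, centroids):
    """Assumed Simplified Silhouette: centroid distances replace the
    average pairwise distances used by the original Silhouette."""
    if len(centroids) < 2:
        raise ValueError("at least two clusters are required")
    total = 0.0
    for p, lab in zip(points, labels):
        a = euclidean(p, centroids[lab])
        b = min(euclidean(p, c) for other, c in centroids.items() if other != lab)
        denom = max(a, b)
        # A point that coincides with both centroids contributes 0.
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


# Toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # mixes the two pairs

for name, labels in (("good", good_labels), ("bad", bad_labels)):
    cents = centroids_of(points, labels)
    print(name, kmeans_cost(points, labels, cents),
          simplified_silhouette(points, labels, cents))
```

Under these assumed definitions each point needs only k centroid distances, so one pass costs on the order of n * k distance computations, whereas the original Silhouette needs distances between all pairs of points. On the toy data, the partition that keeps nearby points together has both a lower cost and a higher Simplified Silhouette than the mixed partition.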
DOI
https://doi.org/10.1007/978-3-319-62416-7_21
Recommended Citation
Franco-Penya, H. et al. (2017) An Analysis of the Application of Simplified Silhouette to the Evaluation of k-means Clustering Validity. 13th International Conference on Machine Learning and Data Mining MLDM 2017, July 15-20, 2017, New York, USA.
Publication Details
13th International Conference on Machine Learning and Data Mining MLDM 2017, July 15-20, 2017, New York, USA