№2, 2022

COMPARATIVE ANALYSIS OF CLUSTER VALIDITY INDICES IN TERMS OF CONSISTENCY

Leyla R. Mammadova

Cluster analysis is a widely used and central problem in Data Mining, the most important stage of the Knowledge Discovery from Data (KDD) process. Cluster analysis has three main tasks: determining the optimal number of clusters, applying clustering algorithms, and evaluating the quality of the resulting clustering. Evaluating clustering quality is one of its most important steps, and a number of indices have been proposed for this purpose. Analysis shows, however, that these indices often give inconsistent results, so the indices themselves have recently been studied extensively and new ones continue to be proposed. This article examines a number of internal and external evaluation indices. The k-means, k-medoids, agglomerative hierarchical, BIRCH and OPTICS algorithms are applied to data sets of different sizes, and the clustering results are assessed with these internal and external indices and compared. The experiments show that the Ac, Pr, Rc and F-m indices give similar results when determining groups in a given clustering structure (pp. 24-39).

Keywords: Cluster analysis, Clustering algorithms, Evaluation indices
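
As an illustration of what an external evaluation index measures, the following minimal sketch computes four pair-counting scores for a clustering against reference labels. It reads Ac, Pr, Rc and F-m as accuracy, precision, recall and F-measure and assumes pair-counting definitions; both are assumptions, and the article may define the indices differently. The function name `pair_counting_indices` and the example labels are hypothetical.

```python
from itertools import combinations


def pair_counting_indices(true_labels, pred_labels):
    """Return (accuracy, precision, recall, f_measure) over all point pairs.

    Assumed pair-counting definitions (not taken from the article):
    a pair is a true positive when both labelings put the two points
    in the same group, a true negative when both separate them.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f_measure


# Hypothetical example: six points, two reference groups, an imperfect clustering.
reference = [0, 0, 0, 1, 1, 1]
clustering = [1, 1, 0, 0, 0, 0]
ac, pr, rc, fm = pair_counting_indices(reference, clustering)
print(f"Ac={ac:.3f} Pr={pr:.3f} Rc={rc:.3f} F-m={fm:.3f}")
```

Because the scores are computed over pairs of points, they do not depend on how cluster labels are numbered, so a clustering that matches the reference up to a relabeling scores 1.0 on all four.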