№2, 2022

INVESTIGATION OF CLUSTERING AND CLASSIFICATION METHODS FOR INTELLECTUAL ANALYSIS OF LOG FILES

Babak R. Nabiyev, Fuad I. Ahmadov

The application of information technology in all areas of life has led to a wider spread of cybercrime. In modern industrial control systems and cyber-physical systems, log files play an important role in detecting cyber incidents and in identifying and preventing threats and anomalies. However, the large volume of log files generated in these systems greatly complicates the extraction of useful information from them, which highlights the need for intellectual analysis of log files. To this end, this article explores a number of clustering and classification methods and algorithms for the intellectual analysis of log files. The K-means, CURE, EM, kNN, Naive Bayes and decision tree (DT) algorithms are selected; the working principle of each is explained, and their application to the KDD CUP 99 data set is studied and compared (pp. 48-60).

Keywords: Log file, K-means, CURE algorithm, Naive Bayes, EM algorithm, kNN, Decision tree, KDD CUP 99