№2, 2010

THE ANALYSIS AND CLASSIFICATION OF THE COMPUTER NETWORKS TRAFFIC

Shikhaliyev R.H.

The paper is devoted to the analysis of network traffic and to modeling its classification, both of which are important for monitoring computer networks. To model network traffic classification, an unsupervised machine learning method based on the k-means clustering algorithm is proposed. (pp. 15–23)

Keywords: network traffic, clustering, k-means algorithm.
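
As a rough illustration of the kind of unsupervised approach the abstract names, the following is a minimal sketch of plain k-means (Lloyd's algorithm) applied to per-flow feature vectors. It is not the author's implementation: the function name `kmeans`, the two features (mean packet size in bytes, flow duration in seconds), and the sample values in `flows` are all hypothetical and chosen only to make the example self-contained. In practice, features on such different scales would usually be normalized before clustering.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples.

    Returns (labels, centroids). Illustrative sketch only.
    """
    rng = random.Random(seed)
    # Initial centroids: k input points chosen at random (seeded for repeatability).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point is assigned to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical flow features: (mean packet size in bytes, duration in seconds).
flows = [
    (80, 0.4), (95, 0.6), (110, 0.5),            # short flows with small packets
    (1400, 30.0), (1350, 28.0), (1450, 35.0),    # long flows with large packets
]

labels, centroids = kmeans(flows, k=2)
for flow, label in zip(flows, labels):
    print(flow, "-> cluster", label)
```

The cluster indices themselves carry no meaning; mapping a cluster to an application class would require a separate labeling step that this sketch does not attempt.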