AZERBAIJAN NATIONAL ACADEMY OF SCIENCES
ABOUT THE METHODS OF COLLECTING, STORING AND ANALYZING BIG NETWORK TRAFFIC
Ramiz H. Shikhaliyev

Collecting and storing the network traffic of computer networks (CNs) is one of the major stages of the monitoring process. However, collecting and retaining the full network traffic of modern CNs is a very complex problem. As the speed and scale of CNs and the volume of network traffic grow, storing a single day of traffic may require petabytes of storage. Various methods exist for collecting and storing network data, and choosing the right one can significantly reduce the size of the collected data and, accordingly, the storage required. The article examines the issues of network data collection and storage with the use of Big Data technology (pp. 48-52).
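
To give a sense of the storage scale mentioned above, the following is a rough back-of-the-envelope sketch, not a calculation taken from the article. It assumes full packet capture on a single link, a hypothetical helper named `bytes_per_day`, illustrative link rates of 1, 10 and 100 Gbit/s, and decimal units (1 PB = 10^15 bytes). Real links are rarely fully utilized, so the `utilization` parameter is likewise an assumption to be adjusted.

```python
SECONDS_PER_DAY = 86_400


def bytes_per_day(link_gbps: float, utilization: float = 1.0) -> float:
    """Bytes captured in 24 hours on one link (hypothetical helper).

    link_gbps   -- nominal link rate in Gbit/s (assumed value)
    utilization -- average fraction of the link rate actually used (assumed)
    """
    bits_per_second = link_gbps * 1e9 * utilization
    return bits_per_second / 8 * SECONDS_PER_DAY


if __name__ == "__main__":
    # Illustrative link rates only; none of these figures come from the article.
    for gbps in (1, 10, 100):
        petabytes = bytes_per_day(gbps) / 1e15
        print(f"{gbps:>3} Gbit/s, fully utilized: {petabytes:.4f} PB/day")
```

Under these assumptions a single fully utilized 100 Gbit/s link already produces slightly more than one petabyte of raw packet data per day, which is why the choice of collection method matters for the required storage.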

Keywords: computer network, monitoring, network traffic, network traffic collection, network traffic storage, analysis of network traffic, Big Data technology
DOI: 10.25045/jpit.v07.i2.06