№2, 2014

"BIG DATA" PHENOMENON: CHALLENGES AND OPPORTUNITIES

Rasim M. Alguliyev, Makrufa S. Hajirahimova

This paper is devoted to the "Big Data" phenomenon. It explores the term "Big Data" and the opportunities, challenges, and existing approaches associated with this technology. The 3V concept (volume, velocity, variety) and the tasks of big data mining are analyzed, along with the existing software and hardware products that implement this concept. (pp. 3-16)

Keywords: big data, data science, big data analytics, NoSQL, MapReduce, Hadoop, OLAP