№1, 2017

CONCEPTUAL BIG DATA ARCHITECTURE FOR THE OIL AND GAS INDUSTRY

Ramiz M. Aliguliyev, Yadigar N. Imamverdiyev

Big Data technologies provide important approaches and tools for building data management systems in the oil and gas industry. The paper proposes a conceptual architecture for a hybrid Big Data platform for storing and analyzing, in real time, large volumes of data gathered from oil and gas industry systems, using deep analytics and machine learning methods on distributed cluster systems. We also consider how to select the necessary tools from the Hadoop ecosystem for building a viable Big Data solution (pp. 3-13).

Keywords: oil and gas industry, Big Data, Hadoop, Apache Spark, MapReduce, Big Data analytics, Big Data architecture.
DOI: 10.25045/jpit.v08.i1.01