№1, 2020

ANALYSIS OF THE SEARCH ALGORITHMS UTILIZED IN BIG DATA

Rena T. Gasimova, Rahim N. Abbaslı

Digital materials include continuously growing collections of text documents, databases, structured and unstructured image, sound and graphic materials, software and web pages. The increasing pace at which digital information is generated has created a need to analyze the structure of input files and to produce relevant, meaningful output faster. The article examines the features of search algorithms, their shortcomings and the use cases in which their advantages can be put to best use. It finds that algorithms based on artificial intelligence are needed to solve the problems of improving search quality as the amount of data and the intensity of user queries increase (pp.98-108).

Keywords: digital heritage, digital data, Big Data, Big Data analytics, search engines, information retrieval, artificial intelligence, machine learning.
DOI: 10.25045/jpit.v11.i1.12