№1, 2025

DEVELOPMENT OF A CONCEPTUAL MODEL FOR ENSURING FAULT TOLERANCE OF SOFTWARE SYSTEMS

Tamilla Bayramova

Digital transformation is the comprehensive integration of information technology into all areas of human activity. Almost every aspect of life, from personal communication to global economic processes, now depends on digital technologies. Banking, manufacturing, healthcare, education, and transportation all rely on software to improve efficiency, optimize processes, and provide new services. Modern applications are growing more complex and consist of many interconnected components, which increases the likelihood of errors and failures. As a result, research on software reliability is becoming steadily more relevant. This article examines the differences and relationships between information systems and software systems and analyzes in depth the key concepts of software reliability and fault tolerance. The main approaches and strategies for ensuring fault tolerance are considered, including redundancy, backup, monitoring, duplication, load balancing, microservices, and error prediction and detection, together with their practical application. The aim of the study is to define the principles and methods of self-healing software and to analyze the risks associated with automatic responses to failures. To address the problem of ensuring fault tolerance in dynamic and complex software systems, a conceptual model for designing reliable, self-learning, and self-adaptive systems is proposed (pp. 35-46).

Keywords: Software system, Software reliability, Fault tolerant, Self-adaptive system, Conceptual model