Document Type: Research Paper


Aerospace Research Institute, Ministry of Science, Research and Technology, Tehran, Iran



Diversity plays an essential role in increasing the reliability of redundant systems, particularly in safety-critical and mission-critical applications. The onboard computers of satellites and the flight computers of aircraft, which are ultra-reliable systems, use diverse hardware platforms in their redundant architectures to address the common cause failure problem. Furthermore, their software is developed by separate teams on different software platforms to mitigate specification and design flaws and implementation mistakes. This paper focuses on modelling the diversity of redundant systems using Markov reliability analysis. The proposed scheme is explored in two types of applications: mission-critical applications (with long mission times) and safety-critical applications (with short mission times). Analytical and simulation results show the effectiveness of diversity in increasing the reliability of these systems. Because about ten percent of all failures appear as common cause failures, which limit the reliability improvement achievable with identical redundant modules, achieving ultra-reliability requires considering diversity in these systems.
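The effect of a ten percent common cause fraction can be illustrated with a minimal sketch. It is not the model developed in the paper: it assumes the standard beta-factor model, a 1-out-of-2 redundant pair with exponentially distributed failures, and the idealisation that full diversity drives the common cause fraction to zero. Under these assumptions the closed form below coincides with the solution of a three-state Markov chain (both modules working, one working, system failed). The function name, the failure rate, and the mission times are hypothetical values chosen only for illustration.

```python
import math


def duplex_reliability(failure_rate, mission_time, beta):
    """Reliability of a 1-out-of-2 pair under the beta-factor model.

    failure_rate: total failure rate of one module (per hour, assumed constant)
    mission_time: mission duration in hours
    beta: fraction of module failures that are common cause (0 <= beta <= 1)
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    lam_t = failure_rate * mission_time
    # Probability that no common cause failure occurs during the mission.
    no_common_cause = math.exp(-beta * lam_t)
    # Probability that one module survives its independent failures.
    module_ok = math.exp(-(1.0 - beta) * lam_t)
    # Probability that at least one of the two modules survives them.
    at_least_one = 1.0 - (1.0 - module_ok) ** 2
    return no_common_cause * at_least_one


if __name__ == "__main__":
    failure_rate = 1e-4  # hypothetical value, failures per hour
    for hours in (10.0, 1_000.0, 10_000.0):
        # beta = 0.1: identical modules; beta = 0.0: idealised full diversity.
        identical = duplex_reliability(failure_rate, hours, beta=0.1)
        diverse = duplex_reliability(failure_rate, hours, beta=0.0)
        print(f"{hours:>8.0f} h  identical={identical:.6f}  diverse={diverse:.6f}")
```

Setting beta to 1 reduces the pair to a single module, and setting it to 0 recovers the textbook parallel-system formula, which gives two quick sanity checks on the sketch.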

