Article type: Research Article

Authors

Assistant Professor, Aerospace Research Institute, Ministry of Science, Research and Technology, Tehran, Iran

Abstract

Diversity in both hardware and software plays an essential and unmatched role in increasing the reliability of redundant systems, especially in safety-critical and mission-critical applications. The onboard computers of satellites and the flight computers of spacecraft, which are ultra-reliable systems, use different hardware platforms in their redundant architectures to mitigate the common cause failure (CCF) problem. Furthermore, the software is developed by separate teams on different software platforms to mitigate specification flaws, design flaws, and implementation mistakes. This paper focuses on modelling the diversity of redundant architectures in space systems using CCF modelling and Markov reliability analysis. The proposed scheme is explored in two types of applications: mission-critical applications (with long mission times) and safety-critical applications (with short mission times). Analytical and simulation results show the effectiveness of diversity in increasing the reliability of these systems. Since a significant percentage of all failures appear as common cause failures, which limit the reliability improvement achievable with identical redundant modules, achieving ultra-reliability requires considering diversity in these systems.
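
To illustrate why common cause failures cap the benefit of identical redundancy, the sketch below uses the standard beta-factor model, in which a fraction beta of each unit's failure rate is shared by all units. This is not the paper's own model: the function name, the failure rate, the mission time, and the beta values assigned to "identical" and "diverse" designs are all hypothetical, chosen only to show the trend.

```python
import math


def parallel_reliability_beta(lam, beta, t, n=2):
    """Reliability of n identical units in parallel (1-out-of-n) at time t.

    Beta-factor assumption: each unit fails independently at rate
    (1 - beta) * lam, and a common cause fails all units at rate beta * lam.
    """
    independent_rate = (1.0 - beta) * lam
    common_rate = beta * lam
    # Probability that a single unit has failed from independent causes.
    unit_unreliability = 1.0 - math.exp(-independent_rate * t)
    # The system survives if at least one unit survives independent failures
    # AND no common cause event has occurred.
    return (1.0 - unit_unreliability ** n) * math.exp(-common_rate * t)


# Hypothetical numbers, for illustration only.
LAMBDA = 1e-5            # assumed unit failure rate, per hour
MISSION_HOURS = 10_000.0  # assumed mission time

CASES = [
    ("no CCF (beta = 0)", 0.0),
    ("diverse modules (assumed beta = 0.01)", 0.01),
    ("identical modules (assumed beta = 0.10)", 0.10),
]

for label, beta in CASES:
    r = parallel_reliability_beta(LAMBDA, beta, MISSION_HOURS)
    print(f"{label}: R = {r:.6f}")
```

Under these assumptions, lowering beta (the effect attributed here to diversity) moves the duplex system closer to the ideal independent case, while beta = 1 reduces it to the reliability of a single unit.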

Article title [English]

Evaluation of Diversity Effects on Increasing the Reliability of Space Systems

Authors [English]

  • Ghasem Kahe
  • Mehdi Alemi Rostami

Assistant Professor, Aerospace Research Institute, Ministry of Science, Research and Technology, Tehran, Iran

Keywords [English]

  • Reliability
  • Redundancy
  • Diversity