Many industrial systems have specific requirements derived from the applications they execute. In particular, the interaction of a Distributed Embedded Control System (DECS) with the real world imposes strict real-time and reliability requirements. For a system to be real-time, it has to produce a correct result within a bounded time. For a system to be reliable, it has to operate continuously throughout its mission time; when very high reliability is needed, Fault Tolerance (FT) techniques are used. Moreover, these systems are often deployed in dynamic environments where the operational conditions may change unpredictably. There is therefore increasing interest in creating DECSs that can modify their behaviour autonomously and dynamically in response to unexpectedly changing requirements or conditions. In recent years, there has been a growing trend towards using Ethernet as the network technology for DECSs. Unfortunately, the original specification of this technology lacks appropriate services to fulfil the most demanding requirements of industrial systems. Many Ethernet-based protocols and standards have therefore been proposed in recent years to deal with these limitations. In this work we survey solutions that have been proposed to achieve FT in Ethernet-based DECSs, considering faults both in their nodes and in their communication subsystem. Additionally, we discuss adaptive FT techniques that can be used to increase the flexibility of adaptive DECSs. Finally, we identify future trends and open challenges in building highly reliable DECSs.
DOI: 10.1109/JPROC.2019.2914589
Authors: Inés Álvarez Vadillo | Alberto Ballesteros | Manuel Alejandro Barranco González | David Gessner | Sinisa Derasevic | Julián Proenza Arenas
In Proceedings of the IEEE, vol. 107, no. 6, pp. 977-1010, June 2019.