#Number TR-PDS-1997-014
#Title Fault-Tolerant Distributed Simulation
#Authors Om P. Damani, Dept. of Computer Sciences, University of Texas at Austin; Vijay K. Garg, Dept. of Electrical and Computer Eng., University of Texas at Austin, Austin, TX, 78712, USA
#Abstract We present the first discussion of the issues involved in fault-tolerant distributed simulation. We integrate an existing optimistic fault-tolerance scheme with an existing optimistic distributed simulation scheme. We make use of the novel insight that a failure can be modeled as a straggler event whose receive time equals the virtual time of the last checkpoint saved on stable storage. This reduces both implementation effort and overhead. We define stable global virtual time (SGVT) as the virtual time such that no state with a lower timestamp will ever be rolled back. We make a simple change to existing GVT algorithms to compute SGVT. Our use of transitive dependency tracking eliminates antimessages. We group logical processes (LPs) into clusters to minimize stable storage access time.
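
The central idea of the abstract, that recovery from a failure can reuse the rollback path already needed for stragglers, can be illustrated with a minimal Python sketch. Everything here is a hypothetical toy and not the report's implementation: the class `LogicalProcess`, its fields, and the helper `stable_gvt` are invented for illustration. In particular, bounding SGVT by the minimum of an ordinary GVT estimate and each LP's last stable checkpoint time is an assumption about what the "simple change" might look like, and tie-breaking between events with equal timestamps is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Toy optimistic LP. All names are hypothetical, for illustration only."""
    lvt: int = 0                                  # local virtual time
    state: int = 0                                # toy state: running sum of payloads
    volatile: list = field(default_factory=list)  # in-memory (virtual time, state) snapshots
    stable: list = field(default_factory=list)    # snapshots that reached stable storage

    def __post_init__(self):
        # The initial state counts as a checkpoint at virtual time 0.
        self.volatile.append((0, 0))
        self.stable.append((0, 0))

    def rollback(self, t):
        # Discard snapshots later than t and restore the latest survivor.
        self.volatile = [s for s in self.volatile if s[0] <= t]
        self.lvt, self.state = self.volatile[-1]

    def process(self, receive_time, payload):
        # A straggler is an event whose receive time lies in the LP's past.
        if receive_time < self.lvt:
            self.rollback(receive_time)
        self.state += payload
        self.lvt = receive_time
        self.volatile.append((self.lvt, self.state))

    def checkpoint(self):
        # Flush the current snapshot history to (simulated) stable storage.
        self.stable = list(self.volatile)

    def fail_and_recover(self):
        # Volatile memory is lost; only stable snapshots survive the crash.
        self.volatile = list(self.stable)
        # Treat the failure as a straggler whose receive time is the
        # virtual time of the last stable checkpoint.
        self.rollback(self.stable[-1][0])


def stable_gvt(gvt_estimate, lps):
    # Assumed bound: no state below both GVT and every LP's last stable
    # checkpoint time can be rolled back, by a straggler or by a failure.
    return min([gvt_estimate] + [lp.stable[-1][0] for lp in lps])


lp = LogicalProcess()
lp.process(10, 1)
lp.process(20, 2)
lp.checkpoint()         # stable storage now holds the state at virtual time 20
lp.process(30, 4)       # this work exists only in volatile memory
lp.fail_and_recover()   # handled by the same rollback routine as a straggler
print(lp.lvt, lp.state)           # 20 3
print(stable_gvt(25, [lp]))       # 20
```

In this sketch the failure handler contains no recovery-specific logic beyond reloading the stable snapshots, which is the kind of saving in implementation effort the abstract describes.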