#Number TR-PDS-1999-001
#Title Optimistic Recovery in Multi-threaded Distributed Systems
#Author Om P. Damani, Ashis Tarafdar, and Vijay K. Garg
#Abstract We address the problem of recovering distributed systems from crash failures of multi-threaded processes. Although recovery has been widely studied in the context of traditional non-threaded distributed systems, extending those solutions to the multi-threaded scenario presents new problems. We identify and address these problems for optimistic logging protocols. There are two natural extensions of optimistic logging protocols to the multi-threaded scenario. The first extension is {\em process-centric\/}, where the points of internal non-determinism caused by threads are logged. The second extension is {\em thread-centric\/}, where each thread is treated as a separate process. The process-centric approach suffers from false causality, while the thread-centric approach suffers from high causality tracking overhead. By observing that the granularity of failures can be different from the granularity of rollbacks, we design a new {\em balanced\/} approach that incurs low causality tracking overhead and also eliminates false causality.
#Bib
@techreport{YourBibLabelHere,
  author = "Om P. Damani and Ashis Tarafdar and Vijay K. Garg",
  title = "Optimistic Recovery in Multi-threaded Distributed Systems",
  number = "TR-PDS-1999-001",
  month = "January",
  year = 1999,
  note = "available via ftp or WWW at maple.ece.utexas.edu as technical report TR-PDS-1999-001"
}