Debugging in a Distributed World: Observation and Control
Ashis Tarafdar and Vijay K. Garg.
Abstract
Debugging distributed programs is considerably more difficult than debugging sequential programs. We address issues in debugging distributed programs and present a general framework for observing and controlling a distributed computation, together with its applications to distributed debugging. Observing a distributed computation involves solving the predicate detection problem, and we present the main ideas behind efficient algorithms for it. Controlling a distributed computation involves solving the predicate control problem; predicate control may be used to restrict the behavior of a distributed program to suspicious executions. We also present an example of how predicate detection and predicate control can be used in practice to facilitate distributed debugging.