Paper Review of Distributed snapshots: determining global states of distributed systems
Paper Summary
This paper presents an algorithm to determinate a global state of a distributed system during the computation where the processors in the system do not share a common clock or memory by taking distributed snapshots. This global state can help to detect service stability by identifying errors like deadlock and computation termination.
Paper Details
This paper assumes that a distributed system is consisted of a series of processes and directed channels which can construct a DAG where processes are vertices and channels are edges. And each process can compute, send/receive messages and record state by itself. So the state on the process can be presented as a tuple \(e = <p, s, s', M ,c>\) where \(p\) is the process, \(s\) is the state of \(p\) before \(e\) and \(s'\) is the state of \(p\) after \(e\), \(M\) is the delivered message and \(c\) is the channel to communicate. And each channel is error-free, has a infinity sending and receiving buffer and messages is ordered.
And the minimized system is shown below.
So when a message is delivered, we have four status:
Message in \(p\)
Message in \(c\)
Message in \(c'\)
Message in \(q\)
The errors can happens every where but we cannot determinate where it happens, so the paper introduce an algorithm to snapshot the system.
Distributed Snapshots Algorithm
This paper introduce the following algorithm to take a snapshot of the whole system:
for Sender \(P\)
\(P\) keeps track of its state (computational result or system attributes) and sends a marker to each channel.
for Receiver \(Q\)
If \(Q\)'s state hasn't been recorded, record it and set channel \(c\)'s state to null.
If \(Q\)'s state has been recorded, then from the first message \(Q\) received to the marker, \(Q\) set channel \(c\)'s state messages.
Strong Points
This paper gives a appropriate definition of the distributed system model in terms of process, channels, events, and global state, as well as the interaction between them.
This paper uses basic sending and receiving principles under a no sharing clock and memory environment to solve the problem.
This paper's algorithm does not interrupt the normal execution of process.
Weak Points
This paper gives a limited networking environment where channels never fail and messages are ordered.
The algorithm's output can be regarded as a decent prediction rather than the "real" state.
Improvement
We may introduce a global clock and take more tight snapshots at a same globally time, which should result in a fine grain global state that is closer to the reality.
Subscribe to my newsletter
Read articles from Siyuan Zhang directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Siyuan Zhang
Siyuan Zhang
I am a senior from Shanghaitech U, a leader of GeekPie HPC team, looking for a master position in US. My interesting field consists of HPC, Mlsys, AI+HPC, and AI4Science. Currently, I am working on a AI training job scheduler in HPC, and a new NeRF in cloud rendering envirnment.