Abstract
We provide a novel epistemic logical language and semantics for the modeling and analysis of byzantine fault-tolerant multi-agent systems, with the intent of not only facilitating reasoning about the agents’ fault status but also supporting model updates for repair and state recovery. Besides the standard knowledge modalities, our logic provides additional agent-specific hope modalities capable of expressing that an agent is not faulty, and also dynamic modalities enabling change to the agents’ correctness status. These dynamic modalities are interpreted as model updates that come in three flavors: fully public, more private, and/or involving factual change. Tailored examples demonstrate the utility and flexibility of our logic for modeling a wide range of fault-detection, isolation, and recovery (FDIR) approaches in mission-critical distributed systems. By providing complete axiomatizations for all variants of our logic, we also create a foundation for building future verification tools for this important class of fault-tolerant applications.
K. Fruzsa—Was a PhD student in the FWF doctoral program LogiCS (W1255) and also supported by the FWF project DMAC [10.55776/P32431].
R. Kuznets—Funded by the FWF ByzDEL project [10.55776/P33600].
1 Introduction and Overview
State of the Art. A few years ago, the standard epistemic analysis of distributed systems via the runs-and-systems framework [13, 18, 28] was finally extended [22,23,24] to fault-tolerant systems with (fully) byzantine agents [25]. Byzantine agents constitute the worst-case scenario in terms of fault-tolerance: not only can they arbitrarily deviate from their respective protocols, but the perception of their own actions and observed events can be corrupted, possibly unbeknownst to them, resulting in false memories. Whether or not byzantine agents are actually present in a system, the mere possibility of their presence has drastic and debilitating effects on the epistemic state of all agents, including the correct (i.e., non-faulty) ones, due to the inability to rule out so-called brain-in-a-vat scenarios [29]: a brain-in-a-vat agent is a faulty agent with completely corrupted perceptions that provide no reliable information about the system [23]. In such a system, no agent can ever know certain elementary facts, such as their own or some other agent’s correctness, no matter whether the system is asynchronous [23] or synchronous [34]. Agents can, however, sometimes know their own faultiness or obtain belief in some other agents’ faultiness [33].
In light of knowledge \(K_i \varphi \) often being unachievable in systems with byzantine agents, [23] also introduced a weaker epistemic notion called hope. It was initially defined as \(H_i \varphi := correct_{i} \rightarrow K_i( correct_{i} \rightarrow \varphi )\), where the designated atom \( correct_{i} \) represents agent i’s correctness. In this setting, one can define belief as \(B_i \varphi :=K_i( correct_{i} \rightarrow \varphi )\) [33]. Hope was successfully used in [15] to analyze the Firing Rebels with Relay (FRR) problem, which is the core of the well-known consistent broadcasting primitive [36]. Consistent broadcasting has been used as a pivotal building block in fault-tolerant distributed algorithms, e.g., for byzantine fault-tolerant clock synchronization [10, 16, 31, 36, 39], synchronous consensus [37], and as a general reduction of distributed task solvability in systems with byzantine failures to solvability in systems with crash failures [26].
The hope modality was first axiomatized in [14] using \( correct_{i} \) as designated atoms. Whereas the resulting logic turned out to be well-suited for modeling and analyzing problems in byzantine fault-tolerant distributed computing systems like FRR [15], it is unfortunately not normal. Our long-term goal of also creating the foundations for automated verification of such applications hence suggested looking for an alternative axiomatization. In [6], we presented a normal modal logic that combines \(\textsf{KB4}_n\) hope modalities with \(\textsf{S5}_n\) knowledge modalities, which is based on defining \( correct_{i} :=\lnot H_i \bot \) via frame-characterizable axioms. This logic indeed unlocks powerful techniques developed for normal modal logics both in model checkers like DEMO [11] or MCK [17] and, in particular, in epistemic theorem proving environments such as LWB [20].
Still, both versions [6, 14] of the logic of hope target byzantine fault-tolerant distributed systems only where, once faulty, agents remain faulty and cannot be “repaired” to become correct again. Indeed, solutions for problems like FRR employ fault-masking techniques based on replication [35], which prevent the adverse effects of the faulty agents from contaminating the behavior of the correct agents but do not attempt to change the behavior of the faulty agents. Unfortunately, fault masking is only feasible if no more than a certain fraction f of the overall n agents in the system may become faulty (e.g., \(n \ge 3f+1\) in the case of FRR). Should it ever happen that more than f agents become faulty in a run, no properties can typically be guaranteed anymore, which would be devastating in mission-critical applications.
Fault-detection, isolation, and recovery (FDIR) is an alternative fault-tolerance technique, which attempts to discover and repair agents that became faulty in order to subsequently re-integrate them into the system. The primary target here are permanent faults, which do not go away “by themselves” after some time but rather require explicit corrective actions. Pioneering fault-tolerant systems implementations like MAFT [21] and GUARDS [30] combined fault-masking techniques like byzantine agreement [25] and FDIR approaches to harvest the best of both worlds.
Various paradigms have been proposed for implementing the steps in FDIR: Fault-detection can be done by a central FDIR unit, which is implemented in some very reliable technology and oversees the whole distributed system. Alternatively, distributed FDIR employs distributed diagnosis [38], e.g., based on evidence [1], and is typically combined with byzantine consensus [25] to ensure agreement among the replicated FDIR units. Agents diagnosed as faulty are subsequently forced to reset and execute built-in self-tests, possibly followed by repair actions like hardware reconfiguration. Viewed at a very abstract level, the FDI steps of FDIR thus cause a faulty agent to become correct again. Becoming correct again is, however, not enough to enable the agent to also participate in the (on-going) execution of the remaining system. The latter also requires a successful state recovery step R, which makes the local state of the agent consistent with the current global system state. Various recovery techniques have been proposed for this purpose, ranging from pro-active recovery [32], where the local state of every agent is periodically replaced by a majority-voted version, to techniques based on checkpointing & rollback or message-logging & replay, see [12] for a survey. The common aspect of all these techniques is that the local state of the recovering agent is changed based on information originating from other agents.
Our Contribution. In this paper, we provide the first logic that not only enables one to reason about the fault status of agents, but also provides mechanisms for updating the model so as to change the fault status of agents, as well as their local states. Instead of handling such dynamics in the byzantine extension of the runs-and-systems framework [22,23,24], i.e., in a temporal epistemic setting, we do it in a dynamic epistemic setting: we restrict our attention to the instants where the ultimate goal of (i) the FDI steps (successfully repairing a faulty processor) and (ii) the R step (recovering the repaired processor’s local state) is reached, and investigate the dynamics of the agents’ correctness/faultiness and its interaction with knowledge at these instants.
Our approach enables us to separate the issue of (1) verifying the correctness of the specification of an FDIR mechanism from the problem of (2) guaranteeing the correctness of its protocol implementation, and to focus on (1). Indeed, verifying the correctness of the implementation of some specification is the standard problem in formal verification, and powerful tools exist that can be used for this purpose. However, even a fully verified FDIR protocol would be completely useless if the FDIR specification was erroneous from the outset, in the sense that it does not correctly identify and hence repair faulty agents in some cases. Our novel logics and the underlying model update procedures provide, to the best of our knowledge, the first suitable foundations for (1), as they allow one to formally specify (1.a) when a model update shall happen, and (1.b) the result of the model update. While we cannot claim that no better approach exists, our various examples at least reveal that we can model many crucial situations arising in FDIR schemes.
In order to introduce the core features of our logic and its update mechanisms, we use a simple example: Consider two agents a and b, each knowing their own local states, where global state ij, with \(i,j \in \{0,1\}\), means that a’s local state is i and b’s local state is j. To describe agent a’s local state i we use an atomic proposition \(p_a\), where \(p_a\) is true if \(i=1\) in global state ij and \(p_a\) is false if \(i=0\), and similarly for b’s local state j and atomic proposition \(p_b\).
Knowledge and hope of the agents are represented in a Kripke model M for our system consisting of four states (worlds), shown in the left part of the figure above. Knowledge \(K_i\) is interpreted by a knowledge relation \(\mathcal {K}_i\) and hope \(H_i\) is interpreted by a hope relation \(\mathcal {H}_i\). Worlds that are \(\mathcal {K}_i\)-indistinguishable, in the sense that agent i cannot distinguish which of the worlds is the actual one, are connected by an i-labeled link, where we assume reflexivity, symmetry, and transitivity. Worlds ij that are in the non-empty part of the \(\mathcal {H}_i\) relation, where agent i is correct, have the digit for agent i outlined, as in \(\mathbb {0}\) or \(\mathbb {1}\). For example, in a world drawn with only b’s digit outlined, agent a is faulty and agent b is correct.
Now assume that we want agent a to become correct in states 01 and 11 where \(p_b\) is true. For example, this could be dictated by an FDIR mechanism that caused b to diagnose a as faulty. Changing the fault status of a accordingly (while not changing the correctness of b) results in the updated model on the right in the above figure. Note that a was correct in state 00 in the left model, but did not know this, whereas agent a knows that she is correct in state 00 after the update. Such a model update will be specified in our approach by a suitable hope update formula for every agent, which, in the above example, is \(\lnot H_a \bot \vee p_{b}\) for agent a and \(\lnot H_{b} \bot \) for agent b. Note carefully that every hope update formula implicitly specifies both (a) the situation in the original model in which a change of the hope relation is applied, namely, some agent i’s correctness/faultiness status encoded as \(\lnot H_i\bot /H_i \bot \), and (b) the result of the respective update of the hope relation.
Clearly, different FDIR approaches will require very different hope update formulas for describing their effects. In our logic, we provide two basic hope update mechanisms that can be used here: public updates, in which the agents are certain about the exact hope updates occurring at other agents, and private updates (strictly speaking, semi-private updates [5]), in which the agents may be uncertain about the particular hope updates occurring at other agents. The former is suitable for FDIR approaches where a central FDIR unit in the system triggers and coordinates all FDIR activities, the latter is needed for some distributed FDIR schemes.
Moreover, whereas the agents’ local states do not necessarily have to be changed when becoming correct, FDIR usually requires erasing traces of erroneous behavior from the history before recovery in the R step. Our logic hence provides an additional factual change mechanism for accomplishing this as well. For example, simultaneously with or after becoming correct, agents may also need to change their local state by making false the atomic proposition that records that step 134 of the protocol was (erroneously) executed. Analogously to hope update formulas, suitable factual change formulas are used to encode when and how atomic propositions will change. Besides syntax and semantics, we provide complete axiomatizations of all variants of our logic, and demonstrate its utility and flexibility for modeling a wide range of FDIR mechanisms by means of many application examples. In order to focus on the essentials, we use only 2-agent examples for highlighting particular challenges arising in FDIR. We note, however, that it is usually straightforward to generalize those for more than two agents, and to even combine them for modeling more realistic FDIR scenarios.
Summary of the Utility of Our Logic. Besides contributing novel model update mechanisms to the state-of-the-art in dynamic epistemic logic, the main utility of our logic is that it enables epistemic reasoning and verification of FDIR mechanism specifications. Indeed, even a fully verified protocol implementation of some FDIR mechanism would be meaningless if its specification allowed unintended effects. Our hope update/factual change formulas formally and exhaustively specify what the respective model update accomplishes, i.e., encode both the preconditions for changing some agent’s fault status/atomic propositions and the actual change. Given an initial model and these update formulas, our logic thus enables one to check (even automatically) whether the updated model has all the properties intended by the designer, whether certain state invariants are preserved by the update, etc. Needless to say, there are many reasons why a chosen specification might be wrong in this respect: the initial model might not provide all the required information, undesired fault status changes could be triggered in some worlds, or supporting information required for an agent to recover its local state might not be available. The ability to (automatically) verify the absence of such undesired effects of the specification of an FDIR mechanism is hence important in the design of mission-critical distributed systems.
Paper Organization. Section 2 recalls the syntax and semantics of the logic for knowledge and hope [6]. Section 3 expands this language with dynamic modalities for publicly changing hope. Section 4 generalizes the language to private updates. In Sect. 5, we add factual change to our setting. Some conclusions in Sect. 6 complete our paper.
2 A Logic of Hope and Knowledge
We succinctly present the logic of hope and knowledge [6]. Throughout our presentation, let \(\mathcal {A}:=\{1, \dots , n\}\) be a finite set of agents and let \(\textsf{Prop}\) be a non-empty countable set of atomic propositions.
Syntax. The language \(\mathcal {L}_{\textit{KH}}\) is defined by the following BNF:
$$\varphi \,{:}{:}{=}\, p \mid \lnot \varphi \mid (\varphi \wedge \varphi ) \mid K_i \varphi \mid H_i \varphi \qquad \qquad (1)$$
where \(p \in \textsf{Prop}\) and \(i \in \mathcal {A}\). We take \(\top \) to be the abbreviation for some fixed propositional tautology and \(\bot \) for \(\lnot \top \). We also use standard abbreviations for the remaining boolean connectives, \(\widehat{K}_i \varphi \) for the dual modality \(\lnot K_i\lnot \varphi \) for ‘agent i considers \(\varphi \) possible’, \(\widehat{H}_i \varphi \) for \(\lnot H_i \lnot \varphi \), and \(E_G \varphi \) for mutual knowledge \(\bigwedge _{i \in G} K_i \varphi \) in a group \(G \subseteq \mathcal {A}\). Finally, we define belief \(B_i \varphi \) as \(K_i (\lnot H_i \bot \rightarrow \varphi )\); we recall that \(\lnot H_i \bot \) means that i is correct.
Structures. A Kripke model is a tuple \(M=(W,\pi ,\mathcal {K},\mathcal {H})\) where W is a non-empty set of worlds (or states), \(\pi :\textsf{Prop}\rightarrow \mathcal {P}(W)\) is a valuation function mapping each atomic proposition to the set of worlds where it is true, and \(\mathcal {K}: \mathcal {A}\rightarrow \mathcal {P}(W \times W)\) and \(\mathcal {H}: \mathcal {A}\rightarrow \mathcal {P}(W \times W)\) are functions that assign to each agent i a knowledge relation \(\mathcal {K}_i \subseteq W \times W\) respectively a hope relation \(\mathcal {H}_i \subseteq W \times W\), where we have written \(\mathcal {K}_i\) resp. \(\mathcal {H}_i\) for \(\mathcal {K}(i)\) and \(\mathcal {H}(i)\). We write \(\mathcal {H}_i(w)\) for \(\{v \mid (w,v) \in \mathcal {H}_i\}\) and \(w\mathcal {H}_iv\) for \((w,v) \in \mathcal {H}_i\), and similarly for \(\mathcal {K}_i\). We require knowledge relations \(\mathcal {K}_i\) to be equivalence relations and hope relations \(\mathcal {H}_i\) to be shift-serial (that is, if \(w\mathcal {H}_iv\), then there exists a \(z\in W\) such that \(v\mathcal {H}_iz\)). In addition, the following conditions should also be satisfied for all \(i \in \mathcal {A}\) and all \(w,v \in W\):

- [\(\mathcal {H}\textrm{in}\mathcal {K}\)]: \(\mathcal {H}_i \subseteq \mathcal {K}_i\);

- [\(\textrm{one}\mathcal {H}\)]: if \(\mathcal {H}_i(w) \ne \varnothing \), \(\mathcal {H}_i(v) \ne \varnothing \), and \(w\mathcal {K}_iv\), then \(w\mathcal {H}_iv\).
It can be shown that all \(\mathcal {H}_i\) relations are so-called partial equivalence relations: they are transitive and symmetric binary relations [27].
The class of Kripke models \((W,\pi ,\mathcal {K},\mathcal {H})\) (given \(\mathcal {A}\) and \(\textsf{Prop}\)) is named \(\mathcal {K}\mathcal {H}\).
Semantics. We define truth for formulas \(\varphi \in \mathcal {L}_{\textit{KH}}\) at a world w of a model \(M = (W,\pi ,\mathcal {K},\mathcal {H}) \in \mathcal {K}\mathcal {H}\) in the standard way: in particular, \(M,w \models p\) iff \(w \in \pi (p)\) where \(p \in \textsf{Prop}\); boolean connectives are classical; \(M,w \models K_i \varphi \) iff \(M,v \models \varphi \) for all v such that \(w \mathcal {K}_i v\); and \(M,w \models H_i \varphi \) iff \(M,v \models \varphi \) for all v such that \(w \mathcal {H}_i v\). A formula \(\varphi \) is valid in model M, denoted \(M\models \varphi \), iff \(M,w \models \varphi \) for all \(w \in W\), and it is valid, notation \(\models \varphi \) (or \(\mathcal {K}\mathcal {H}\models \varphi \)) iff it is valid in all models \(M \in \mathcal {K}\mathcal {H}\).
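The truth definition above is directly executable on finite models. The following sketch is ours, not part of [6]: it encodes formulas as nested tuples and relations as sets of world pairs, and the sample model is our reconstruction of the four-world model from the introduction, assuming that a is correct only in world 00 and b is correct everywhere (an assumption consistent with, but not fully determined by, the textual description of the missing figure).

```python
# A minimal model checker for L_KH (ours, for illustration).  Formulas are
# nested tuples: ('p', name), ('not', f), ('or', f, g), ('K', i, f),
# ('H', i, f); the constant 'bot' is falsum.  A model is (W, val, K, H)
# with relations given as sets of world pairs.

def holds(model, w, f):
    W, val, K, H = model
    if f == 'bot':
        return False
    tag = f[0]
    if tag == 'p':
        return w in val[f[1]]
    if tag == 'not':
        return not holds(model, w, f[1])
    if tag == 'or':
        return holds(model, w, f[1]) or holds(model, w, f[2])
    if tag in ('K', 'H'):             # box modality over K_i resp. H_i
        R = (K if tag == 'K' else H)[f[1]]
        return all(holds(model, v, f[2]) for (u, v) in R if u == w)
    raise ValueError(f'unknown formula: {f!r}')

def correct(i):                       # correctness of i is ~H_i bot
    return ('not', ('H', i, 'bot'))

# Our reconstruction of the introduction's four-world model: K_i links
# worlds agreeing on i's local state; a is correct only in 00 (assumed),
# b is correct everywhere, so H_b coincides with K_b.
W = {'00', '01', '10', '11'}
val = {'pa': {'10', '11'}, 'pb': {'01', '11'}}
K = {'a': {(w, v) for w in W for v in W if w[0] == v[0]},
     'b': {(w, v) for w in W for v in W if w[1] == v[1]}}
H = {'a': {('00', '00')}, 'b': K['b']}
M = (W, val, K, H)
```

In this model, agent a is correct in 00 but, being unable to distinguish 00 from 01, does not know it, matching the discussion in the introduction.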
The axiom system \(\mathscr {K}\!\!\mathscr {H}\) for knowledge and hope is given below.
Theorem 1
([6]). \(\mathscr {K}\!\!\mathscr {H}\) is sound and complete with respect to \(\mathcal {K}\mathcal {H}\).
3 Public Hope Update
3.1 Syntax and Semantics
Definition 2
(Logical language). Language \(\mathcal {L}_{\textit{KH}}^{\textit{pub}}\) is obtained by adding the construct \([\underbrace{\varphi ,\dots ,\varphi }_n]\varphi \) to BNF (1).
We read a formula of the shape \([\varphi _1,\dots ,\varphi _n]\psi \), often abbreviated as \([\vec {\varphi }]\psi \), as follows: after revising or updating hope for agent i with respect to \(\varphi _i\) for all agents \(i \in \mathcal {A}\) simultaneously, \(\psi \) (is true). We call the formula \(\varphi _i\) the hope update formula for agent i.
Definition 3
(Semantics of public hope update). Let a tuple \(\vec {\varphi }\in (\mathcal {L}_{\textit{KH}}^{\textit{pub}})^n\), a model \(M = (W,\pi ,\mathcal {K}, \mathcal {H}) \in \mathcal {K}\mathcal {H}\), and a world \(w \in W\) be given. Then
$$M,w \models [\vec {\varphi }]\psi \quad \text {iff}\quad M^{\vec {\varphi }},w \models \psi ,$$
where \({M^{\vec {\varphi }}} :={(W,\pi ,\mathcal {K},\mathcal {H}^{\vec {\varphi }})}\) such that for each agent \(i \in \mathcal {A}\):
$$w\,(\mathcal {H}^{\vec {\varphi }})_i\,v \quad \text {iff}\quad w\mathcal {K}_iv \text { and } M,w \models \varphi _i \text { and } M,v \models \varphi _i,$$
and where we write \(\mathcal {H}^{\chi }_i\) for \((\mathcal {H}^{\vec {\varphi }})_i\) if the i-th formula in \(\vec {\varphi }\) is \(\chi \).
If \(M,w \not \models \chi \), then \(\mathcal {H}^{\chi }_{i}(w) = \varnothing \): agent i is faulty in state w after the update, i.e., \(H_i\bot \) is true. Whereas if \(M,w \models \chi \), then \(\mathcal {H}^{\chi }_{i}(w) \ne \varnothing \): agent i is correct in state w after the update, i.e., \(\lnot H_i\bot \) is true. If the hope update formula for agent i is \(\lnot H_i \bot \), then \(\lnot H_i \bot \) is true in the same states before and after the update. Therefore, \(\mathcal {H}^{\lnot H_i \bot }_{i}=\mathcal {H}_{i}\): the hope relation for i does not change. On the other hand, if the hope update formula for agent i is \(H_i \bot \), then \(\mathcal {H}^{H_i \bot }_{i}(w) = \varnothing \) iff \(\mathcal {H}_i(w) \ne \varnothing \): the correctness of agent i flips in every state. If we wish to model that agent i becomes more correct (in the model), then the hope update formula for agent i should have the shape \(\lnot H_i \bot \vee \varphi \): the left disjunct \(\lnot H_i \bot \) guarantees that in all states where i already was correct, she remains correct. We write \([\varphi ]_i\psi \) if the hope update formula for agent i is \(\varphi \) and every other agent \(j \ne i\) has the trivial hope update formula \(\lnot H_j \bot \).
Similarly, we write \([\varphi ]_G\psi \) if the hope update formula for every agent \(i\in G\) is \(\varphi \) and all other agents \(j \notin G\) have the trivial hope update formula \(\lnot H_j \bot \).
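The update of Definition 3 is easily prototyped: it leaves \(W\), \(\pi \), and \(\mathcal {K}\) untouched and rebuilds each \(\mathcal {H}_i\) from the \(\mathcal {K}_i\)-pairs whose endpoints both satisfied \(\varphi _i\) before the update. The self-contained sketch below is our illustration; it replays the introduction’s update \(\lnot H_a \bot \vee p_b\) on our reconstruction of the four-world model (a correct only in 00, b correct everywhere, both assumptions).

```python
# Public hope update (Definition 3) over finite models (ours, for
# illustration).  The update keeps W, val, and K, and rebuilds H_i from
# the K_i-pairs whose endpoints both satisfy i's hope update formula.

def holds(model, w, f):
    W, val, K, H = model
    if f == 'bot':
        return False
    tag = f[0]
    if tag == 'p':
        return w in val[f[1]]
    if tag == 'not':
        return not holds(model, w, f[1])
    if tag == 'or':
        return holds(model, w, f[1]) or holds(model, w, f[2])
    if tag in ('K', 'H'):
        R = (K if tag == 'K' else H)[f[1]]
        return all(holds(model, v, f[2]) for (u, v) in R if u == w)
    raise ValueError(f'unknown formula: {f!r}')

def update(model, upd):               # upd: agent -> hope update formula
    W, val, K, H = model
    newH = {i: {(w, v) for (w, v) in K[i]
                if holds(model, w, upd[i]) and holds(model, v, upd[i])}
            for i in K}
    return (W, val, K, newH)

def correct(i):
    return ('not', ('H', i, 'bot'))

# The introduction's update: make a correct wherever p_b is true while
# keeping already-correct agents correct (initial model as we reconstruct
# it: a correct only in 00, b correct everywhere).
W = {'00', '01', '10', '11'}
val = {'pa': {'10', '11'}, 'pb': {'01', '11'}}
K = {'a': {(w, v) for w in W for v in W if w[0] == v[0]},
     'b': {(w, v) for w in W for v in W if w[1] == v[1]}}
M = (W, val, K, {'a': {('00', '00')}, 'b': K['b']})
M2 = update(M, {'a': ('or', correct('a'), ('p', 'pb')), 'b': correct('b')})
```

After the update, a is correct in 00, 01, and 11 and now knows her correctness in 00; updating the result with the trivial formulas \(\lnot H_i \bot \) leaves the hope relations unchanged, as observed above.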
Proposition 4
If \(\vec {\varphi }\in (\mathcal {L}_{\textit{KH}}^{\textit{pub}})^n\) and \(M = (W,\pi ,\mathcal {K},\mathcal {H}) \in \mathcal {K}\mathcal {H}\) then \(M^{\vec {\varphi }} \in \mathcal {K}\mathcal {H}\).
Proof
Let \(i \in \mathcal {A}\) and \(\chi \) be the ith formula in \(\vec {\varphi }\). We need to show that relation \(\mathcal {H}^{\chi }_i\) is shift-serial and that it satisfies properties \(\mathcal {H}\textrm{in}\mathcal {K}\) and \(\textrm{one}\mathcal {H}\).
-
[shift-serial]: Let \(w \in W\). Assume \(v \in \mathcal {H}^{\chi }_i (w)\), that is, \(w \mathcal {K}_i v\), and \(M, w \models \chi \) and \(M, v \models \chi \). Now \(v \mathcal {K}_i w \) follows by symmetry of \(\mathcal {K}_i\). Therefore, \(\mathcal {H}^{\chi }_i (v) \ne \varnothing \) since \(w \in \mathcal {H}^{\chi }_i (v)\).
-
[\(\mathcal {H}\textrm{in}\mathcal {K}\)]: This follows by definition.
-
[\(\textrm{one}\mathcal {H}\)]: Let \(w, v \in W\). Assume that \(\mathcal {H}^{\chi }_i (w) \ne \varnothing \), that \(\mathcal {H}^{\chi }_i (v) \ne \varnothing \), and that \(w \mathcal {K}_i v\). It follows that there exists some \(w' \in \mathcal {H}^{\chi }_i (w)\), implying that \(M, w \models \chi \), and \(v' \in \mathcal {H}^{\chi }_i (v)\), implying that \(M, v \models \chi \). Now \(w \mathcal {H}^{\chi }_i v\) follows immediately. \(\square \)
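The frame conditions established in Proposition 4 can also be checked mechanically. The sketch below (ours) tests shift-seriality, \(\mathcal {H}\textrm{in}\mathcal {K}\), \(\textrm{one}\mathcal {H}\), and the partial-equivalence property from Sect. 2 on a concrete hope relation: the one for agent a that, by our hand computation under our reconstruction of the introduction’s example, results from the update there (a correct everywhere except world 10).

```python
# Mechanical checks of the frame conditions from Proposition 4 and the
# partial-equivalence property of Sect. 2 (our illustration).

def shift_serial(H):
    return all(any(u == v for (u, z) in H) for (w, v) in H)

def h_in_k(H, K):                     # the HinK condition
    return H <= K

def one_h(H, K):                      # the oneH condition
    dom = {w for (w, _) in H}         # worlds where the agent is correct
    return all((w, v) in H for (w, v) in K if w in dom and v in dom)

def partial_equivalence(H):           # symmetric and transitive
    sym = all((v, w) in H for (w, v) in H)
    trans = all((w, z) in H for (w, v) in H for (u, z) in H if u == v)
    return sym and trans

# Agent a's hope relation after the introduction's update, worked out by
# hand under our reconstruction: a is correct everywhere except world 10.
W = {'00', '01', '10', '11'}
K_a = {(w, v) for w in W for v in W if w[0] == v[0]}
H_a = {(w, v) for (w, v) in K_a if w != '10' and v != '10'}
```

Dropping the cross pairs between two correct \(\mathcal {K}_a\)-equivalent worlds would violate \(\textrm{one}\mathcal {H}\), which the last assertion below illustrates.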
The hope update \(\varphi \) for an agent a is reminiscent of the refinement semantics of public announcement \(\varphi \) [4]. However, unlike a public announcement, the hope update installs an entirely novel hope relation and discards the old one.
3.2 Applications
In this section, we apply the logical semantics just introduced to represent some typical scenarios that occur in FDIR applications. We provide several simple two-agent examples.
Example 5
(Correction based on agent b having diagnosed a as faulty). To correct agent a based on \(K_{b} H_{a} \bot \), we update agent a’s hope relation based on formula \(\lnot H_a \bot \vee K_{b} H_{a} \bot \) (and agent b’s hope relation based on formula \(\lnot H_{b} \bot \)). We recall that the disjunct \(\lnot H_a \bot \) guarantees that agent a will stay correct if he already was. The resulting model transformation is:
After the update, in states 00 where a was correct and 10 where a was faulty:
A straightforward generalization of this hope update is correction based on distributed fault detection, where all agents in some sufficiently large group G need to diagnose agent a as faulty. If G is fixed, \(\lnot H_a \bot \vee E_G H_{a} \bot \) achieves this goal. If any group G of at least \(k>1\) agents is eligible, then \(\lnot H_a \bot \vee \bigvee _{G\subseteq \mathcal {A}}^{|G|=k} E_G H_{a} \bot \) is the formula of choice.
Unfortunately, Example 5 cannot be applied in byzantine settings in general, since knowledge of other agents’ faults is usually not attainable [23]. Hence, one has to either resort to a weaker belief-based alternative or else to an important special case of Example 5, namely, self-correction, where \(G=\{a\}\), i.e., agent a diagnoses itself as faulty. This remains feasible in the byzantine setting because one’s own fault is among the few things an agent can know in such systems [23]. We illustrate this in Example 6.
Example 6
(Self-correction under constraints). Self-correction of agent a without constraints is carried out on the condition that a knows he is faulty (\(K_a H_a \bot \)). The hope update formula for self-correction of agent a with an optional additional constraint \(\varphi \) is
$$\lnot H_{a} \bot \vee (\varphi \wedge K_{a} H_a \bot ),$$
where the \(\lnot H_{a}\bot \) part corresponds to the worlds where agent a is already correct and the \(\varphi \wedge K_{a} H_a \bot \) part says that, if he knows that he is faulty (\(K_a H_a \bot \)), then he attempts to self-correct and succeeds if, additionally, a (possibly external) condition \(\varphi \) holds. Very similarly to Example 5 we now add an additional constraint \(\varphi =p_b\). Notice that the update is indeed slightly different from that in Example 5, as a no longer becomes correct in world 01.
After the update, we get in states 00 and 10 (where a was correct resp. faulty):
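Example 6 can be replayed concretely. The self-contained sketch below is our illustration: the initial model is our reconstruction of the introduction’s four-world model (a correct only in 00, b correct everywhere, both assumptions), and a’s hope update formula is the self-correction formula with constraint \(\varphi = p_b\).

```python
# Example 6 replayed (our illustration, on our reconstruction of the
# introduction's model): self-correction of agent a under the constraint
# phi = p_b, with hope update formula  ~H_a bot \/ (p_b /\ K_a H_a bot).

def holds(model, w, f):
    W, val, K, H = model
    if f == 'bot':
        return False
    tag = f[0]
    if tag == 'p':
        return w in val[f[1]]
    if tag == 'not':
        return not holds(model, w, f[1])
    if tag == 'and':
        return holds(model, w, f[1]) and holds(model, w, f[2])
    if tag == 'or':
        return holds(model, w, f[1]) or holds(model, w, f[2])
    if tag in ('K', 'H'):
        R = (K if tag == 'K' else H)[f[1]]
        return all(holds(model, v, f[2]) for (u, v) in R if u == w)
    raise ValueError(f'unknown formula: {f!r}')

def update(model, upd):
    W, val, K, H = model
    newH = {i: {(w, v) for (w, v) in K[i]
                if holds(model, w, upd[i]) and holds(model, v, upd[i])}
            for i in K}
    return (W, val, K, newH)

def correct(i):
    return ('not', ('H', i, 'bot'))

W = {'00', '01', '10', '11'}
val = {'pa': {'10', '11'}, 'pb': {'01', '11'}}
K = {'a': {(w, v) for w in W for v in W if w[0] == v[0]},
     'b': {(w, v) for w in W for v in W if w[1] == v[1]}}
M = (W, val, K, {'a': {('00', '00')}, 'b': K['b']})

phi_a = ('or', correct('a'),
         ('and', ('p', 'pb'), ('K', 'a', ('H', 'a', 'bot'))))
M2 = update(M, {'a': phi_a, 'b': correct('b')})
```

In world 11 agent a knows his fault (both worlds in his \(\mathcal {K}_a\)-class are faulty for a) and \(p_b\) holds, so a becomes correct there; in 01 he cannot know his fault, and in 10 the constraint \(p_b\) fails.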
Byzantine Agents. We now turn our attention to a different problem that needs to be solved in fault-tolerant distributed systems like MAFT [21] and GUARDS [30] that combine fault-masking approaches with FDIR. What is needed here is to monitor whether there are at most f faulty agents among the n agents in the system, and take countermeasures when the formula
$$ Byz _{\!f} \,:=\, \bigvee _{G \subseteq \mathcal {A},\, |G| = n-f} \,\bigwedge _{i \in G} \lnot H_i \bot $$
is in danger of getting violated or even is violated already. The most basic way to enforce the global condition \( Byz _{\!f}\) in a hope update is by a constraint on the hope update formulas, rather than by their actual shape. All that is needed here is to ensure, given hope update formulas \(\vec {\varphi } = (\varphi _1,\dots ,\varphi _n)\), that at least \(n-f\) of those are true, which can be expressed by the formula
$$\vec {\varphi }^{\,n-f} \,:=\, \bigvee _{G \subseteq \mathcal {A},\, |G| = n-f} \,\bigwedge _{i \in G} \varphi _i .$$
We now have the validity
$$\models \, \vec {\varphi }^{\,n-f} \rightarrow [\vec {\varphi }]\, Byz _{\!f}.$$
In particular, we also have the weaker \(\models Byz _{\!f}\wedge \vec {\varphi }^{\,n-f} \rightarrow [\vec {\varphi }] Byz _{\!f}\). In other words, \( M,w \models Byz _{\!f}\wedge \vec {\varphi }^{\,n-f}\) implies \(M^{\vec {\varphi }},w \models Byz _{\!f}\). We could also consider generalized schemas such as: \(M \models Byz _{\!f}\wedge \vec {\varphi }^{\,n-f}\) implies \(M^{\vec {\varphi }} \models Byz _{\!f}\). In all these cases, the initial assumption \( Byz _{\!f}\) is superfluous.
Such a condition is, of course, too abstract for practical purposes. What would be needed here are concrete hope update formulas by which we can update a model when \( Byz _{\!f}\) might become false resp. is false already, in which case it must cause the correction of sufficiently many agents to guarantee that \( Byz _{\!f}\) is still true resp. becomes true again after the update. Recall that belief \(B_i\psi \) is defined as \(K_i(\lnot H_i \bot \rightarrow \psi )\). If we define
$$B_{\ge f+1} \psi \,:=\, \bigvee _{G \subseteq \mathcal {A},\, |G| = f+1} \,\bigwedge _{i \in G} B_i \psi ,$$
it is easy to see by the pigeonhole principle that \( \models Byz _{\!f}\wedge B_{\ge {f+1}}\psi \rightarrow \psi . \) Using \(\psi = H_a\bot \) will hence result in one fewer faulty agent. To the formula \(B_{\ge {f+1}}H_a\bot \) we add a disjunct \(\lnot H_a \bot \) to ensure correct agents remain correct.
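The pigeonhole step can be made concrete: \(B_{\ge k}\psi \) is a disjunction over all k-element groups of agents, and whenever at most f agents are faulty while \(f+1\) agents all believe \(\psi \), the believers cannot all be faulty. The sketch below (ours; agent sets and function names are illustrative) checks exactly this counting argument, not the full modal validity.

```python
from itertools import combinations

# The counting core of  |= Byz_f /\ B_{>=f+1} psi -> psi  (our sketch):
# if at most f agents are faulty and some f+1 agents all believe psi,
# at least one believer is correct, and a correct agent's belief
# K_i(~H_i bot -> psi) yields psi at the actual world by factivity of K_i.

def b_at_least(agents, k, believers):
    """B_{>=k} psi holds iff some k-element group all believe psi."""
    return any(set(G) <= believers for G in combinations(sorted(agents), k))

def correct_believers(agents, f, believers, faulty):
    """Under Byz_f and B_{>=f+1} psi, the pigeonhole principle guarantees
    a non-empty set of correct believers."""
    assert len(faulty) <= f                     # Byz_f: at most f faulty
    assert b_at_least(agents, f + 1, believers)
    return believers - faulty
```

With \(\psi = H_a\bot \), any correct believer returned this way is one whose belief about a’s fault is reliable, which is what licenses the correction of a.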
3.3 Axiomatization
Axiomatization \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}}\) of the logical semantics for \(\mathcal {L}_{\textit{KH}}^{\textit{pub}}\) extends axiom system \(\mathscr {K}\!\!\mathscr {H}\) with axioms describing the interaction between hope updates and other logical connectives. The axiomatization is a straightforward reduction system, where the interesting interaction happens when a hope update binds a hope modality.
Definition 7
(Axiomatization \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}}\)). \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}}\) extends \(\mathscr {K}\!\!\mathscr {H}\) with the reduction axioms
$$[\vec {\varphi }]p \leftrightarrow p \qquad \qquad [\vec {\varphi }]\lnot \psi \leftrightarrow \lnot [\vec {\varphi }]\psi \qquad \qquad [\vec {\varphi }](\psi \wedge \xi ) \leftrightarrow ([\vec {\varphi }]\psi \wedge [\vec {\varphi }]\xi )$$
$$[\vec {\varphi }]K_i\psi \leftrightarrow K_i[\vec {\varphi }]\psi \qquad \qquad [\vec {\varphi }]H_i\psi \leftrightarrow \bigl (\varphi _i \rightarrow K_i(\varphi _i \rightarrow [\vec {\varphi }]\psi )\bigr ) \qquad \qquad [\vec {\varphi }][\vec {\chi }]\psi \leftrightarrow [\,[\vec {\varphi }]\chi _1,\dots ,[\vec {\varphi }]\chi _n]\psi $$
where \(\vec {\varphi }=(\varphi _1,\dots ,\varphi _n) \in (\mathcal {L}_{\textit{KH}}^{\textit{pub}})^{n}\), \(\vec {\chi }=(\chi _1,\dots ,\chi _n) \in (\mathcal {L}_{\textit{KH}}^{\textit{pub}})^{n}\), \(\psi , \xi \in \mathcal {L}_{\textit{KH}}^{\textit{pub}}\), \(p \in \textsf{Prop}\), and \(i \in \mathcal {A}\).
Theorem 8
(Soundness). For all \(\varphi \in \mathcal {L}_{\textit{KH}}^{\textit{pub}}\), \( \mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash \varphi \) implies \(\mathcal {K}\mathcal {H}\models \varphi \).
Proof
Out of all additional axioms, we only show the most interesting case of hope being updated: we show the validity of \({[\vec {\varphi }]} H_i \psi \leftrightarrow \bigl (\varphi _i \rightarrow K_i(\varphi _i \rightarrow [\vec {\varphi }] \psi )\bigr )\):
\(M, w \models [\vec {\varphi }] H_i \psi \) iff
\(M^{\vec {\varphi }}, w \models H_i \psi \) iff
\(\bigl (\forall v \in \mathcal {H}^{\varphi _i}_i (w)\bigr )\,\, M^{\vec {\varphi }}, v \models \psi \) iff
\( (\forall v \in W) \bigl (v \in \mathcal {K}_i (w) \,\, \& \,\, M, w \models \varphi _i \,\, \& \,\, M, v \models \varphi _i \quad \Longrightarrow \quad M^{\vec {\varphi }}, v \models \psi \bigr )\) iff
\( M, w \models \varphi _i \quad \Longrightarrow \quad (\forall v \in W) \bigl (v \in \mathcal {K}_i (w) \,\, \& \,\, M, v \models \varphi _i \quad \Longrightarrow \quad M^{\vec {\varphi }}, v \models \psi \bigr )\) iff
\(M, w \models \varphi _i \quad \Longrightarrow \quad \bigl (\forall v \in \mathcal {K}_i (w)\bigr ) ( M, v \models \varphi _i \quad \Longrightarrow \quad M^{\vec {\varphi }}, v \models \psi )\) iff
\(M, w \models \varphi _i\quad \Longrightarrow \quad \bigl (\forall v \in \mathcal {K}_i (w)\bigr )(M, v \models \varphi _i\quad \Longrightarrow \quad M, v \models [\vec {\varphi }] \psi )\) iff
\(M, w \models \varphi _i\quad \Longrightarrow \quad \bigl (\forall v \in \mathcal {K}_i (w)\bigr )\,\,M, v \models \varphi _i \rightarrow [\vec {\varphi }] \psi \) iff
\(M, w \models \varphi _i\quad \Longrightarrow \quad M, w \models K_i(\varphi _i \rightarrow [\vec {\varphi }] \psi )\) iff
\(M, w \models \varphi _i \rightarrow K_i(\varphi _i \rightarrow [\vec {\varphi }] \psi ).\) \(\square \)
Every formula in \(\mathcal {L}_{\textit{KH}}^{\textit{pub}}\) is provably equivalent to a formula in \(\mathcal {L}_{\textit{KH}}\) (Lemma 13). To prove this, we first define the weight or complexity of a given formula (Definition 9) and show a number of inequalities comparing the left-hand side to the right-hand side of the reduction axioms in axiomatization \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}}\) (Lemma 10). Subsequently, we define a translation from \(\mathcal {L}_{\textit{KH}}^{\textit{pub}}\) to \(\mathcal {L}_{\textit{KH}}\) (Definition 11) and finally show that the translation is a terminating rewrite procedure (Proposition 12).
Definition 9
The complexity \(c: \mathcal {L}_{\textit{KH}}^{\textit{pub}} \rightarrow \mathbb {N}\) of \(\mathcal {L}_{\textit{KH}}^{pub}\)-formulas is defined recursively, where \(p \in \textsf{Prop}\), \(i \in \mathcal {A}\), and \(c(\vec {\varphi }) :=\max \{ c(\varphi _i) \mid 1 \le i \le n \}\):
Lemma 10
For each axiom \(\theta _l \leftrightarrow \theta _r\) from Definition 7, \(c(\theta _l) > c(\theta _r)\).
Definition 11
The translation \(t: \mathcal {L}_{\textit{KH}}^{pub} \rightarrow \mathcal {L}_{\textit{KH}}\) is defined recursively, where \(p \in \textsf{Prop}\), \(i \in \mathcal {A}\), and the i-th formula of \(\vec {\varphi }\) is \(\varphi _i\):
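The paper’s table of translation clauses is not reproduced in this extraction, so the following sketch of t is our reconstruction: atoms, booleans, and \(K_i\) let the update pass through (the update changes neither \(W\), \(\pi \), nor \(\mathcal {K}\)), the \(H_i\) clause is the axiom established in the proof of Theorem 8, and nested updates are reduced inner-first. Booleans other than \(\lnot \) and \(\rightarrow \) are treated as abbreviations.

```python
# Our reconstruction of the translation t as a rewrite procedure.  Formulas
# are tuples: ('p', name), 'bot', ('not', f), ('imp', f, g), ('K', i, f),
# ('H', i, f), and ('upd', upd, f) for [upd]f, where upd maps each agent to
# its hope update formula.  Termination is Proposition 12: the paper's
# complexity measure decreases with every rewriting step.

def t(f):
    if f == 'bot' or f[0] == 'p':
        return f
    tag = f[0]
    if tag == 'not':
        return ('not', t(f[1]))
    if tag == 'imp':
        return ('imp', t(f[1]), t(f[2]))
    if tag in ('K', 'H'):
        return (tag, f[1], t(f[2]))
    if tag == 'upd':
        return t(push(f[1], f[2]))    # apply one reduction axiom, repeat
    raise ValueError(f'unknown formula: {f!r}')

def push(upd, psi):                   # one reduction step for [upd]psi
    if psi == 'bot' or psi[0] == 'p':
        return psi                    # atoms are untouched by the update
    tag = psi[0]
    if tag == 'not':
        return ('not', ('upd', upd, psi[1]))
    if tag == 'imp':
        return ('imp', ('upd', upd, psi[1]), ('upd', upd, psi[2]))
    if tag == 'K':                    # the update leaves K_i unchanged
        return ('K', psi[1], ('upd', upd, psi[2]))
    if tag == 'H':                    # the axiom proved in Theorem 8
        i, phi_i = psi[1], upd[psi[1]]
        return ('imp', phi_i, ('K', i, ('imp', phi_i, ('upd', upd, psi[2]))))
    if tag == 'upd':                  # reduce the inner update first
        return ('upd', upd, push(psi[1], psi[2]))
    raise ValueError(f'unknown formula: {psi!r}')

def update_free(f):                   # does the result lie in L_KH?
    if isinstance(f, tuple):
        return f[0] != 'upd' and all(update_free(x) for x in f)
    return True
```

Running t on \([\vec {\varphi }]H_i\bot \) yields exactly the right-hand side of the hope reduction axiom, and every output is free of update modalities, which is Proposition 12 in executable form.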
Proposition 12
(Termination). For all \(\varphi \in \mathcal {L}_{\textit{KH}}^{\textit{pub}}\), \(t(\varphi ) \in \mathcal {L}_{\textit{KH}}\).
Proof
This follows by induction on \(c(\varphi )\). \(\square \)
Lemma 13
(Equiexpressivity). Language \(\mathcal {L}_{\textit{KH}}^{\textit{pub}}\) is equiexpressive with \(\mathcal {L}_{\textit{KH}}\).
Proof
It follows by induction on \(c(\varphi )\) that \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash \varphi \leftrightarrow t(\varphi )\) for all \(\varphi \in \mathcal {L}_{\textit{KH}}^{\textit{pub}}\), where, by Proposition 12, \(t(\varphi ) \in \mathcal {L}_{\textit{KH}}\). \(\square \)
Theorem 14
(Soundness and completeness). For all \(\varphi \in \mathcal {L}_{\textit{KH}}^{\textit{pub}}\), \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash \varphi \) iff \(\mathcal {K}\mathcal {H}\models \varphi \).
Proof
Soundness was proven in Theorem 8. To prove completeness, assume \(\mathcal {K}\mathcal {H}\models \varphi \). According to Lemma 13, we have \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash \varphi \leftrightarrow t(\varphi )\). Therefore, by Theorem 8, \(\mathcal {K}\mathcal {H}\models \varphi \leftrightarrow t(\varphi )\) follows. Since \(\mathcal {K}\mathcal {H}\models \varphi \) (by assumption), we obtain \(\mathcal {K}\mathcal {H}\models t(\varphi )\). By applying Theorem 1, \(\mathscr {K}\!\!\mathscr {H}\vdash t(\varphi )\) further follows. Consequently, \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash t(\varphi )\). Finally, since \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash \varphi \leftrightarrow t(\varphi )\), \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash \varphi \). \(\square \)
Corollary 15
(Necessitation for public hope updates). Necessitation for public hope updates is an admissible inference rule in \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}}\): from \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash \psi \) infer \(\mathscr {K}\!\!\mathscr {H}^{\textit{pub}} \vdash [\vec {\varphi }]\psi \).
4 Private Hope Update
In the case of the public hope update mechanism introduced in Sect. 3, after the update there is no uncertainty about what happened. In some distributed FDIR schemes, however, including self-correction, the hope update at an agent occurs in a less public way. To increase the application coverage of our logic, we therefore provide the alternative of private hope updates. For that, we use structures inspired by action models. Strictly speaking, such updates are known as semi-private (or semi-public) updates, as the agents are aware of their uncertainty and know what they are uncertain about, whereas in a fully private update the agent does not know that the action took place [5] and may, in fact, believe that nothing happened. The resulting language can be viewed as a generalization of \(\mathcal {L}_{\textit{KH}}^{pub}\), where the latter now becomes a special case.
4.1 Syntax and Semantics
Definition 16
(Hope update model). A hope update model for a logical language \(\mathcal {L}\) is a tuple \( U = (E,\vartheta ,\mathcal {K}^{U}), \) where E is a non-empty set of actions, \(\vartheta : E \rightarrow (\mathcal {A}\rightarrow \mathcal {L})\) is a hope update function, and \(\mathcal {K}^U: \mathcal {A}\rightarrow \mathcal {P}(E \times E)\) such that all \(\mathcal {K}^{U}_i\) are equivalence relations. For \(\vartheta (e)(i)\) we write \(\vartheta _i(e)\). As before, formulas \(\vartheta _i(e)\in \mathcal {L}\) are hope update formulas. A pointed hope update model is a pair (U, e) where \(e \in E\).
Definition 17
(Language \(\mathcal {L}_{\textit{KH}}^{priv}\)). We obtain language \(\mathcal {L}_{\textit{KH}}^{priv}\) by adding the construct \([U,e] \varphi \) to BNF (1), where (U, e) is a pointed hope update model.
Definition 17 is given by mutual recursion as usual: all hope update models U are for the language \(\mathcal {L}_{\textit{KH}}^{priv}\).
Definition 18
(Semantics of private hope update). Let \(U = (E,\vartheta ,\mathcal {K}^U)\) be a hope update model, \(M = (W,\pi ,\mathcal {K},\mathcal {H}) \in \mathcal {K}\mathcal {H}\), \(w \in W\), and \(e \in E\). Then:
where \(M \times U = (W^\times ,\pi ^\times ,\mathcal {K}^\times , \mathcal {H}^\times )\) is such that:
Public hope updates can be viewed as singleton hope update models. Given formulas \(\vec {\varphi } \in (\mathcal {L}_{\textit{KH}}^{pub})^n\), define \( \textit{pub}:=(\{e\}, \vartheta , \mathcal {K}^\textit{pub})\), where \(\vartheta _i(e):=\varphi _i\) and \(\mathcal {K}^\textit{pub}:=\{(e,e)\}\).
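To make the product construction concrete, here is a small Python sketch of \(M \times U\). It assumes the standard Kripke reading of \(K_i\) and \(H_i\) as universal modalities over the respective relations, and its clauses for \(\mathcal {K}^\times \) and \(\mathcal {H}^\times \) follow the proof of Proposition 19 below; all function and variable names are ours:

```python
# Formulas as nested tuples: ('bot',), (p,), ('not', f), ('and', f, g),
# ('K', i, f), ('H', i, f). A model is (W, pi, K, H) with pi mapping
# atoms to sets of worlds and K[i], H[i] sets of world pairs.

def sat(M, w, f):
    """Truth of f at world w, reading K_i and H_i as universal
    modalities over the respective relations (standard Kripke clauses)."""
    W, pi, K, H = M
    op = f[0]
    if op == 'bot':
        return False
    if op == 'not':
        return not sat(M, w, f[1])
    if op == 'and':
        return sat(M, w, f[1]) and sat(M, w, f[2])
    if op in ('K', 'H'):
        rel = K if op == 'K' else H
        return all(sat(M, v, f[2]) for (u, v) in rel[f[1]] if u == w)
    return w in pi[op]                      # atomic proposition

def product_update(M, U, agents):
    """M x U for a hope update model U = (E, theta, KU): worlds are
    pairs, the valuation is inherited from M, K is the componentwise
    product, and (w,e) H_i (v,f) iff (w,e) K_i (v,f) and the hope
    update formulas theta[e][i], theta[f][i] hold at w resp. v."""
    W, pi, K, H = M
    E, theta, KU = U
    Wx = [(w, e) for w in W for e in E]
    pix = {p: {(w, e) for (w, e) in Wx if w in ws} for p, ws in pi.items()}
    Kx = {i: {((w, e), (v, f)) for (w, e) in Wx for (v, f) in Wx
              if (w, v) in K[i] and (e, f) in KU[i]} for i in agents}
    Hx = {i: {(we, vf) for (we, vf) in Kx[i]
              if sat(M, we[0], theta[we[1]][i])
              and sat(M, vf[0], theta[vf[1]][i])} for i in agents}
    return (Wx, pix, Kx, Hx)
```

In this representation, the public update \([\vec {\varphi }]\) is literally the singleton case \(U = ([e], \{e: \{i: \varphi _i\}\}, \{i: \{(e,e)\}\})\).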
Difference with Action Models. Although our hope update models look like action models, they are not really action models in the sense of [2]. Our actions do not have executability preconditions, so the updated model is not a restricted modal product but the full product. Another difference is that, by analogy with Kripke models for knowledge and hope, one might have expected a hope relation in the update models; there is none in our approach.
Proposition 19
\(M \times U \in \mathcal {K}\mathcal {H}\) for any hope update model U and \(M\in \mathcal {K}\mathcal {H}\).
Proof
The proof is somewhat similar to that of Proposition 4. It is obvious that all \(\mathcal {K}^\times _i\) are equivalence relations. Let us now show that, for all \(i \in \mathcal {A}\), the relations \(\mathcal {H}^{\times }_i\) are shift-serial and satisfy the properties \(\mathcal {H}\textrm{in}\mathcal {K}\) and \(\textrm{one}\mathcal {H}\).
-
\(\mathcal {H}^\times _i\) is shift-serial: Let \((w,e) \in W^\times \). Assume \((w,e) \mathcal {H}^\times _i (v,f)\), that is, \((w,e) \mathcal {K}^\times _i (v,f)\), \(M,w \models \vartheta _i(e)\), and \(M,v \models \vartheta _i(f)\). Then \((v,f) \mathcal {K}^\times _i (w,e)\) follows by symmetry of \(\mathcal {K}^\times _i\). Therefore, \(\mathcal {H}^\times _i \bigl ((v,f)\bigr ) \ne \varnothing \) since \((w,e) \in \mathcal {H}^\times _i \bigl ((v,f)\bigr )\).
-
\(\mathcal {H}^\times _i\) satisfies \(\mathcal {H}\textrm{in}\mathcal {K}\): This follows by definition.
-
\(\mathcal {H}^\times _i\) satisfies \(\textrm{one}\mathcal {H}\): Let \((w,e), (v,f) \in W^\times \). Assume that \(\mathcal {H}^\times _i \bigl ((w,e)\bigr ) \ne \varnothing \), \(\mathcal {H}^\times _i \bigl ((v,f)\bigr ) \ne \varnothing \), and \((w,e) \mathcal {K}^\times _i (v,f)\). As \(\mathcal {H}^\times _i \bigl ((w,e)\bigr ) \ne \varnothing \), \(M,w \models \vartheta _i(e)\). As \(\mathcal {H}^\times _i \bigl ((v,f)\bigr ) \ne \varnothing \), \(M,v \models \vartheta _i(f)\). Therefore, \((w,e) \mathcal {H}^\times _i (v,f)\). \(\square \)
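The three frame conditions used in this proof can also be checked mechanically on finite relations. The following Python helpers (names ours) are a direct transcription of shift-seriality, \(\mathcal {H}\textrm{in}\mathcal {K}\), and \(\textrm{one}\mathcal {H}\) for a single agent, with relations given as sets of world pairs:

```python
def shift_serial(H, W):
    """Shift-seriality: w H v implies that H(v) is non-empty."""
    succ = {w: {v for (u, v) in H if u == w} for w in W}
    return all(succ[v] for (w, v) in H)

def h_in_k(H, K):
    """HinK: every hope edge is also a knowledge edge."""
    return H <= K

def one_h(H, K, W):
    """oneH: if H(w) and H(v) are non-empty and w K v, then w H v."""
    succ = {w: {v for (u, v) in H if u == w} for w in W}
    return all((w, v) in H for (w, v) in K if succ[w] and succ[v])
```

Such helpers make it straightforward to unit-test that a concretely constructed product \(M \times U\) indeed lands in \(\mathcal {K}\mathcal {H}\), as Proposition 19 guarantees.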
Definition 20
Let \(U = (E,\vartheta ,\mathcal {K}^{U})\) and \(U' = (E',\vartheta ',\mathcal {K}^{U'})\) be hope update models. The composition \((U ; U')\) is \((E'',\vartheta '',\mathcal {K}^{U ; U'})\) such that:
Since \(\mathcal {K}^{U}_i\) and \(\mathcal {K}^{U'}_i\) are equivalence relations, \(\mathcal {K}^{U ; U'}_i\) is also an equivalence relation, so that \((U ; U')\) is a hope update model.
4.2 Applications
Arguably the most important use of private updates in distributed FDIR is to express the agents' uncertainty about whether an update affects other agents.
Example 21
(Private correction). We reconsider the example from Sect. 1, only this time we privately correct agent a based on \(p_b\) such that agent b is uncertain whether the hope update happens. This can be modeled by two hope update formulas for agent a: \(\lnot H_a \bot \vee p_b\) and \(\lnot H_a \bot \). With \(\lnot H_a \bot \vee p_b\) we associate an event \(c_{p_b}\) where the correction takes place based on the additional constraint \(p_b\), and with \(\lnot H_a \bot \) we associate an event noc where correction does not take place. Writing \(\vartheta (e) = \bigl (\vartheta _a(e),\vartheta _b(e)\bigr )\), we get \(U:=(E, \vartheta , \mathcal {K}^{U})\), where:
When naming worlds, we have abstracted away from the event being executed in a world. Having the same name therefore does not mean being the same world. For example, the world \(\mathbb {11}\) at the front of the cube ‘really’ is the pair \((11,c_{p_b})\) with \(H_a\bigl ((11,c_{p_b})\bigr ) \ne \varnothing \) and \(H_b\bigl ((11,c_{p_b})\bigr ) \ne \varnothing \). We now have for example that:
4.3 Axiomatization
Definition 22
(Axiomatization \(\mathscr {K}\!\!\mathscr {H}^\textit{priv}\)). \(\mathscr {K}\!\!\mathscr {H}^\textit{priv}\) consists of \(\mathscr {K}\!\!\mathscr {H}\) and
Theorem 23
(Soundness). For all \(\varphi \in \mathcal {L}_{\textit{KH}}^\textit{priv}\), \(\mathscr {K}\!\!\mathscr {H}^{\textit{priv}} \vdash \varphi \) implies \(\mathcal {K}\mathcal {H}\models \varphi \).
Similarly to the previous section, one can show that every formula in \(\mathcal {L}_{\textit{KH}}^\textit{priv}\) is provably equivalent to a formula in \(\mathcal {L}_{\textit{KH}}\): by defining the complexity of \(\mathcal {L}_{\textit{KH}}^\textit{priv}\)-formulas, showing complexity inequalities for the reduction axioms of axiomatization \(\mathscr {K}\!\!\mathscr {H}^{\textit{priv}}\), defining a translation \(t\) from \(\mathcal {L}_{\textit{KH}}^\textit{priv}\) to \(\mathcal {L}_{\textit{KH}}\), and observing that this translation is a terminating rewrite procedure. We thus obtain:
Proposition 24
(Termination). For all \(\varphi \in \mathcal {L}_{\textit{KH}}^\textit{priv}\), \(t(\varphi ) \in \mathcal {L}_{\textit{KH}}\).
Lemma 25
(Equiexpressivity). Language \(\mathcal {L}_{\textit{KH}}^\textit{priv}\) is equiexpressive with \(\mathcal {L}_{\textit{KH}}\), i.e., for all \(\varphi \in \mathcal {L}_{\textit{KH}}^\textit{priv}\), \(\mathscr {K}\!\!\mathscr {H}^{\textit{priv}} \vdash \varphi \leftrightarrow t(\varphi )\).
Theorem 26
(Soundness and completeness). For all \(\varphi \in \mathcal {L}_{\textit{KH}}^\textit{priv}\), \(\mathscr {K}\!\!\mathscr {H}^{\textit{priv}} \vdash \varphi \) iff \(\mathcal {K}\mathcal {H}\models \varphi \).
Necessitation for private hope update is an admissible inference rule in \(\mathscr {K}\!\!\mathscr {H}^{\textit{priv}}\).
5 Factual Change
In this section, we provide a way to add factual change to our model updates, following well-trodden paths in dynamic epistemic logic [3, 8, 9].
5.1 Syntax, Semantics, and Axiomatization
Definition 27
(Hope update model with factual change). To obtain a hope update model with factual change \(U = (E,\vartheta ,\sigma ,\mathcal {K}^U)\) from a hope update model \((E,\vartheta ,\mathcal {K}^U)\) for a language \(\mathcal {L}\) we add parameter \(\sigma : E \rightarrow (\textsf{Prop}\rightarrow \mathcal {L})\). We require that each \(\sigma (e)\) is only finitely different from the identity function.
The finitary requirement is needed in order to keep the language well-defined. In this section, by hope update models we mean hope update models with factual change.
Definition 28
(Language \(\mathcal {L}_{\textit{KH}}^f\)). Language \(\mathcal {L}_{\textit{KH}}^f\) is obtained by adding the construct \([U,e] \varphi \) to the BNF of the language \(\mathcal {L}_{\textit{KH}}\), where (U, e) is a pointed hope update model with factual change for the language \(\mathcal {L}_{\textit{KH}}^f\).
As in the previous section, Definition 28 is given by mutual recursion and from here on all hope update models are for language \(\mathcal {L}_{\textit{KH}}^f\).
Definition 29
(Semantics). Let \(U = (E,\vartheta ,\sigma ,\mathcal {K}^U)\), \(M = (W,\pi ,\mathcal {K}, \mathcal {H}) \in \mathcal {K}\mathcal {H}\), \(w \in W\), and \(e \in E\). Then, as in Definition 18, \(M,w \models [U,e]\varphi \) iff \(M \times U, (w,e) \models \varphi \), only now \(M \times U = (W^\times ,\pi ^\times ,\mathcal {K}^\times , \mathcal {H}^\times )\) is such that:
The only difference between Definitions 18 and 29 is the clause for the valuation, which in the former reads: \((w,e) \in \pi ^\times (p)\) iff \(w \in \pi (p)\). In other words, there the valuation of facts does not change, and the valuation in the world w is carried forward to the updated worlds (w, e). It is easy to see that \(\mathcal {K}\mathcal {H}\models [U,e]p \leftrightarrow \sigma (e)(p)\), as we immediately obtain: \(M, w \models [U,e]p\) iff \(M \times U, (w,e) \models p\) iff \((w,e) \in \pi ^\times (p)\) iff \(M,w \models \sigma (e)(p)\). This turns out to be the only difference in the axiomatization as well:
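The modified valuation clause can be sketched directly. The following Python fragment (names ours, with a deliberately minimal formula evaluator restricted to atoms and Boolean connectives) computes \(\pi ^\times \) from \(\sigma \), treating \(\sigma (e)\) as the identity on all atoms it does not explicitly reassign:

```python
def sat(M, w, f):
    """Minimal evaluator (atoms, 'not', 'and') -- just enough to
    evaluate substitution formulas; modalities are omitted here."""
    pi = M[1]
    if f[0] == 'not':
        return not sat(M, w, f[1])
    if f[0] == 'and':
        return sat(M, w, f[1]) and sat(M, w, f[2])
    return w in pi[f[0]]                    # atomic proposition

def updated_valuation(M, E, sigma, props):
    """pi^x of Definition 29: (w,e) is in pi^x(p) iff M,w |= sigma(e)(p),
    where sigma(e) defaults to the identity on atoms it does not mention
    (the 'finitely different from the identity' requirement)."""
    W = M[0]
    return {p: {(w, e) for w in W for e in E
                if sat(M, w, sigma.get(e, {}).get(p, (p,)))}
            for p in props}
```

For instance, an event that flips atom q would carry \(\sigma (e)(q) = \lnot q\), while a second event leaving q untouched simply omits it from \(\sigma \).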
Definition 30
(Axiomatization \(\mathscr {K}\!\!\mathscr {H}^{f}\)). Axiom system \(\mathscr {K}\!\!\mathscr {H}^{f}\) is obtained from \(\mathscr {K}\!\!\mathscr {H}^\textit{priv}\) by replacing the first equivalence in Definition 22 with \([U,e]p \leftrightarrow \sigma (e)(p)\).
Theorem 31
(Soundness). For all \(\varphi \in \mathcal {L}_{\textit{KH}}^f\), \(\mathscr {K}\!\!\mathscr {H}^f \vdash \varphi \) implies \(\mathcal {K}\mathcal {H}\models \varphi \).
It is quite remarkable in itself that the required changes are fairly minimal, given the enormously enhanced flexibility in specifying distributed system behavior. With techniques quite similar to those employed for the hope update logic without factual change, we can also obtain completeness for the hope update logic with factual change. Lack of space did not allow us to include many of the details; the interested reader is referred to the extended version [7] of this paper.
Lemma 32
(Equiexpressivity). Language \(\mathcal {L}_{\textit{KH}}^f\) is equiexpressive with \(\mathcal {L}_{\textit{KH}}\).
Theorem 33
(Soundness and completeness). For all \(\varphi \in \mathcal {L}_{\textit{KH}}^f\), \(\mathscr {K}\!\!\mathscr {H}^{f} \vdash \varphi \) iff \(\mathcal {K}\mathcal {H}\models \varphi \).
5.2 Applications
The importance of adding factual change to our framework comes from the fact that, in practical protocols implementing FDIR mechanisms, agents usually make decisions based on what they have recorded in their local states. We demonstrate the essentials of combined hope updates and state recovery in Example 34, which combines the variant of self-correction introduced in Example 6 with the state recovery needs that would arise in the alternating bit protocol [19].
Example 34
(Private self-correction with state recovery). The alternating bit protocol (ABP) for transmitting an arbitrarily generated stream of consecutive data packets from a sender to a receiver over an unreliable communication channel uses messages that additionally contain a sequence number consisting of 1 bit only. This bit switches from one message to the next, via the alternating atomic propositions \(q_s\) and \(q_r\) containing the next sequence number to be used for the next message generated by the sender and receiver side of the channel, respectively. In addition, the ABP maintains atomic propositions \(p_s\) and \(p_r\) holding the last sequence number used by the sender and receiver side, respectively. In more detail, the sending of data packet \(d_n\), starting from \((q_s,q_r)=(0,0)\) and \((p_s,p_r)=(1,1)\), is completed in three phases [19]: (i) if \(q_s\ne p_s\), sender s sets \(p_s :=q_s=0\) and generates a message \((d_n,p_s)\) to be repeatedly sent; (ii) when receiver r receives \((d_n,q_r)\) (with \(q_r=0\) here), it records \(d_n\), sets \(p_r :=q_r=0\), generates a message \((ack,p_r)\) to be repeatedly sent back to s, and switches to the next sequence number \(q_r:=1\); (iii) if sender s receives \((ack,p_s)\) (with \(p_s=0\) here), it switches to the next sequence number \(q_s:=\lnot p_s=1\). Note that the next sequence numbers \((q_s,q_r)\) have moved from (0, 0) via (0, 1) to (1, 1), whereas the last sequence numbers \((p_s,p_r)\) have moved from (1, 1) via (0, 1) to (0, 0). From here, the above phases are simply repeated (with all sequence numbers flipped) for sending \(d_{n+1}\). Thus, during a correct run of the ABP, \((q_s,q_r)\) continuously cycles through (0, 0), (0, 1), (1, 1), (1, 0), (0, 0), ....
If, however, a transient fault flips the value of either \(q_s\) or \(q_r\), the ABP deadlocks and therefore requires correction. Due to the asymmetry of the ABP regarding sender and receiver, the need for a correction of the receiver can be conveniently determined by checking the equality of \(p_r\) and \(q_r\), and the correction can be performed by just setting \(q_r:=\lnot p_r\).
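As a sanity check of this description, the fault-free phases and the receiver's self-correction rule can be simulated. The following Python sketch (all names ours) models a single lossless transmission round and the repair \(q_r := \lnot p_r\); retransmissions and message loss are abstracted away:

```python
def abp_round(st, data):
    """One lossless transmission round through phases (i)-(iii) above.
    Returns the delivered packet, or None if the round cannot complete
    (as happens when a transient fault has flipped qr or qs)."""
    if st['qs'] == st['ps']:
        return None                     # phase (i) fails: nothing to start
    st['ps'] = st['qs']                 # (i) start packet (data, ps)
    if st['ps'] != st['qr']:
        return None                     # (ii) fails: receiver rejects
    st['pr'] = st['qr']                 # (ii) accept, send (ack, pr) back
    st['qr'] = 1 - st['qr']
    if st['pr'] == st['ps']:
        st['qs'] = 1 - st['ps']         # (iii) consume the acknowledgement
    return data

def correct_receiver(st):
    """Receiver self-correction: equality of pr and qr signals the
    fault, and setting qr := not pr repairs it."""
    if st['pr'] == st['qr']:
        st['qr'] = 1 - st['pr']
```

Starting from \((q_s,q_r)=(0,0)\) and \((p_s,p_r)=(1,1)\), two rounds cycle \((q_s,q_r)\) through (1, 1) back to (0, 0), and after a fault flips \(q_r\), the detection condition \(p_r = q_r\) holds and the correction restores progress.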
We model agent r successfully self-correcting and recovering its state from \(p_r = q_r\), that is, based on \(p_r \leftrightarrow q_r\). At the same time, s is uncertain whether r has corrected itself (event \(scr_{p_r = q_r}\)) or not (event noscr). Again writing \(\vartheta (e)\) as \(\bigl (\vartheta _s(e),\vartheta _r(e)\bigr )\), this is encoded in the hope update model \(U:=(E, \vartheta , \sigma , \mathcal {K}^{U})\), where:
Note that \(H_r \bot \) is equivalent to \(p_r \leftrightarrow q_r\), making \(H_r \bot \) locally detectable by r and resulting in \(\vartheta (scr_{p_r = q_r}) = (\lnot H_s \bot , H_r \bot )\). All atoms for noscr and all atoms other than \(q_r\) for \(scr_{p_r = q_r}\) remain unchanged. Coding the atoms in each state as \(p_sq_s.p_rq_r\), the resulting update is:
The only change happens in global states \(\mathbb {00}.00\) and \(\mathbb {01}.00\) where \(p_r \leftrightarrow q_r\) causes the hope update and \(q_r\) is set to be the opposite of \(p_r\). After the update, we get:
6 Conclusions and Further Research
We gave various dynamic epistemic semantics for the modeling and analysis of byzantine fault-tolerant multi-agent systems, expanding a known logic containing knowledge and hope modalities. We provided complete axiomatizations for our logics and applied them to fault-detection, isolation, and recovery (FDIR) in distributed computing. For future research we envision alternative dynamic epistemic update mechanisms, as well as embedding our logic into the (temporal epistemic) runs-and-systems approach.
Notes
- 1.
An extended version of the paper, which also provides the proofs and additional details that had to be dropped here, can be found in [7].
References
Adams, J.C., Ramarao, K.V.S.: Distributed diagnosis of byzantine processors and links. In: Proceedings, The 9th International Conference on Distributed Computing Systems: Newport Beach, California, 5–9 June 1989, pp. 562–569. IEEE (1989). https://doi.org/10.1109/ICDCS.1989.37989
Baltag, A., Moss, L.S., Solecki, S.: The logic of public announcements, common knowledge, and private suspicions. In: Gilboa, I. (ed.) Theoretical Aspects of Rationality and Knowledge: Proceedings of the Seventh Conference (TARK 1998), pp. 43–56. Morgan Kaufmann (1998). http://tark.org/proceedings/tark_jul22_98/p43-baltag.pdf
van Benthem, J., van Eijck, J., Kooi, B.: Logics of communication and change. Inf. Comput. 204(11), 1620–1662 (2006). https://doi.org/10.1016/j.ic.2006.04.006
van Benthem, J., Liu, F.: Dynamic logic of preference upgrade. J. Appl. Non-Classical Logics 17(2), 157–182 (2007). https://doi.org/10.3166/jancl.17.157-182
van Ditmarsch, H.: Description of game actions. J. Logic Lang. Inform. 11(3), 349–365 (2002). https://doi.org/10.1023/A:1015590229647
van Ditmarsch, H., Fruzsa, K., Kuznets, R.: A new hope. In: Fernández-Duque, D., Palmigiano, A., Pinchinat, S. (eds.) Advances in Modal Logic, vol. 14, pp. 349–369. College Publications (2022). http://www.aiml.net/volumes/volume14/22-vanDitmarsch-Fruzsa-Kuznets.pdf
van Ditmarsch, H., Fruzsa, K., Kuznets, R., Schmid, U.: A logic for repair and state recovery in byzantine fault-tolerant multi-agent systems. Eprint 2401.06451, arXiv (2024). https://doi.org/10.48550/arXiv.2401.06451
van Ditmarsch, H., van der Hoek, W., Kooi, B.: Dynamic epistemic logic with assignment. In: AAMAS 2005: Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 141–148. Association for Computing Machinery (2005). https://doi.org/10.1145/1082473.1082495
van Ditmarsch, H., Kooi, B.: Semantic results for ontic and epistemic change. In: Bonanno, G., van der Hoek, W., Wooldridge, M. (eds.) Logic and the Foundations of Game and Decision Theory (LOFT 7). Texts in Logic and Games, vol. 3, pp. 87–118. Amsterdam University Press (2008). https://www.jstor.org/stable/j.ctt46mz4h.6
Dolev, D., Függer, M., Posch, M., Schmid, U., Steininger, A., Lenzen, C.: Rigorously modeling self-stabilizing fault-tolerant circuits: an ultra-robust clocking scheme for systems-on-chip. J. Comput. Syst. Sci. 80(4), 860–900 (2014). https://doi.org/10.1016/j.jcss.2014.01.001
van Eijck, J.: DEMO—a demo of epistemic modelling. In: van Benthem, J., Gabbay, D., Löwe, B. (eds.) Interactive Logic: Selected Papers from the 7th Augustus de Morgan Workshop, London. Texts in Logic and Games, vol. 1, pp. 303–362. Amsterdam University Press (2007). https://www.jstor.org/stable/j.ctt45kdbf.15
Elnozahy, E.N.M., Alvisi, L., Wang, Y.-M., Johnson, D.B.: A survey of rollback-recovery protocols in message-passing systems. ACM Comput. Surv. 34(3), 375–408 (2002). https://doi.org/10.1145/568522.568525
Fagin, R., Halpern, J.Y., Moses, Y., Vardi, M.Y.: Reasoning About Knowledge. MIT Press, Cambridge (1995). https://doi.org/10.7551/mitpress/5803.001.0001
Fruzsa, K.: Hope for epistemic reasoning with faulty agents! In: Pavlova, A., Pedersen, M.Y., Bernardi, R. (eds.) ESSLLI 2019. LNCS, vol. 14354, pp. 93–108. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-50628-4_6
Fruzsa, K., Kuznets, R., Schmid, U.: Fire! In: Halpern, J., Perea, A. (eds.) Proceedings of the Eighteenth Conference on Theoretical Aspects of Rationality and Knowledge, Beijing, China, 25–27 June 2021. Electronic Proceedings in Theoretical Computer Science, vol. 335, pp. 139–153. Open Publishing Association (2021). https://doi.org/10.4204/EPTCS.335.13
Függer, M., Schmid, U.: Reconciling fault-tolerant distributed computing and systems-on-chip. Distrib. Comput. 24(6), 323–355 (2012). https://doi.org/10.1007/s00446-011-0151-7
Gammie, P., van der Meyden, R.: MCK: model checking the logic of knowledge. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 479–483. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27813-9_41
Halpern, J.Y., Moses, Y.: Knowledge and common knowledge in a distributed environment. J. ACM 37(3), 549–587 (1990). https://doi.org/10.1145/79147.79161
Halpern, J.Y., Zuck, L.D.: A little knowledge goes a long way: knowledge-based derivations and correctness proofs for a family of protocols. J. ACM 39(3), 449–478 (1992). https://doi.org/10.1145/146637.146638
Heuerding, A., Jäger, G., Schwendimann, S., Seyfried, M.: A Logics Workbench. AI Commun. 9(2), 53–58 (1996). https://doi.org/10.3233/AIC-1996-9203
Kieckhafer, R.M., Walter, C.J., Finn, A.M., Thambidurai, P.M.: The MAFT architecture for distributed fault tolerance. IEEE Trans. Comput. 37(4), 398–404 (1988). https://doi.org/10.1109/12.2183
Kuznets, R., Prosperi, L., Schmid, U., Fruzsa, K.: Causality and epistemic reasoning in byzantine multi-agent systems. In: Moss, L.S. (ed.) Proceedings of the Seventeenth Conference on Theoretical Aspects of Rationality and Knowledge, Toulouse, France, 17–19 July 2019. Electronic Proceedings in Theoretical Computer Science, vol. 297, pp. 293–312. Open Publishing Association (2019). https://doi.org/10.4204/EPTCS.297.19
Kuznets, R., Prosperi, L., Schmid, U., Fruzsa, K.: Epistemic reasoning with byzantine-faulty agents. In: Herzig, A., Popescu, A. (eds.) FroCoS 2019. LNCS (LNAI), vol. 11715, pp. 259–276. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29007-8_15
Kuznets, R., Prosperi, L., Schmid, U., Fruzsa, K., Gréaux, L.: Knowledge in Byzantine message-passing systems I: Framework and the causal cone. Technical Report TUW-260549, TU Wien (2019). https://publik.tuwien.ac.at/files/publik_260549.pdf
Lamport, L., Shostak, R., Pease, M.: The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982). https://doi.org/10.1145/357172.357176
Mendes, H., Tasson, C., Herlihy, M.: Distributed computability in Byzantine asynchronous systems. In: STOC 2014, 46th Annual Symposium on the Theory of Computing: 31 May–3 June 2014, New York, New York, USA, pp. 704–713. Association for Computing Machinery (2014). https://doi.org/10.1145/2591796.2591853
Mitchell, J.C., Moggi, E.: Kripke-style models for typed lambda calculus. Ann. Pure Appl. Logic 51(1–2), 99–124 (1991). https://doi.org/10.1016/0168-0072(91)90067-V
Moses, Y.: Relating knowledge and coordinated action: the Knowledge of Preconditions principle. In: Ramanujam, R. (ed.) Proceedings Fifteenth Conference on Theoretical Aspects of Rationality and Knowledge, Carnegie Mellon University, Pittsburgh, USA, 4–6 June 2015. Electronic Proceedings in Theoretical Computer Science, vol. 215, pp. 231–245. Open Publishing Association (2016). https://doi.org/10.4204/EPTCS.215.17
Pessin, A., Goldberg, S.: The Twin Earth Chronicles: Twenty Years of Reflection on Hilary Putnam’s “The Meaning of ‘Meaning’". M. E. Sharpe (1995). https://doi.org/10.4324/9781315284811
Powell, D., et al.: GUARDS: a generic upgradable architecture for real-time dependable systems. IEEE Trans. Parallel Distrib. Syst. 10(6), 580–599 (1999). https://doi.org/10.1109/71.774908
Robinson, P., Schmid, U.: The Asynchronous Bounded-Cycle model. Theoret. Comput. Sci. 412(40), 5580–5601 (2011). https://doi.org/10.1016/j.tcs.2010.08.001
Rushby, J.: Reconfiguration and transient recovery in state machine architectures. In: Proceedings of the Twenty-Sixth International Symposium on Fault-Tolerant Computing: 25–27 June 1996, Sendai, Japan, pp. 6–15. IEEE (1996). https://doi.org/10.1109/FTCS.1996.534589
Schlögl, T., Schmid, U.: A sufficient condition for gaining belief in byzantine fault-tolerant distributed systems. In: Verbrugge, R. (ed.) Proceedings of the Nineteenth conference on Theoretical Aspects of Rationality and Knowledge, Oxford, United Kingdom, 28–30th June 2023. Electronic Proceedings in Theoretical Computer Science, vol. 379, pp. 487–497. Open Publishing Association (2023). https://doi.org/10.4204/EPTCS.379.37
Schlögl, T., Schmid, U., Kuznets, R.: The persistence of false memory: brain in a vat despite perfect clocks. In: Uchiya, T., Bai, Q., Marsá Maestre, I. (eds.) PRIMA 2020. LNCS, vol. 12568, pp. 403–411. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-69322-0_30
Schneider, F.B.: Implementing fault-tolerant services using the state machine approach: a tutorial. ACM Comput. Surv. 22(4), 299–319 (1990). https://doi.org/10.1145/98163.98167
Srikanth, T.K., Toueg, S.: Optimal clock synchronization. J. ACM 34(3), 626–645 (1987). https://doi.org/10.1145/28869.28876
Srikanth, T.K., Toueg, S.: Simulating authenticated broadcasts to derive simple fault-tolerant algorithms. Distrib. Comput. 2(2), 80–94 (1987). https://doi.org/10.1007/BF01667080
Walter, C.J., Lincoln, P., Suri, N.: Formally verified on-line diagnosis. IEEE Trans. Software Eng. 23(11), 684–721 (1997). https://doi.org/10.1109/32.637385
Widder, J., Schmid, U.: The Theta-Model: achieving synchrony without clocks. Distrib. Comput. 22(1), 29–47 (2009). https://doi.org/10.1007/s00446-009-0080-x
Acknowledgments
We thank the anonymous reviewers for the suggestions on how to improve the paper. We are grateful for multiple fruitful discussions with and enthusiastic support from Giorgio Cignarale, Stephan Felber, Rojo Randrianomentsoa, Hugo Rincón Galeana, and Thomas Schlögl.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
About this paper
Cite this paper
van Ditmarsch, H., Fruzsa, K., Kuznets, R., Schmid, U. (2024). A Logic for Repair and State Recovery in Byzantine Fault-Tolerant Multi-agent Systems. In: Benzmüller, C., Heule, M.J., Schmidt, R.A. (eds) Automated Reasoning. IJCAR 2024. Lecture Notes in Computer Science(), vol 14740. Springer, Cham. https://doi.org/10.1007/978-3-031-63501-4_7
DOI: https://doi.org/10.1007/978-3-031-63501-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63500-7
Online ISBN: 978-3-031-63501-4