more

2024-12-02 09:54:54 -05:00
parent cd55d288e9
commit b636781367
14 changed files with 2196 additions and 1964 deletions
--- a/sections/case_studies.tex
+++ b/sections/case_studies.tex
@@ -84,12 +84,14 @@ Referencing the original Raft thesis \cite{Ongaro} and other raft models \cite{W
 \phi_5 &= \text{\parbox[t]{20em}{If any two servers commit the same log entry, the log entry at the previous index must be equivalent}}
 \end{aligned}
 \]
-We construct our Raft model such that we can model-check an arbitrary number of peers. We also designed our model such that each peer maintains separate channels for receiving AppendEntry requests, AppendEntry responses, RequestVote requests, and RequestVote responses. This gives \korg ample handle to reason about Raft. In particular, we study Raft in the presence of drop and replay attackers on all four aforementioned channel types, attacking both a minority and majority of peers. A breakdown of our findings is shown in Figure \ref{res:raft-table}.
+We construct our Raft model such that we can model-check an arbitrary number of peers. We also design our model such that each peer maintains separate channels for receiving AppendEntry requests, AppendEntry responses, RequestVote requests, and RequestVote responses. This gives \korg fine-grained control over which Raft messages it attacks. In particular, we study Raft in the presence of drop and replay attackers on all four aforementioned channel types, attacking both a minority and a majority of peers.
+
+To test \korg, we introduce a subtle bug in the Raft consensus mechanism: not ensuring votes come from unique peers. A breakdown of our findings is shown in Figure \ref{res:raft_table}.

 \begin{figure}[h!]
+% \label must follow \caption (see below) so that \ref yields the figure number.
 \centering
 \begin{scriptsize}
-\label{res:raft-table}
 \begin{tabular}{|c|c|}
 \hline
 Scenario & Attack found? \\
@@ -104,12 +106,10 @@ Dropping AppendEntryResponse messages & no \\
 \end{tabular}
 \end{scriptsize}
 \caption{Breakdown of the attacker scenarios assessed with \korg against our Raft \promela model. In all experiments, Raft was set to five peers and the drop/replay limits of the gadgets \korg synthesized were set to two. We conducted our experiments on a research computing cluster, allocating 250GB of memory to each verification run. The full models and attacker traces are included in the artifact.}
-%\caption{Automatically discovered attacks against 
-%the hand-written TCP model from Pacheco et al.  and our own, 
-%our TCP model for $\phi_1$ through $\phi_4$. "x" indicates an attack was discovered, and no "x" indicates \korg proved the absence of an attack via an exhaustive search. These experiments were ran on a laptop with an eighth generation i7 and 16gb of memory. Full attack traces are available in the artifact.}
+\label{res:raft_table}
 \end{figure}
-In our experiments, we found just one attack on our Raft \promela model, violating election safety in particular. In this scenario, peer A and peer B are candidates for election. Peer A receives three votes, one from itself and two from other peers, and Peer B receives two votes, one from itself and one from another peer. The replay attacker simply replays the vote sent to peer B. Then, both Peer A and Peer B are convinced they won the election and change their state to leader. Following this, leader completeness is also naturally violated.
+In our experiments, we found just one attack on our Raft \promela model, violating election safety in particular. In this scenario, peer A and peer B are candidates for election. Peer A receives three votes, one from itself and two from other peers, and peer B receives two votes, one from itself and one from another peer. The replay attacker simply replays the vote that the other peer sent to peer B, so peer B also counts three votes. Then, both peer A and peer B are convinced they won the election and change their state to leader. Following this, leader completeness is also violated. In this scenario, \korg demonstrates its ability to discover subtle bugs in protocol logic; our Raft model satisfies $\phi_1$--$\phi_5$ assuming perfect channels, and \korg allowed us to reason precisely about the effect of imperfect, vulnerable channels.

-To be clear, this is not an attack on the general Raft protocol, but rather an attack on our specific Raft implementation: in this case, the bug \korg exploits involves our Raft model not ensuring votes received are from unique peers\footnote{Naturally, this requires cryptography and therefore is challenging to express in the semantics of \promela.}. In general, the complete Raft protocol has been proven to resist drop and replay attackers \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016}. In this scenario, \korg demonstrates its ability to discover subtle bugs in protocol logic; our Raft model satisfies $\phi_1$-$\phi_5$ assuming perfect channels, and \korg allowed us to reason precisely about the effect of imperfect, vulnerable channels.

+%To be clear, this is not an attack on the general Raft protocol, but rather an attack on our specific Raft implementation: in this case, the bug \korg exploits involves our Raft model not ensuring votes received are from unique peers\footnote{Naturally, this requires cryptography and therefore is challenging to express in the semantics of \promela.}. In general, the complete Raft protocol has been proven to resist drop and replay attackers \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016}. 
 % We note our analysis is in no 
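
The vote-counting bug described in this hunk can be illustrated outside of \promela. The following is a minimal Python sketch, not the authors' model: the function names and the five-peer vote lists are hypothetical, and voter identities are assumed to be trustworthy (the paper's commented-out footnote notes that enforcing this in practice requires cryptography).

```python
from typing import List


def wins_election_buggy(votes: List[str], num_peers: int) -> bool:
    """Counts every received vote message, so a replayed vote is counted again."""
    return len(votes) > num_peers // 2


def wins_election_unique(votes: List[str], num_peers: int) -> bool:
    """Counts distinct voter identities, so replays of the same vote are ignored."""
    return len(set(votes)) > num_peers // 2


NUM_PEERS = 5  # matches the five-peer setting reported in the figure caption

# Hypothetical peer names: A collects votes from itself, C, and E.
votes_for_a = ["A", "C", "E"]
# B collects votes from itself and D.
votes_for_b = ["B", "D"]
# The replay attacker re-delivers D's vote on B's RequestVote response channel.
votes_for_b_replayed = votes_for_b + ["D"]

two_leaders_buggy = (
    wins_election_buggy(votes_for_a, NUM_PEERS)
    and wins_election_buggy(votes_for_b_replayed, NUM_PEERS)
)
two_leaders_unique = (
    wins_election_unique(votes_for_a, NUM_PEERS)
    and wins_election_unique(votes_for_b_replayed, NUM_PEERS)
)
```

Under these assumptions, `two_leaders_buggy` holds (election safety is violated) while `two_leaders_unique` does not, which mirrors the single attack reported in the table.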
--- a/sections/design.tex
+++ b/sections/design.tex
@@ -19,7 +19,10 @@ As aforementioned, \korg is based on \textit{LTL attack synthesis}; in particula

 %The methodology behind the construction of \korg is based on \textit{LTL attack synthesis}. 

-\korg is designed to target user-specified communication channels in programs written in \promela, the modeling language of the \spin model checker. The user inputs a \promela model, their desired communication channels to attack, the attacker model of choice, and the LTL correctness property of choice. \korg then invokes \spin, which exhaustively searches for attacks with respect to the chosen attacker model, \promela model, and correctness property. 
+\korg is designed to target user-specified communication channels in protocols specified as formal models. The user inputs a formal model of choice, the communication channels to attack, the attacker model of choice, and the correctness property of choice. \korg then invokes the model checker, which exhaustively searches for attacks with respect to the chosen attacker model, formal protocol model, and correctness property.
+
+%\promela, the modeling language of the \spin model checker. The user inputs a \promela model, 
+%their desired communication channels to attack, the attacker model of choice, and the LTL correctness property of choice. \korg then invokes \spin, which exhaustively searches for attacks with respect to the chosen attacker model, \promela model, and correctness property. 
 A high-level overview of the \korg pipeline is given in the Figure \ref{fig:korg_workflow}.

 \begin{figure*}[h]
@@ -42,7 +45,7 @@ A high-level overview of the \korg pipeline is given in the Figure \ref{fig:korg
  %\item \textbf{Insert Attacker Model}. Insert attackers are capable of inserting arbitrary messages (as specifiable by the user) onto a channel.
 %\end{itemize}

-\korg supports four general attacker model gadgets: an attacker that can drop, replay, reorder, or insert messages on a channel. In this section we discuss the various details that went into the implementation of the gadgets that encapsulate the behavior of the respective attacker models.
+\korg supports four general attacker models: an attacker that can drop, replay, reorder, or insert messages on a channel. In this section we discuss the various details that went into the implementation of the gadgets that encapsulate the behavior of the respective attacker models.

 % Additionally, \korg supports user-defined attacker that insert arbitrary messages onto a channel. In this section we discuss the various details that go into each attacker model.

@@ -101,7 +104,9 @@ These attacker models can be mixed and matched as desired by the \korg user. For
 \subsection{\korg Implementation}%
 \label{sub:impl}

-We implemented \korg on top of the \spin, a popular and robust model checker for reasoning about distributed and concurrent systems. Intuitively, models written in \promela, the modeling language of \spin, are communicating state machines whose messages are passed over defined \textit{channels}. Channels in \promela can either be unbuffered \textit{synchronous} channels, or buffered \textit{asynchronous} channels. \korg generates attacks \textit{with respect} to these defined channels.
+We implemented \korg on top of \spin, a popular and robust model checker for reasoning about distributed and concurrent systems. \spin has existed for over 40 years, and has been applied to dozens of real systems including the Mars Rover \cite{Holzmann_2014}, the PathStar Access Server \cite{Holzmann_Smith_2000}, and an avionics operating system \cite{mcp}. Additionally, \spin has spawned a dedicated formal methods symposium, currently in its 32nd year\footnote{\url{https://spin-web.github.io/SPIN2025/}}, and earned the 2002 ACM Software System Award.
+
+Intuitively, models written in \promela, the modeling language of \spin, are communicating state machines whose messages are passed over defined \textit{channels}. Channels in \promela can either be unbuffered \textit{synchronous} channels, or buffered \textit{asynchronous} channels. \korg generates attacks \textit{with respect} to these defined channels.

 \begin{lstlisting}[caption={Example \promela model of peers communicating over a channel. \texttt{!} indicates sending a message onto a channel, \texttt{?} indicates receiving a message from a channel.}, label={lst:spin-model}]
 // channel of buffer size 0
--- a/sections/introduction.tex
+++ b/sections/introduction.tex
@@ -4,6 +4,9 @@ Distributed protocols are the foundation for the modern internet, and therefore
 This myriad of formal methods tooling applicable to secure protocols has enabled reasoning about security-relevant properties involving secrecy, authentication, indistinguishability in addition to concurrency, safety, and liveness. However, no previous formal methods tooling offered an effective solution for rigorously studying an attacker that controls communication channels. That is, how do you reason about an attacker that can arbitrarily drop, reorder, replay, or insert messages onto a communication channel? 

 To fill this gap, we introduce \korg \footnote{\korg is a fictitious name for our system, for double-blind submission.}, a tool for synthesizing attacks on distributed protocols that implements and extends the theoretical framework proposed in  \cite{Hippel2022_anonym}. In particular, \korg targets the communication channels between the protocol endpoints, and synthesizes attacks to violate arbitrary linear temporal logic (LTL) specifications. \korg either synthesizes attack, or proves the absence of such via an exhaustive state-space search. \korg is sound and complete, meaning if there exists an attack \korg will find it, and \korg will never have false positives. \korg supports pre-defined attacker models, including attackers that can replay, reorder, or drop messages on channels, as well as custom user-defined attacker models. Although \korg best lends itself for reasoning about denial of service attacks, it can target any specification expressable in LTL. 
+
+In this work we take an approach rooted in \textit{formal methods} and \textit{automated reasoning} to construct \korg. In particular, we employ \textit{model checking}, a sub-discipline of formal methods, to decidably and automatically find attacks in protocols or prove the absence of such. 
+
 We summarize our contributions:
 \begin{itemize}
 \item We present \korg,  a tool for synthesizing attacks against communication protocols. \korg supports four general attacker model gadgets: an attacker that can drop, replay, reorder, or insert messages on a channel. 
--- a/sections/proofs.tex
+++ b/sections/proofs.tex
@@ -1,36 +1,115 @@
-\subsection{Soundness And Completeness of \korg}%
-\label{sub:Soundness And Completeness}
+\korg is an implementation of the theoretical attack synthesis framework proposed by \cite{Hippel2022_anonym}. This framework enjoys soundness and completeness guarantees for attacks discovered; that is, if there exists an attack, it is discovered, and if an attack is discovered, it is valid. However, the framework reasons about an abstracted, theoretical process construct. Therefore, in order to correctly claim that \korg is also sound and complete, it is necessary to demonstrate that discovering an attack within the theoretical framework reduces to the semantics of \spin, the model checker \korg is built on top of.

-\newcommand{\comp}{\mid\mid}
-\newcommand{\ioint}{\mathcal{C}}
+Concretely, there is a semantic gap between the \textit{processes} of the theoretical attack synthesis framework and the \ba that \spin reasons about. To close this gap, we first define both formalisms, and then show that finding an attack within the theoretical framework reduces precisely to a check that \spin performs.
+%the model checker \korg is implemented on top of.

-Fundamentally, the theoretical framework that \korg implements was presented in \cite{Hippel2022_anoym} about \textit{communicating processes}; similarly, \korg is best understood as a synthesizer for attackers that sit \textit{between} communicating processes. 
+\begin{definition}[\ba]
+A \ba is a tuple \( B = (Q, \Sigma, \delta, Q_0, F) \) where:
+\begin{itemize}
+    \item \( Q \) is a finite set of states,
+    \item \( \Sigma \) is a finite alphabet,
+    \item \( \delta \subseteq Q \times \Sigma \times Q \) is a transition relation,
+    \item \( Q_0 \subseteq Q \) is a set of initial states,
+    \item \( F \subseteq Q \) is a set of accepting states.
+\end{itemize}
+A run of a \ba is an infinite sequence of states \( q_0, q_1, q_2, \ldots \) such that \( q_0 \in Q_0 \) and \( (q_i, a, q_{i+1}) \in \delta \) for some \( a \in \Sigma \) at each step \( i \). The run is considered accepting if it visits states in \( F \) infinitely often.
+\end{definition}

-The theoretical attack synthesis framework and \korg use slightly different formalisms. Both employ derivations the general \textit{Input/Output (I/O) automata}, state machines whose transitions indicate sending or receiving a message.\footnote{
-A fundamental assumption both \korg and the theoretical attack synthesis framework rely upon is unicast transition relations of I/O automata within this context. That is, if one sending automata has an output transition matching an input transition of two receiving automata, only one input/output transition pair can be composed upon. Model checkers for I/O automata such as \spin will explore both possibilities.
-} 
-In particular, the theoretical attack synthesis framework defines their own notion of a \textit{process} and argues their attack synthesis algorithm maintains soundness and completeness guarantees with respect to it, while \korg relies upon \spin's preferred model checking formalism, the B\"uchi Automata. Both utilize linear temporal logic as their specification language of choice.
-
-We ultimately seek to conclude \korg maintains the guarantees of the theoretical framework it implements, therefore it is necessary to demonstrate the equivalence of \textit{processes} from the theoretical attack synthesis framework with the B\"uchi Automata. For ease of reading and clarity, we only provide shortened narrations of the arguments here. The detailed, definitions, theorems, and proofs are provided in Appendix Section \ref{sub:korg_proofs}.
+\begin{definition}[Process]
+A \emph{Process} is a tuple \( P = \langle AP, I, O, S, s_0, T, L \rangle \), where:
+\begin{itemize}
+    \item \( AP \) is a finite set of atomic propositions,
+    \item \( I \) is a set of inputs,
+    \item \( O \) is a set of outputs, such that \( I \cap O = \emptyset \),
+    \item \( S \) is a finite set of states,
+    \item \( s_0 \in S \) is the initial state,
+    \item \( T \subseteq S \times (I \cup O) \times S \) is the transition relation,
+    \item \( L: S \to 2^{AP} \) is a labeling function mapping each state to a subset of atomic propositions.
+\end{itemize}
+A transition \( (s, x, s') \in T \) is called an \emph{input transition} if \( x \in I \) and an \emph{output transition} if \( x \in O \).
+\end{definition}

+\setcounter{theorem}{0} 
 \begin{theorem}
-  A process, always directly corresponds to a B\"uchi Automata. 
+  A process, as defined in \cite{Hippel2022_anonym}, always directly corresponds to a \ba.
 \end{theorem}

-In short, a process in the theoretical attack synthesis framework is a Kripke Structure equipped with input and output transitions. That is, when composing two processes, an output transition must be matched to a respective input transition. Processes also include atomic propositions, which the given linear temporal logic specifications are defined over. We invoke and build on the well-known correspondence between Kripke Structures and \ba to show our desired correspondence.
+\begin{proof}
+
+Given a \ba \( B = (Q, \Sigma, \delta, Q_0, F) \), we construct a corresponding Process \( P = \langle AP, I, O, S, s_0, T, L \rangle \) as follows:
+
+\begin{itemize}
+    \item Atomic Propositions: \( AP = \{ \text{accept} \} \), a singleton set containing a special proposition indicating acceptance.
+    \item Inputs and Outputs: \( I = \Sigma \) and \( O = \emptyset \).
+    \item States: \( S = Q \) and \( s_0 \in Q_0 \).
+    \item Transition Relation: \( T = \delta \).
+    \item Labeling Function: \( L: S \to 2^{AP} \) defined by
+\end{itemize}
+\[        
+  L(s) = 
+  \begin{cases}            
+  \{ \text{accept} \} & \text{if } s \in F, \\            
+  \emptyset & \text{otherwise}.        
+  \end{cases}    
+\]
+
+In this mapping, the states and transitions of the BA are preserved in the Process, and the accepting states \( F \) are identified via the labeling function \( L \).
+
+Conversely, given a Process \( P = \langle AP, I, O, S, s_0, T, L \rangle \) with an acceptance condition defined by a distinguished proposition \( p \in AP \), we define a \ba \( B = (Q, \Sigma, \delta, Q_0, F) \) as follows:
+
+\begin{itemize}
+    \item States: \( Q = S \) and \( Q_0 = \{ s_0 \} \).
+    \item Alphabet: \( \Sigma = I \cup O \).
+    \item Transition Relation: \( \delta = T \).
+    \item Accepting States: \( F = \{ s \in S \mid p \in L(s) \} \).
+\end{itemize}
+
+Here, the accepting states in the BA correspond to those states in the Process that are labeled with the distinguished proposition \( p \).
+
+In both structures, a run is an infinite sequence of states connected by transitions:
+
+\begin{itemize}
+    \item In the \ba: \( q_0, q_1, q_2, \ldots \) with \( q_0 \in Q_0 \) and \( (q_i, a_i, q_{i+1}) \in \delta \) for some \( a_i \in \Sigma \).
+    \item In the Process: \( s_0, s_1, s_2, \ldots \), starting from the initial state \( s_0 \), with \( (s_i, x_i, s_{i+1}) \in T \) for some \( x_i \in I \cup O \).
+\end{itemize}
+
+An accepting run in the \ba visits states in \( F \) infinitely often. Similarly, an accepting run in the Process visits states labeled with \( p \) infinitely often. Since \( F = \{ s \in S \mid p \in L(s) \} \), the acceptance conditions are preserved under the mappings.
+  
+\end{proof}
+
+\begin{definition}[Threat Model]
+A threat model is a tuple \( (P, (Q_i)_{i=0}^m, \phi) \) where:
+\begin{itemize}
+    \item \( P, Q_0, \ldots, Q_m \) are processes.
+    \item Each process \( Q_i \) has no atomic propositions (i.e., its set of atomic propositions is empty).
+    \item \( \phi \) is an LTL formula such that \( P \parallel Q_0 \parallel \cdots \parallel Q_m \models \phi \).
+    \item The system \( P \parallel Q_0 \parallel \cdots \parallel Q_m \) satisfies the formula \( \phi \) in a non-trivial manner, meaning that \( P \parallel Q_0 \parallel \cdots \parallel Q_m \) has at least one infinite run.
+\end{itemize}
+\end{definition}

 \begin{theorem}
-  Checking whether there exists an attacker under a given threat model, the R-$\exists$ASP problem as proposed in Hippel et al., is equivalent to B\"uchi Automata language inclusion (which is in turn solved by the \spin model checker).
+  Checking whether there exists an attacker under a given threat model, the R-$\exists$ASP problem as proposed in \cite{Hippel2022_anonym}, is equivalent to B\"uchi Automata language inclusion (which is in turn solved by the \spin model checker).
 \end{theorem}

-Via the previous theorem, we can translate the threat model processes and the victim processes to \ba and intersect them. B\"uchi Automata intersection corresponds with \ba language inclusion, which is in turn solved by \spin. From this result, we naturally get a complexity-theoretic result for finding an attacker from a given threat model.
+\begin{proof}
+  For a given threat model \( (P, (Q_i)_{i=0}^m, \phi) \), checking R-$\exists$ASP is equivalent to checking
+  \[
+  R = MC(P \parallel \text{Daisy}(Q_0) \parallel \cdots \parallel \text{Daisy}(Q_m), \phi)
+  \]
+  where $MC$ is a model checker and, for the purposes of this proof, each $\text{Daisy}(Q_i)$ is equivalent to a process. Therefore, via the previous theorem, we can construct \ba \( BA_{P}, BA_{\text{Daisy}(Q_0)}, \ldots, BA_{\text{Daisy}(Q_m)} \) from the processes \( P, \text{Daisy}(Q_0), \ldots ,\text{Daisy}(Q_m) \). Then, we check
+  \[
+    \text{\spin}(BA_{P} \parallel BA_{\text{Daisy}(Q_0)} \parallel \cdots \parallel BA_{\text{Daisy}(Q_m)}, \phi)
+  \]
+  or, translating $\phi$ to the equivalent \ba $BA_{\phi}$ via \cite{Holzmann_1997}, we equivalently check
+  \[
+    \left(BA_{P} \parallel BA_{\text{Daisy}(Q_0)} \parallel \cdots \parallel BA_{\text{Daisy}(Q_m)}\right) \subseteq BA_{\phi}
+  \]
+\end{proof}

 \begin{theorem}
-Checking whether there exists an attacker for a given threat model, the R-$\exists$ASP problem as proposed in Hippel et al., is PSPACE-complete.
+  Checking whether there exists an attacker for a given threat model, the R-$\exists$ASP problem as proposed in \cite{Hippel2022_anonym}, is PSPACE-complete.
 \end{theorem}

-By the previous argument the attack synthesis problem reduces to intersecting multiple \ba (or alternatively \ba language inclusion), which is well-known to be PSPACE-complete \cite{Kozen_1977}.
-Although this result implies \korg has a rough upper bound complexity, in practice due the various implementation-level optimizations of \spin finding attacks on some property is generally fast, but proving their absence via a state-space search can expensive \cite{Clarke_Wang}.
-
-Since \korg uses \spin as its underlying model checker, we can effectively conclude \korg is sound and complete. 
-
+\begin{proof}
+By the previous argument, the R-$\exists$ASP problem corresponds to \ba language inclusion, which is well known to be PSPACE-complete \cite{Kozen_1977}.
+\end{proof}
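
The two mappings used in the proof of the first theorem can be sketched concretely. The following Python is only an illustration under stated assumptions: the class and function names (`Buchi`, `Process`, `buchi_to_process`, `process_to_buchi`) are hypothetical and do not come from \korg or \spin, and the two-state automaton is a made-up example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

Transition = Tuple[str, str, str]  # (source state, symbol, target state)


@dataclass
class Buchi:
    states: FrozenSet[str]
    alphabet: FrozenSet[str]
    delta: FrozenSet[Transition]
    initial: FrozenSet[str]
    accepting: FrozenSet[str]


@dataclass
class Process:
    ap: FrozenSet[str]
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]
    states: FrozenSet[str]
    s0: str
    trans: FrozenSet[Transition]
    labels: Dict[str, FrozenSet[str]]


def buchi_to_process(b: Buchi, s0: str) -> Process:
    """AP = {accept}, I = Sigma, O = {}, S = Q, T = delta; L marks states in F."""
    if s0 not in b.initial:
        raise ValueError("s0 must be one of the automaton's initial states")
    labels = {
        q: frozenset({"accept"}) if q in b.accepting else frozenset()
        for q in b.states
    }
    return Process(frozenset({"accept"}), b.alphabet, frozenset(),
                   b.states, s0, b.delta, labels)


def process_to_buchi(p: Process, prop: str) -> Buchi:
    """Q = S, Q0 = {s0}, Sigma = I u O, delta = T, F = states labeled with prop."""
    accepting = frozenset(s for s in p.states if prop in p.labels[s])
    return Buchi(p.states, p.inputs | p.outputs, p.trans,
                 frozenset({p.s0}), accepting)


# Made-up two-state automaton that accepts runs visiting q1 infinitely often.
example = Buchi(
    states=frozenset({"q0", "q1"}),
    alphabet=frozenset({"a", "b"}),
    delta=frozenset({("q0", "a", "q1"), ("q1", "b", "q0")}),
    initial=frozenset({"q0"}),
    accepting=frozenset({"q1"}),
)
as_process = buchi_to_process(example, "q0")
round_trip = process_to_buchi(as_process, "accept")
```

For an automaton with a single initial state, as in this example, the round trip returns the original automaton, which reflects the claim that states, transitions, and the acceptance condition are preserved by both mappings.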