more

2024-12-03 17:35:01 -05:00
parent b636781367
commit 673782c888
17 changed files with 2417 additions and 2175 deletions
--- a/sections/case_studies.tex
+++ b/sections/case_studies.tex
@@ -27,6 +27,7 @@ We evaluated the TCP \promela model against \korg's drop, replay, and reordering

 \begin{figure}[h!]
 \centering
+\label{res:tcp-table}
 \begin{scriptsize}
 \begin{tabular}{|c|c|c|c|}
 \hline
@@ -39,10 +40,10 @@ $\phi_4$  &                         &                           &\\
 \end{tabular}
 \end{scriptsize}

-\label{res:tcp-table}
 \caption{Automatically discovered attacks against 
 %the hand-written TCP model from Pacheco et al.  and our own, 
 our TCP model for $\phi_1$ through $\phi_4$. "x" indicates an attack was discovered, and no "x" indicates \korg proved the absence of an attack via an exhaustive search. These experiments were run on a laptop with an eighth-generation i7 and 16 GB of memory. Full attack traces are available in the artifact.}
+\label{res:tcp-table}
 \end{figure}

 \begin{comment}
@@ -72,7 +73,7 @@ $\phi_4$ & \rule{0pt}{8pt} x & & & & & & x & & \\

 \subsection{Raft}%
 \label{sub:Raft}
-Raft is a consensus algorithm designed to replicate a state machine across distributed peers, and sees broad usage in distributed databases, key-value stores, distributed file systems, distributed load-balancers, and container orchestration. Historically, verification efforts of Raft using both constructive, mechanized proving techniques \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016, Wilcox_Woos_Panchekha_Tatlock_Wang_Ernst_Anderson, Ongaro} and automated verification \cite{Ongaro} have reasoned about the protocol under certain assumptions about the stability of the communication channels. However, no previous approach to Raft verification has reasoned about an coordinated, arbitrary on-channel attacker \textit{external} to the protocol itself. Uniquely, \korg enables us to study Raft in this context.
+Raft is a consensus algorithm designed to replicate a state machine across distributed peers, and sees broad usage in distributed databases, key-value stores, distributed file systems, distributed load-balancers, and container orchestration. Historically, verification efforts of Raft using both constructive, mechanized proving techniques \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016, Wilcox_Woos_Panchekha_Tatlock_Wang_Ernst_Anderson, Ongaro} and automated verification \cite{Ongaro} have reasoned about the protocol under certain assumptions about the stability of the communication channels. Previously, Raft has been proven constructively, using Rocq\footnote{Previously known as Coq.} \cite{Wilcox_Woos_Panchekha_Tatlock_Wang_Ernst_Anderson}, to maintain properties of interest with respect to volatile, attacker-controlled channels. However, no previous approach to Raft verification has reasoned explicitly about a coordinated, arbitrary on-channel attacker \textit{external} to the protocol itself. Uniquely, \korg enables us to study Raft in this context.

 Referencing the original Raft thesis \cite{Ongaro} and other Raft models \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016}, we constructed a \promela model of the Raft protocol. Additionally, we derived and formalized the following properties, which our \promela model satisfies:
 \[
@@ -86,7 +87,7 @@ Referencing the original Raft thesis \cite{Ongaro} and other raft models \cite{W
 \]
 We construct our Raft model such that we can model-check an arbitrary number of peers. We also designed our model such that each peer maintains separate channels for receiving AppendEntry requests, AppendEntry responses, RequestVote requests, and RequestVote responses. This gives \korg ample handle to reason about Raft. In particular, we study Raft in the presence of drop and replay attackers on all four aforementioned channel types, attacking both a minority and majority of peers. 

-To test \korg, we introduce a subtle bug in the Raft consensus mechanism: not ensuring votes come from unique peers. A breakdown of our findings is shown in Figure \ref{res:raft_table}.
+To test \korg, we altered our original Raft model to introduce a subtle bug in the Raft consensus mechanism: the model no longer ensures that votes come from unique peers. We refer to our original, correct Raft model as \texttt{raft.pml}, and to our buggy Raft model as \texttt{raft-bug.pml}. Both \texttt{raft.pml} and \texttt{raft-bug.pml} satisfy $\phi_1$--$\phi_5$ when the channels are assumed to be perfect. We assess \texttt{raft-bug.pml} with \korg, and a breakdown of our findings is shown in Figure \ref{res:raft_table}.

 \begin{figure}[h!]
 \label{res:raft_table}
@@ -105,10 +106,12 @@ Dropping AppendEntryResponse messages & no \\
 \hline
 \end{tabular}
 \end{scriptsize}
-\caption{Breakdown of the attacker scenarios assessed with \korg against our Raft \promela model. In all experiments, Raft was set to five peers and the drop/replay limits of the gadgets \korg synthesized were set to two. We conducted our experiments on a research computing cluster, allocating 250GB of memory to each verification run. The full models and attacker traces are included in the artifact.}
+\caption{Breakdown of the attacker scenarios assessed with \korg against our buggy Raft \promela model, \texttt{raft-bug.pml}. In all experiments, the Raft model was set to five peers and the drop/replay limits of the gadgets \korg synthesized were set to two. We conducted our experiments on a research computing cluster, allocating 250GB of memory to each verification run. The full models and attacker traces are included in the artifact.}
 \label{res:raft_table}
 \end{figure}
-In our experiments, we found just one attack on our Raft \promela model, violating election safety in particular. In this scenario, peer A and peer B are candidates for election. Peer A receives three votes, one from itself and two from other peers, and Peer B receives two votes, one from itself and one from another peer. The replay attacker simply replays the vote sent to peer B. Then, both Peer A and Peer B are convinced they won the election and change their state to leader. Following this, leader completeness is also naturally violated. In this scenario, \korg demonstrates its ability to discover subtle bugs in protocol logic; our Raft model satisfies $\phi_1$-$\phi_5$ assuming perfect channels, and \korg allowed us to reason precisely about the effect of imperfect, vulnerable channels.
+In our experiments, we found just one attack on our \texttt{raft-bug.pml} \promela model, violating election safety in particular. In this scenario, Peer A and Peer B are candidates for election. Peer A receives three votes, one from itself and two from other peers, and Peer B receives two votes, one from itself and one from another peer. The replay attacker simply replays the vote sent to Peer B, which Peer B counts as a third vote. Then, both Peer A and Peer B are convinced they won the election and change their state to leader. Following this, leader completeness is also naturally violated. In this scenario, \korg demonstrates its ability to discover subtle bugs in protocol logic by exploiting the buggy Raft model.
+
+%our Raft model satisfies $\phi_1$-$\phi_5$ assuming perfect channels, and \korg allowed us to reason precisely about the effect of imperfect, vulnerable channels.


 %To be clear, this is not an attack on the general Raft protocol, but rather an attack on our specific Raft implementation: in this case, the bug \korg exploits involves our Raft model not ensuring votes received are from unique peers\footnote{Naturally, this requires cryptography and therefore is challenging to express in the semantics of \promela.}. In general, the complete Raft protocol has been proven to resist drop and replay attackers \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016}. 
--- a/sections/design.tex
+++ b/sections/design.tex
@@ -19,18 +19,21 @@ As aforementioned, \korg is based on \textit{LTL attack synthesis}; in particula

 %The methodology behind the construction of \korg is based on \textit{LTL attack synthesis}. 

-\korg is designed to target user-specified communication channels in programs written in formal models. The user inputs a formal model of choice, their desired communication channels to attack, the attacker model of choice, and the correctness property of choice. \korg then invokes the model checker, which exhaustively searches for attacks with respect to the chosen attacker model, formal protocol model, and the correctness property.
+\korg is designed to attack user-specified communication channels in state machine-based formal models of distributed protocols. To use \korg, the user inputs a formal model of a distributed protocol in the \promela language, the communication channel(s) in the formal model they wish to attack, the desired attacker model, and a formalized correctness property for the formal model. The formal model should satisfy the correctness property in the absence of \korg.
+
+Once \korg is invoked, it modifies the user-supplied \promela model to integrate the desired attacker model. Then, \korg passes the updated \promela model to the model checker, which either completes an exhaustive search without finding an attack or returns an explicit counterexample.
+%programs written in formal models. The user inputs a formal model of choice, their desired communication channels to attack, the attacker model of choice, and the correctness property of choice. \korg then invokes the model checker, which exhaustively searches for attacks with respect to the chosen attacker model, formal protocol model, and the correctness property.

 %\promela, the modeling language of the \spin model checker. The user inputs a \promela model, 
 %their desired communication channels to attack, the attacker model of choice, and the LTL correctness property of choice. \korg then invokes \spin, which exhaustively searches for attacks with respect to the chosen attacker model, \promela model, and correctness property. 
-A high-level overview of the \korg pipeline is given in the Figure \ref{fig:korg_workflow}.
+A high-level visual overview of the \korg pipeline is given in Figure \ref{fig:korg_workflow}.

-\begin{figure*}[h]
+\begin{figure}[h]
    \centering
-    \includegraphics[width=0.7\textwidth]{assets/diagram-anon.png}
+    \includegraphics[width=0.5\textwidth]{assets/diagram-anon.png}
    \caption{A high-level overview of the \korg workflow}
    \label{fig:korg_workflow}
-\end{figure*}
+\end{figure}


 \subsection{Supported Attacker Models}%
@@ -53,14 +56,154 @@ A high-level overview of the \korg pipeline is given in the Figure \ref{fig:korg
 The simplest attacker model \korg supports is an attacker that can \textit{drop} messages from a channel. The user specifies a "drop limit" value that limits the number of packets the attacker can drop from the channel. Note that a higher drop limit enlarges the search space of possible attacks, thereby increasing execution time.
 The dropper attacker model gadget \korg synthesizes works as follows. The gadget will nondeterministically choose to observe a message on a channel. Then, if the drop limit variable is not zero, it will consume the message. An example is shown in Figure \ref{lst:korg_drop}.

+\begin{figure}[h]
+\begin{lstlisting}[caption={Example dropping attacker model gadget with a drop limit of 3, targeting channel "cn"}, label={lst:korg_drop}]
+chan cn = [8] of { int, int, int }; 
+
+active proctype attacker_drop() {
+int b_0, b_1, b_2;
+byte lim = 3; // drop limit
+MAIN:
+  do
+  :: cn ? [b_0, b_1, b_2] -> atomic {
+    if
+    :: lim == 0 -> goto BREAK;
+    :: else ->
+       cn ? b_0, b_1, b_2; // consume message on the channel
+       lim = lim - 1;
+       goto MAIN;
+    fi
+    }
+  od
+BREAK:
+}
+\end{lstlisting}
+\end{figure}
+
 \textbf{Replay Attacker Model Gadget} 
 The next attacker model \korg supports is an attacker that can observe and \textit{replay} messages back onto a channel. Similarly to the drop limit for the dropping attacker model, the user can specify a "replay limit" that caps the number of observed messages the attacker can replay back onto the specified channel. 
 The replay attacker model gadget \korg employs works as follows. The gadget has two states, \textsc{Consume} and \textsc{Replay}. The gadget starts in the \textsc{Consume} state and nondeterministically reads (but does not consume) messages on the target channel, copying them into a local storage buffer. Once the gadget has read as many messages as the defined replay limit allows, its state changes to \textsc{Replay}. In the \textsc{Replay} state, the gadget nondeterministically selects messages from its storage buffer to replay onto the channel until it is out of messages. An example is shown in Figure \ref{lst:korg_replay}.

+\begin{figure}[h]
+\begin{lstlisting}[caption={Example replay attacker model gadget with a replay limit of 3, targeting channel "cn"}, label={lst:korg_replay}]
+chan cn = [8] of { int, int, int }; 
+
+// local memory for the gadget
+chan gadget_mem = [3] of { int, int, int };
+
+active proctype attacker_replay() {
+int b_0, b_1, b_2; int i = 3;
+CONSUME:
+  do
+  // read messages until the limit is passed
+  :: cn ? [b_0, b_1, b_2] -> atomic {
+   cn ? <b_0, b_1, b_2> -> gadget_mem ! b_0, b_1, b_2;
+    i--;
+    if
+    :: i == 0 -> goto REPLAY;
+    :: i != 0 -> goto CONSUME;
+    fi }
+  od
+REPLAY:
+  do
+  :: atomic {
+    // nondeterministically select a random value from the storage buffer
+    int am;
+    select(am : 0 .. len(gadget_mem)-1);
+    do
+    :: am != 0 ->
+      am = am-1;
+      gadget_mem ? b_0, b_1, b_2 -> gadget_mem ! b_0, b_1, b_2;
+    :: am == 0 ->
+      gadget_mem ? b_0, b_1, b_2 -> cn ! b_0, b_1, b_2;
+      break;
+    od }
+  // doesn't need to use all messages on the channel
+  :: atomic {gadget_mem ? b_0, b_1, b_2; }
+  // once mem has no more messages, we're done
+  :: empty(gadget_mem) -> goto BREAK;
+  od
+BREAK:
+}
+\end{lstlisting}
+\end{figure}
+
+
 \textbf{Reorder Attacker Model Gadget} 
 \korg supports synthesizing attackers that can \textit{reorder} messages on a channel. Like the drop and replay attacker model gadgets, the user can specify a "reordering limit" that caps the number of messages that can be reordered by the attacker on the specified channel.
 The reordering attacker model gadget \korg synthesizes works as follows. The gadget has three states, \textsc{Init}, \textsc{Consume}, and \textsc{Replay}. The gadget begins in the \textsc{Init} state, where it arbitrarily chooses a message to start consuming by transitioning to the \textsc{Consume} state. When in the \textsc{Consume} state, the gadget consumes all messages that appear on the channel, filling up a local buffer, until hitting the defined reordering limit. Once this limit is hit, the gadget transitions into the \textsc{Replay} state. In the \textsc{Replay} state, the gadget nondeterministically selects messages from its storage buffer to replay onto the channel until out of messages. An example is shown in Figure \ref{lst:korg_reordering}.

+\begin{figure}[h]
+\begin{lstlisting}[caption={Example reordering attacker model gadget with a reordering limit of 3, targeting channel "cn"}, label={lst:korg_reordering}]
+chan cn = [8] of { int, int, int }; 
+
+chan gadget_mem = [3] of { int, int, int };
+active proctype attacker_reordering() priority 255 {
+int b_0, b_1, b_2; byte blocker; int i = 3;
+INIT:
+do
+  :: {  // arbitrarily choose a message to start consuming on
+      blocker = len(cn);
+      do :: blocker != len(cn) -> goto INIT; od // wait until the channel length changes
+    }
+  :: goto CONSUME;
+od
+CONSUME:
+do
+  // consume messages with high priority
+  :: cn ? [b_0, b_1, b_2] -> atomic {
+    cn ? b_0, b_1, b_2 -> gadget_mem ! b_0, b_1, b_2; i--;
+    if 
+    :: i == 0 -> goto REPLAY;
+    :: i != 0 -> goto CONSUME;
+    fi }
+od
+REPLAY:
+  do
+  // replay messages back onto the channel, also with priority
+  :: atomic {
+    int am;
+    select(am : 0 .. len(gadget_mem)-1);
+    do
+    :: am != 0 ->
+      am = am-1;
+      gadget_mem ? b_0, b_1, b_2 -> gadget_mem ! b_0, b_1, b_2; // rotate the buffer
+    :: am == 0 ->
+      gadget_mem ? b_0, b_1, b_2 -> cn ! b_0, b_1, b_2;
+      break;
+    od }
+  :: atomic { empty(gadget_mem) -> goto BREAK; }
+  od
+BREAK:
+}
+\end{lstlisting}
+\end{figure}
+
+\begin{figure}[h]
+\begin{lstlisting}[caption={Example I/O file targeting channel "cn"}, label={lst:io-file}]
+cn:
+	I:
+	O:1-1-1, 1-2-3, 3-4-5
+\end{lstlisting}
+
+\begin{lstlisting}[caption={Example gadget synthesized from an I/O file targeting the channel "cn"}, label={lst:io-file-synth}]
+chan cn = [8] of { int, int, int };
+
+active proctype daisy() {
+INIT:
+  do
+  :: cn ! 1,1,1;
+  :: cn ! 1,2,3;
+  :: cn ! 3,4,5;
+  :: goto RECOVERY;
+  od
+RECOVERY:
+}
+\end{lstlisting}
+\end{figure}
+
+
+
 \textbf{Insert Attacker Models} 
 \korg supports the synthesis of attackers that can simply insert messages onto a channel. While the drop, replay, and reordering attacker model gadgets described previously are complex gadgets that \korg synthesizes with respect to a user-specified channel, the insert attacker model gadget is synthesized with respect to a user-defined \textit{I/O file}. This file denotes the specific outputs and channels the attacker is capable of sending on, and \korg generates a gadget that can only send the messages listed in it. An example I/O file is given in Figure \ref{lst:io-file}, and the generated gadget is given in Figure \ref{lst:io-file-synth}.

--- a/sections/examples.tex
+++ b/sections/examples.tex
@@ -1,151 +1,8 @@
 %\section{Attacker Model Gadget Examples}%
 %\label{sub:Attacker Model Gadget Examples}

-\begin{figure}[h]
-\begin{lstlisting}[caption={Example dropping attacker model gadget with drop limit of 3, targetting channel "cn"}, label={lst:korg_drop}]
-chan cn = [8] of { int, int, int }; 

-active proctype attacker_drop() {
-int b_0, b_1, b_2;
-byte lim = 3; // drop limit
-MAIN:
-  do
-  :: cn ? [b_0, b_1, b_2] -> atomic {
-    if
-    :: lim == 0 -> goto BREAK;
-    :: else ->
-       cn ? b_0, b_1, b_2; // consume message on the channel
-       lim = lim - 1;
-       goto MAIN;
-    fi
-    }
-  od
-BREAK:
-}
-\end{lstlisting}
-\end{figure}

-\begin{figure}[h]
-\begin{lstlisting}[caption={Example replay attacker model gadget with the selected replay limit as 3, targetting channel "cn"}, label={lst:korg_replay}]
-chan cn = [8] of { int, int, int }; 

-// local memory for the gadget
-chan gadget_mem = [3] of { int, int, int };

-active proctype attacker_replay() {
-int b_0, b_1, b_2;
-int i = 3;
-CONSUME:
-  do
-  // read messages until the limit is passed
-  :: cn ? [b_0, b_1, b_2] -> atomic {
-   cn ? <b_0, b_1, b_2> -> gadget_mem ! b_0, b_1, b_2;
-    i--;
-    if
-    :: i == 0 -> goto REPLAY;
-    :: i != 0 -> goto CONSUME;
-    fi
-    }
-  od
-REPLAY:
-  do
-  :: atomic {
-    // nondeterministically select a random value from the storage buffer
-    int am;
-    select(am : 0 .. len(gadget_mem)-1);
-    do
-    :: am != 0 ->
-      am = am-1;
-      gadget_mem ? b_0, b_1, b_2 -> gadget_mem ! b_0, b_1, b_2;
-    :: am == 0 ->
-      gadget_mem ? b_0, b_1, b_2 -> cn ! b_0, b_1, b_2;
-      break;
-    od
-    }
-  // doesn't need to use all messages on the channel
-  :: atomic {gadget_mem ? b_0, b_1, b_2; }
-  // once mem has no more messages, we're done
-  :: empty(gadget_mem) -> goto BREAK;
-  od
-BREAK:
-}
-\end{lstlisting}
-\end{figure}
-
-\begin{figure}[h]
-\begin{lstlisting}[caption={Example reordering attacker model gadget with the selected replay limit as 3, targetting channel "cn"}, label={lst:korg_reordering}]
-chan cn = [8] of { int, int, int }; 
-
-chan gadget_mem = [3] of { int, int, int };
-active proctype attacker_reordering() priority 255 {
-byte b_0, b_1, b_2, blocker;
-int i = 3;
-INIT:
-do
-  // arbitrarily choose a message to start consuming on
-  :: {
-      blocker = len(cn);
-      do
-      :: b != len(c) -> goto INIT;
-      od
-    }
-  :: goto CONSUME;
-od
-CONSUME:
-do
-  // consume messages with high priority
-  :: c ? [b_0] -> atomic {
-    c ? b_0 -> gadget_mem ! b_0;
-    i--;
-    if 
-    :: i == 0 -> goto REPLAY;
-    :: i != 0 -> goto CONSUME;
-    fi
-  }
-od
-REPLAY:
-  do
-  // replay messages back onto the channel, also with priority
-  :: atomic {
-    int am;
-    select(am : 0 .. len(gadget_mem)-1);
-    do
-    :: am != 0 ->
-      am = am-1;
-      gadget_mem ? b_0 -> attacker_mem_0 ! b_0;
-    :: am == 0 ->
-      gadget_mem ? b_0 -> c ! b_0;
-      break;
-    od
-    }
-  :: atomic { empty(gadget_mem) -> goto BREAK; }
-  od
-BREAK:
-}
-
-\end{lstlisting}
-\end{figure}
-
-\begin{figure}[h]
-\begin{lstlisting}[caption={Example I/O file targetting channel "cn"}, label={lst:io-file}]
-cn:
-	I:
-	O:1-1-1, 1-2-3, 3-4-5
-\end{lstlisting}
-
-\begin{lstlisting}[caption={Example gadget synthesized from an I/O file targetting the channel "cn"}, label={lst:io-file-synth}]
-chan cn = [8] of { int, int, int };
-
-active proctype daisy() {
-INIT:
-  do
-  :: cn ! 1,1,1;
-  :: cn ! 1,2,3;
-  :: cn ! 3,4,5;
-  :: goto RECOVERY;
-  od
-RECOVERY:
-}
-\end{lstlisting}
-\end{figure}

--- a/sections/proofs.tex
+++ b/sections/proofs.tex
@@ -74,8 +74,7 @@ In both structures, a run is an infinite sequence of states connected by transit
 \end{itemize}

 An accepting run in the \ba visits states in \( F \) infinitely often. Similarly, an accepting run in the Process visits states labeled with \( p \) infinitely often. Since \( F = \{ s \in S \mid p \in L(s) \} \), the acceptance conditions are preserved under the mappings.
-  
-\end{proof}
+  \end{proof}

 \begin{definition}[Threat Model]
 A threat model is a tuple \( (P, (Q_i)_{i=0}^m, \phi) \) where:
@@ -104,10 +103,11 @@ A threat model is a tuple \( (P, (Q_i)_{i=0}^m, \phi) \) where:
  \[
    \left(BA_{P} \mid \mid BA_{\text{Daisy}(Q_0)} \mid \mid \ldots \mid \mid BA_{\text{Daisy}(Q_m)}\right) \subseteq BA_{\phi}
  \] 
+where rendezvous composition for I/O \ba is precisely the same as for I/O Kripke Automata; that is, input and output transitions are matched. It is easy to see that these composition operations are equivalent.
 \end{proof}

 \begin{theorem}
-  Checking whether there exists an attacker for a given threat model, the R-$\exists$ASP problem as proposed in \cite{Hippel2022_anonym}, is PSPACE-complete.
+  Checking whether there exists an attacker for a given threat model, the R-$\exists$ASP problem as proposed in \cite{Hippel2022_anonym}, is in PSPACE.
 \end{theorem}

 \begin{proof}
--- a/sections/related_work.tex
+++ b/sections/related_work.tex
@@ -0,0 +1,3 @@
+\textbf{Similar Tools}. Several formal methods tools reason about attackers on secure protocols, primarily in the cryptographic context: ProVerif, VerifPal, Tamarin, and Scyther are \textit{symbolic} and abstract away cryptographic primitives as terms \cite{Kobeissi_Nicolas_Tiwari, Proverif, Tamarin, Cremers}, while CryptoVerif and EasyCrypt are \textit{computational} and reason about game-based cryptographic security proofs \cite{Blanchet_Jacomme, Pereira}. For a general overview, see \cite{ParnoSOK, Basin_Cremers_Meadows_2018}. Before \korg, model checker-based approaches for reasoning about secure protocols typically employed \spin or TLA+ and only reasoned about correctness \cite{Khan_Mukund_Suresh_2005, Clarke_Wang, wayne_adversaries, Narayana_Chen_Zhao_Chen_Fu_Zhou_2006, Delzanno_Tatarek_Traverso_2014}.
+
+\textbf{Reasoning About Channels}. There is a long history of using formal methods tools in an ad hoc manner to reason about on-channel attackers, particularly in the context of Byzantine protocols \cite{Wilcox_Woos_Panchekha_Tatlock_Wang_Ernst_Anderson, Castro_Liskov_2002, Delzanno_Tatarek_Traverso_2014}. Formal methods tools have also been applied to reason about message tampering \cite{Henda}, delays \cite{Ginesin}, and congestion control \cite{TCPwn}.