
%!TEX root = ../main.tex

In this section we describe two case studies: the Transmission Control Protocol (TCP), a data transfer protocol, and Raft, a state machine replication protocol.

\subsection{TCP}%
\label{sub:TCP}

TCP is a transport-layer protocol designed to establish reliable, ordered communication between two peers. It is ubiquitous on today's Internet and has therefore been the subject of numerous formal verification efforts \cite{Cluzel_Georgiou_Moy_Zeller_2021, Smith_1997, Pacheco2022}, including efforts using \promela and \spin \cite{Pacheco2022}.
%A previous version of \korg has been applied TCP in \cite{Pacheco2022, Hippel2022};
%in particular, we study our \korg extensions using the hand-written TCP \promela model from \cite{Pacheco2022}.
We constructed a TCP \promela model based on the TCP RFCs.
For our analysis, we borrow the four LTL properties used in \cite{Pacheco2022}, stated informally below:
%we study our \korg extensions using the \promela models from Pacheco et al., which includes a "gold" model whose underlying state machine is derived via an NLP-based algorithm applied to the SCTP RFC \cite{rfc9260} and a "canonical" model hand-written by domain experts \cite{Pacheco2022}.
\[
\begin{aligned}
\phi_1 &= \text{\parbox[t]{20em}{No half-open connections.}} \\
\phi_2 &= \text{\parbox[t]{20em}{Passive/active establishment eventually succeeds.}} \\
\phi_3 &= \text{\parbox[t]{20em}{Peers don't get stuck.}} \\
\phi_4 &= \text{\parbox[t]{20em}{\texttt{SYN\_RECEIVED} is eventually followed by \texttt{ESTABLISHED}, \texttt{FIN\_WAIT\_1}, or \texttt{CLOSED}.}}
\end{aligned}
\]
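For illustration, one plausible LTL reading of $\phi_4$ is sketched below. This is a rendering under stated assumptions rather than the exact formula of \cite{Pacheco2022}: the symbol $s_p$, denoting the current TCP state of a peer $p$, is hypothetical notation introduced here.
\[
\begin{aligned}
\mathbf{G}\bigl( & s_p = \text{\texttt{SYN\_RECEIVED}} \rightarrow {} \\
 & \mathbf{F}( s_p = \text{\texttt{ESTABLISHED}} \lor s_p = \text{\texttt{FIN\_WAIT\_1}} \lor s_p = \text{\texttt{CLOSED}} ) \bigr)
\end{aligned}
\]
Here $\mathbf{G}$ (``always'') and $\mathbf{F}$ (``eventually'') are the standard LTL temporal operators.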

We evaluated the TCP \promela model against \korg's drop, replay, and reordering attacker models on a single unidirectional communication channel. The attacks discovered are summarized in Figure~\ref{res:tcp-table}.

%Evaluating the canonical TCP model using \korg led us to identify edge-cases in the connection establishment routine that weren't accounted for, leading us to construct a "revised" TCP model accounting for these missing edge cases.


\begin{figure}[h!]
\centering
\begin{scriptsize}
\begin{tabular}{|c|c|c|c|}
\hline
                & Drop Attacker & Replay Attacker & Reorder Attacker\\\hline
$\phi_1$  &                          &                          &\\
$\phi_2$  &                      x & x                       &  \\
$\phi_3$  &                         &                          &\\
$\phi_4$  &                         &                           &\\
\hline
\end{tabular}
\end{scriptsize}

\caption{Automatically discovered attacks against
%the hand-written TCP model from Pacheco et al.  and our own,
our TCP model for $\phi_1$ through $\phi_4$. An ``x'' indicates that an attack was discovered; an empty cell indicates that \korg proved the absence of an attack via an exhaustive search. These experiments were run on a laptop with an eighth-generation Intel Core i7 processor and 16\,GB of memory. Full attack traces are available in the artifact.}
\label{res:tcp-table}
\end{figure}

\begin{comment}
\begin{figure}[h!]
\centering
\begin{scriptsize}
\begin{tabular}{|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|@{}c@{}|}
\hline
& \multicolumn{3}{c|}{\footnotesize \raisebox{-0.15ex}{Drop Attacker} } & \multicolumn{3}{c|}{\footnotesize \raisebox{-0.15ex}{Replay Attacker}} & \multicolumn{3}{c|}{\footnotesize \raisebox{-0.15ex}{Reorder Attacker}} \\
\hline
& \: Gold \: & \: Expert \: & \: Revised \: & \: Gold \: & \: Expert \: & \: Revised \: & \: Gold \: & \: Expert \: & \: Revised \: \\
\hline
$\phi_1$ & \rule{0pt}{8pt} & & & & & & & & \\
$\phi_2$ & \rule{0pt}{8pt} & x & x & & x & x & & x & \\
$\phi_3$ & \rule{0pt}{8pt} & & & & & & & & \\
$\phi_4$ & \rule{0pt}{8pt} x & & & & & & x & & \\
\hline
\end{tabular}
\end{scriptsize}

\label{res:tcp-table}
\caption{Automatically discovered attacks against the gold, canonical (labeled "expert"), and revised TCP models for $\phi_1$ through $\phi_4$. "x" indicates an attack was discovered, and no "x" indicates \korg proved the absence of an attack via an exhaustive search. Full attack traces are available in the artifact.}
\end{figure}

\end{comment}

\subsection{Raft}%
\label{sub:Raft}
Raft is a consensus algorithm designed to replicate a state machine across distributed peers. It is widely used in distributed databases, key-value stores, distributed file systems, distributed load balancers, and container orchestration. Previous verification efforts for Raft, using both constructive, mechanized proving techniques \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016, Wilcox_Woos_Panchekha_Tatlock_Wang_Ernst_Anderson, Ongaro} and automated verification \cite{Ongaro}, have reasoned about the protocol under certain assumptions about the stability of the communication channels. However, no previous approach to Raft verification has reasoned about a coordinated, arbitrary on-channel attacker \textit{external} to the protocol itself. Uniquely, \korg enables us to study Raft in this context.

Referencing the original Raft thesis \cite{Ongaro} and other Raft models \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016}, we constructed a \promela model of the Raft protocol. We also derived and formalized the following properties, which our \promela model satisfies:
\[
\begin{aligned}
\phi_1 &= \text{\parbox[t]{20em}{No two servers can be leaders in the same term.}} \\
\phi_2 &= \text{\parbox[t]{20em}{Entries committed to the log at the same index must be equivalent.}} \\
\phi_3 &= \text{\parbox[t]{20em}{Only leaders may append entries to the log.}} \\
\phi_4 &= \text{\parbox[t]{20em}{If a leader commits at an index, any server that becomes leader afterwards must follow that commit.}} \\
\phi_5 &= \text{\parbox[t]{20em}{If any two servers commit the same log entry, the log entry at the previous index must be equivalent.}}
\end{aligned}
\]
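For illustration, $\phi_1$ might be formalized in LTL along the following lines. This is a sketch and not necessarily the formula used in our model: $\mathit{leader}_i$ (server $i$ is currently in the leader state) and $\mathit{term}_i$ (the current term of server $i$) are hypothetical names introduced here.
\[
\mathbf{G} \bigwedge_{i \neq j} \neg \bigl( \mathit{leader}_i \land \mathit{leader}_j \land \mathit{term}_i = \mathit{term}_j \bigr)
\]
As written, this formula only rules out two servers being leaders of the same term simultaneously; excluding two leaders in the same term at different points in time would additionally require recording past leaders, for example in a history variable.
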
Our Raft model is parameterized so that it can be model-checked with an arbitrary number of peers. We also designed the model such that each peer maintains separate channels for receiving AppendEntries requests, AppendEntries responses, RequestVote requests, and RequestVote responses. This separation lets \korg target each message type independently. In particular, we study Raft in the presence of drop and replay attackers on all four aforementioned channel types, attacking both a minority and a majority of peers. A breakdown of our findings is shown in Figure~\ref{res:raft-table}.

\begin{figure}[h!]
\centering
\begin{scriptsize}
\begin{tabular}{|c|c|}
\hline
Scenario & Attack found? \\
\hline
Dropping AppendEntries messages & no \\
Dropping RequestVote messages & no \\
Replaying RequestVote messages & yes ($\phi_1, \phi_4$ violated) \\
Replaying AppendEntries messages & no \\
Dropping RequestVoteResponse messages & no \\
Dropping AppendEntriesResponse messages & no \\
\hline
\end{tabular}
\end{scriptsize}
\caption{Breakdown of the attacker scenarios assessed with \korg against our Raft \promela model. In all experiments, Raft was set to five peers and the drop/replay limits of the gadgets \korg synthesized were set to two. We conducted our experiments on a research computing cluster, allocating 250\,GB of memory to each verification run. The full models and attacker traces are included in the artifact.}
\label{res:raft-table}
%\caption{Automatically discovered attacks against
%the hand-written TCP model from Pacheco et al.  and our own,
%our TCP model for $\phi_1$ through $\phi_4$. "x" indicates an attack was discovered, and no "x" indicates \korg proved the absence of an attack via an exhaustive search. These experiments were ran on a laptop with an eighth generation i7 and 16gb of memory. Full attack traces are available in the artifact.}
\end{figure}
In our experiments, we found just one attack on our Raft \promela model, which violates election safety ($\phi_1$). In this scenario, Peer A and Peer B are both candidates for election. Peer A receives three votes (one from itself and two from other peers), and Peer B receives two votes (one from itself and one from another peer). The replay attacker simply replays the vote sent to Peer B, so that Peer B counts a third vote. Both Peer A and Peer B are then convinced they won the election and change their state to leader. As a consequence, leader completeness ($\phi_4$) is also violated.
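The arithmetic behind this trace can be made explicit. Assume, as in the Raft thesis \cite{Ongaro}, that a candidate wins once it has tallied votes from a strict majority of the $n$ peers, and write $q$ for that threshold and $v_A$, $v_B$ for the tallies of the two candidates (notation introduced here for illustration). With $n = 5$:
\[
\begin{aligned}
q &= \lfloor n/2 \rfloor + 1 = 3, \\
v_A &= 3 \geq q, \\
v_B &= 2 + 1 = 3 \geq q \quad \text{(one replayed vote)}.
\end{aligned}
\]
Both candidates therefore reach the threshold. If the tally counted distinct voters rather than received messages, the replayed vote would leave $v_B = 2 < q$.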

To be clear, this is not an attack on the Raft protocol in general, but rather an attack on our specific Raft model: the bug \korg exploits is that our model does not ensure that the votes a candidate receives come from unique peers\footnote{Naturally, this requires cryptography and therefore is challenging to express in the semantics of \promela.}. In general, the complete Raft protocol has been proven to resist drop and replay attackers \cite{Woos_Wilcox_Anton_Tatlock_Ernst_Anderson_2016}. This case study demonstrates \korg's ability to discover subtle bugs in protocol logic: our Raft model satisfies $\phi_1$ through $\phi_5$ assuming perfect channels, and \korg allowed us to reason precisely about the effect of imperfect, vulnerable channels.

% We note our analysis is in no