\mysection{Introduction}\label{sec:introduction}

Concurrent data structure design can improve scalability by supporting multiple simultaneous operations, reducing memory coherence traffic, and reducing the time taken by an individual data structure operation. Researchers have developed many concurrent data structure designs with these goals~\cite{rcu, lockfreequeue}. Concurrent data structures often use sophisticated techniques including low-level atomic instructions (e.g., compare and swap), careful reasoning about the order of loads and stores, and fine-grained locking. For example, while the standard Java hash table implementation can limit scalability to a handful of cores, more sophisticated concurrent hash tables can scale to many hundreds of cores~\cite{javaConcurrentHashMap}.

The C and C++ standards committees extended both languages with support for low-level atomic operations in the C11 and C++11 standards~\cite{cpp11spec, c11spec, boehmpldi} to enable developers to write portable implementations of concurrent data structures. To support the relaxations typically performed by compilers and processors, the C/C++ memory model provides weaker semantics than sequential consistency~\cite{scmemorymodel}, and as a result correctly using these operations is challenging. Developers must reason not only about potential interleavings, but also about how the processor and compiler might reorder memory operations. Even experts make subtle errors when reasoning about such memory models.

Researchers have developed tools for exploring the behavior of code under the C/C++ memory model, including \cdschecker~\cite{cdschecker}, {\sc Cppmem}~\cite{c11popl}, and Relacy~\cite{relacy}. While these tools are useful for exploring the executions that the memory model allows, they are difficult to use for testing because they provide no support (other than assertions) for specifying the behavior of data structures. Assertions are problematic because different interleavings and reorderings legitimately produce different behaviors, and it is very difficult to write assertions that check the output of a test case for an arbitrary (unknown) execution.

This paper presents \TOOL, a specification language and specification-checking tool designed to be used in conjunction with model-checking tools. We have implemented it as a plugin for the \cdschecker model checker.

\mysubsection{Background on Specifying the Correctness of Concurrent Data Structures}

Researchers have developed several techniques for specifying correctness properties of concurrent data structures written for strong memory models. While these techniques cannot handle the behaviors typically exhibited by relaxed data structure implementations, they provide insight into intuitive approaches to specifying concurrent data structure behavior.

One approach specifies the correctness of a concurrent data structure in terms of equivalent sequential executions of either the concurrent data structure or a simplified sequential version. The problem then becomes how to map a concurrent execution to an equivalent sequential execution. A common criterion is {\it linearizability}, which states that a concurrent operation can be viewed as taking effect at some point between its invocation and its return (or response)~\cite{linearizableref}.
An {\it equivalent sequential data structure} is a sequential version of a concurrent data structure that can be used to express correctness properties by relating executions of the original concurrent data structure to executions of the equivalent sequential data structure. The equivalent sequential data structure is often much simpler, and in many cases one can simply reuse an existing, well-tested implementation from the C++ Standard Template Library (STL). An execution {\it history} is a total order of method invocations and responses. A {\it sequential history} is one in which each invocation is immediately followed by its corresponding response. A concurrent execution is correct if its behavior is consistent with an equivalent sequential history replayed on the equivalent sequential data structure (the sketch at the end of this subsection illustrates this replay). A concurrent object is linearizable if for all executions:
\squishcount
\item Each method call appears to take effect instantaneously at some point between its invocation and response.
\item The invocations and responses can be reordered to yield a sequential history under the rule that an invocation cannot be reordered before the preceding responses.
\item The concurrent execution yields the same behavior as the sequential history.
\countend
A weaker variant of linearizability is {\it sequential consistency}\footnote{It is important to note that the term sequential consistency in the literature is applied both to the consistency model that data structures expose to clients and to the guarantees that the underlying memory system provides for load and store operations.}. Sequential consistency only requires that there exist a sequential history that is consistent with {\it program order} (the intra-thread order); this history does not need to be consistent with the real-time order in which the operations were issued.

Line-Up~\cite{lineup}, Paraglider~\cite{VechevMCLinear}, and VYRD~\cite{vyrd} leverage linearizability to test concurrent data structures. {\bf Unfortunately, efficient implementations of many common data structures, e.g., RCU~\cite{rcu}, MS Queue~\cite{lockfreequeue}, etc., for the C/C++ memory model are neither linearizable nor sequentially consistent! Thus previous tools cannot check such data structures under the C/C++ memory model.}
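As a minimal illustration of the replay described above (purely a sketch; the \code{Op} record and the \code{check\_history} helper are illustrative and not part of \TOOL), the values observed by a concurrent queue unit test can be validated by replaying a candidate sequential history on \code{std::queue}, which serves as the equivalent sequential data structure:

\begin{verbatim}
#include <queue>
#include <vector>

// One completed operation from the concurrent test:
// for an enqueue, value is the argument; for a dequeue,
// value is the result the concurrent execution observed.
enum class OpType { Enq, Deq };
struct Op { OpType type; int value; };

// Replays a candidate sequential history on std::queue
// (the equivalent sequential data structure) and checks
// that it reproduces the observed values.
bool check_history(const std::vector<Op> &history) {
  std::queue<int> ref;
  for (const Op &op : history) {
    if (op.type == OpType::Enq) {
      ref.push(op.value);
    } else {
      if (ref.empty() || ref.front() != op.value)
        return false;   // no sequential explanation
      ref.pop();
    }
  }
  return true;
}
\end{verbatim}

A concurrent execution is then judged correct if at least one sequential history satisfying the reordering rules above passes such a check.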
\mysubsection{New Challenges from the C/C++ Memory Model}
\label{sec:introNewChallenges}

The C/C++ memory model raises two key challenges that prevent the application of previous approaches to specifying concurrent data structures in this setting:
\squishcount
\item {\bf Relaxed Executions Break Existing Data Structure Consistency Models:} C/C++ data structures often expose clients to weaker (non-SC) behaviors to gain performance. A common guarantee is to provide happens-before synchronization between the operations that implement updates and the operations that read those updates. These data structures often do not guarantee that different threads observe updates in the same order; in other words, they may expose clients to consistency models weaker than sequential consistency. For example, even when one uses the relatively strong \code{acquire} and \code{release} memory orderings in C++, it is possible for two different threads to observe two stores in different orders, i.e., executions can exhibit the IRIW (independent reads of independent writes) behavior (see the IRIW sketch following this list). Thus many data structures legitimately admit executions for which there are no sequential histories that preserve program order.

Like many other relaxed memory models, the C/C++ memory model does not define a total order over all memory operations, further complicating the application of traditional correctness criteria; for example, linearizability cannot be applied. In particular, approaches that relate the behavior of a concurrent data structure to an analogous sequential data structure break down in the absence of a total ordering of the memory operations. While many dynamic tools~\cite{cdschecker,relacy} for exploring the behavior of code under relaxed models do, as a practical matter, print out an execution in some order, that order is to some degree arbitrary: relaxed memory models generally allow a data structure operation to see the effects of operations that appear later in any such order (e.g., a load can read from a store that appears later in the order). Instead of a total order, the C/C++ memory model is formulated as a graph of memory operations together with several partial orders defined over that graph.

\item {\bf Constraining Reorderings (Specifying Synchronization Properties):} Synchronization\footnote{Synchronization here is not mutual exclusion, but rather a lower-level property that captures which stores must be visible to a thread. In other words, it constrains which reorderings can be performed by a processor or compiler.} in C/C++ provides an ordering between memory operations to different locations. Concurrent data structures must establish synchronization or they potentially expose their users to highly non-intuitive behavior that is likely to break client code. For example, consider a concurrent queue that does not establish synchronization between enqueue and dequeue operations, and the following sequence of operations: (1) thread A initializes the fields of a new object X; (2) thread A enqueues the reference to X in such a queue; (3) thread B dequeues the reference to X; (4) thread B reads the fields of X through the dequeued reference (a code sketch of this scenario follows this list). In (4), thread B could fail to see the initializing writes from (1). This surprising behavior can occur because the compiler or CPU may reorder the initializing writes so that they execute after the enqueue operation. If the fields are non-atomic, the conflicting accesses constitute a data race, violating the data-race-freedom requirement of the C/C++ language standard, and thus the program has no semantics.

The C/C++ memory model formalizes synchronization in terms of a {\it happens-before} relation, a partial order over memory accesses. If memory access {\it x} happens before memory access {\it y}, then the effects of {\it x} must be ordered before the effects of {\it y}.
\countend
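The IRIW behavior mentioned in challenge (1) can be reproduced with two atomic variables and four threads. The following litmus-test sketch (illustrative code, not taken from any particular data structure) uses only \code{acquire}/\code{release} operations; the outcome noted in the final comment is permitted by the C/C++ memory model, although it would be forbidden if every access used \code{memory\_order\_seq\_cst}:

\begin{verbatim}
#include <atomic>
#include <thread>

std::atomic<int> x(0), y(0);
int r1, r2, r3, r4;

void w1() { x.store(1, std::memory_order_release); }
void w2() { y.store(1, std::memory_order_release); }
void rd1() {
  r1 = x.load(std::memory_order_acquire);
  r2 = y.load(std::memory_order_acquire);
}
void rd2() {
  r3 = y.load(std::memory_order_acquire);
  r4 = x.load(std::memory_order_acquire);
}

int main() {
  std::thread a(w1), b(w2), c(rd1), d(rd2);
  a.join(); b.join(); c.join(); d.join();
  // Allowed outcome: r1==1, r2==0, r3==1, r4==0.
  // The two reader threads observe the independent
  // stores to x and y in opposite orders, so no single
  // sequential history explains this execution.
  return 0;
}
\end{verbatim}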
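The queue scenario in challenge (2) can be written out as the following sketch. The single-slot ``queue'' is deliberately oversimplified and all names are illustrative; the point is that publishing the node with \code{memory\_order\_relaxed} creates no happens-before edge from the initializing write in thread A to the read in thread B:

\begin{verbatim}
#include <atomic>

struct Node { int field; };            // non-atomic payload

std::atomic<Node *> slot(nullptr);     // one-element "queue"

void enqueue(Node *n) {
  // Relaxed store: no release, so the earlier write to
  // n->field need not be visible to the dequeuing thread.
  slot.store(n, std::memory_order_relaxed);
}

Node *dequeue() {
  // Relaxed load: no acquire, so no happens-before edge.
  return slot.load(std::memory_order_relaxed);
}

void threadA() {
  Node *x = new Node;
  x->field = 42;                       // (1) initialize fields
  enqueue(x);                          // (2) publish X
}

void threadB() {
  Node *x = dequeue();                 // (3) obtain X
  if (x != nullptr) {
    int r = x->field;                  // (4) data race: may not
    (void)r;                           //     observe the write in (1)
  }
}
\end{verbatim}

Changing the store in \code{enqueue} to \code{memory\_order\_release} and the load in \code{dequeue} to \code{memory\_order\_acquire} establishes the required happens-before edge; this is the kind of synchronization requirement that our specification language makes explicit and checkable.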
\begin{figure}
\centering
\vspace{-.3cm}
\includegraphics[scale=0.35]{figures/specworkflow}
\vspace{-.4cm}
\caption{\label{fig:specworkflow} \TOOL system overview}
\vspace{-.4cm}
\end{figure}

\mysubsection{Specification Language and Tool Support}

Figure~\ref{fig:specworkflow} presents an overview of the \TOOL system. After implementing a concurrent data structure, developers annotate their code with a \TOOL specification. To test their implementation, developers compile the data structure with the \TOOL specification compiler, which extracts the specification and generates a program instrumented with specification checks. Then, developers compile the instrumented program with a standard C/C++ compiler. Finally, developers run the resulting binary under the \TOOL checker. \TOOL then exhaustively explores the behaviors of the given unit test and generates diagnostic reports for any executions that violate the specification.

\mysubsection{Contributions}

This paper makes the following contributions:
\squishlist
\item {\bf Specification Language:} It introduces a specification language that enables developers to write simple specifications that capture the key correctness properties of concurrent data structures developed for a relaxed memory model. To our knowledge, our specification language is the first that supports concurrent data structures that use C/C++ atomic operations.
\item {\bf A Technique to Relate Concurrent Executions to Sequential Executions:} It presents an approach for ordering the memory operations of an execution under the C/C++ memory model, which lacks a notion of a trace and for which one generally cannot even construct a total order of atomic operations that is consistent with program order. The generated sequential execution therefore does not always preserve program order.
\item {\bf Synchronization Properties:} It presents (a) constructs for specifying the happens-before relations that a data structure should establish, and (b) tool support for checking these properties and exposing synchronization-related bugs.
\item {\bf Tool for Checking C/C++ Data Structures Against Specifications:} To our knowledge, \TOOL is the first tool that can check concurrent data structures that exhibit relaxed behaviors against specifications expressed in terms of intuitive sequential executions.
\item {\bf Evaluation:} It shows that the \TOOL specification language can express key correctness properties for a set of real-world concurrent data structures, that our tool can detect bugs, and that our tool can unit test real-world data structures with reasonable performance.
\squishend