notes/nondeterm-spec.txt

   1 Modification to the current specifications.
   2
   3 I. Order
   4 -- Sequential order (SO): Some total order that is consistent with the union of
   5 happens-before and SC relation.
   6
   7 II. State
   8 Previously, we keep a sequential state throughout the process of executing the
   9 sequential history. In our model, we keep a state local to each method call.
  10 Conceptually, this state is the accumulative side effects of the subset of
  11 method calls that happen before the current method call.
  12
  13 To evaluate the state of each method call, an obvious approach is to
  14 execute the subset of methods that happen before the current method in the
  15 sequential order from the initial state. A optimization we can make is that we
  16 start to evaluate the state from the most recent deviding node which every other
  17 node in that subset is either hb before or after.
  18
  19 III. Specifications
  20 Our specification language supports using the following primitives to access the
  21 state of method calls so that users can use those to write specifications with
  22 different level of tightness.
  23
  24 To support tighter specifications, we introduce the concept of concurrent set of
  25 method calls, meaning that for a specific method call, it can basically see the
  26 effect of two different categories of method calls --- one that happens before
  27 it, and one that concurrently execute with it. It is worth noting that when two
  28 two method calls execute concurrently, in general there can be the following two
  29 effects: 1) those concurrent methods can happen in either order, and the final
  30 result remains the same. A concurrent FIFO is an example, in which concurrent
  31 enqueue and dequeue methods can happen in a random order; and 2) the order of
  32 those concurrent methods will affect the final result. The C/C++11 atomics is an
  33 example, in which when concurrent stores to the same location execute in
  34 different order, a later store will have different result.
  35
  36 1. CONCURRENT: This primitive extracts all the methods that executes
  37 "concurrently" with the current method --- neither hb/SC before nor after the
  38 current method --- and returns as a set. It is worth noting that in order to
  39 preserve composability, when evaluating the state, the concurrent method calls'
  40 information cannot be used. That is to say, the concurrent set of mehtods are
  41 only used for assertions.
  42
  43 2. PREV: This primitive extracts all the methods that execute right before the
  44 current method in the execution graph --- most recent method calls that are
  45 hb/SC before the current method call --- and returns as a set. For each method
  46 in this set, the current method's specification can access their state.
  47
  48 3. NEXT: This primitive extracts all the methods that execute right after the
  49 current method in the execution graph, and returns as a set. For each method in
  50 this set, the current method's specification CANNOT access their state (for
  51 preserving composability).
  52
  53 // FIXME: This might break composability!!??!!
  54 4. FINAL: This is the function that allows users to specify some final check
  55 on the state. This will enable to users to use the graph model (the relaxed
  56 atomics can be specified) although the complxity of using that can get quite
  57 complex.
  58
  59 Our specifications allow two ways of evaluating the state of method calls. One
  60 way is to define "@Transition" on method calls, and then our checker executes
  61 the method calls that are hb/SC before me in the sequential order starting from
  62 the initial state. The other way is to define "@EvaluateState" after
  63 "@Transition", which can only access the states of method calls that are hb/SC
  64 before it.  Usually users calculate the state by using the PREV primitive to
  65 access the state of previous method calls.
  66
  67 IV. Examples
  68
  69 /// Global specification
  70 @State: // Declare the state structure
  71 @Initial: // How do we initialize the state
  72 @Commutativity: Method1 <-> Method2 (Guard) // Guard can only access the return
  73                 // value and the arguments of the two method calls
  74
  75 /// Interface specification
  76 @Interface: InterfaceName // Required; a label to represent the interface
  77 @Transition: // Optional; the transitional side effect from the initial state to
  78                          // the current method call by executing such effect on method calls
  79                          // that are hb/SC before me in sequential order. When this field is
  80                          // omitted, the current state seen before checking is the same as
  81                          // the initial state.
  82 @EvaluateState: // Optional; to calculate the state before this
  83                                 // method call is checked in a customized fashion. This is
  84                                 // evaluated after "@Transition". If omitted, the state we would
  85                                 // see is the effect after the "@Transition".
  86 @PreCondition: // Optional; checking code
  87 @SideEffect: // Optional; to calculate the side effect this method call
  88                                   // have on the state after the "@PreCondition".
  89 @PostCondition: // Optional; checking code
  90
  91
  92 /// Convienient operations
  93 We define a struct called MethodCall to represent the data we would collect and
  94 communicate between the real execution and the checking process.
  95
  96
  97 STATE(field)    // A primitive to access the current state in the interface
  98                                 // specification by the name of the field. We can use this as a
  99                                 // lvalue & rvalue ==> "STATE(x) = 1" and "int i = STATE(x)" are
 100                                 // both okay
 101
 102                                 // This can also be used to refer to the state of a method item
 103                                 // in the Subset operation (see Subset)
 104
 105 NAME                    // Used to refer to the name of a method item in the Subset
 106                                 // operation (see Subset)
 107
 108 RET(type)               // Used to refer to the return value of a method item in the
 109                                 // Subset operation (see Subset)
 110
 111 ARG(type, arg)  // Used to refer to the argument of a method item in the
 112                                 // Subset operation (see Subset)
 113
 114
 115 PREV            // Represent the set of previous method calls of the current method
 116                         // call
 117
 118 CONCURRENT      // Represent the set of concurrent method calls of the current
 119                                 // method call
 120
 121 NEXT            // Represent the set of next method calls of the current method
 122                         // call
 123
 124 Name(method)            // The interface label name of the specific method
 125
 126 State(method, field)            // A primitive to access a specific state field of a
 127                                                         // specific method. Also used as lvalue and rvalue
 128
 129 Ret(method, type)               // The return value of the specific method; type should
 130                                                 // be the name of the interface label
 131
 132 Arg(method, type, arg)  // The arguement by name of the specific method; type should
 133                                                 // be the name of the interface label, and arg should be
 134                                                 // the name of the argument
 135
 136 Prev(method)            // Represent the set of previous method calls of the
 137                                         // specific method call
 138
 139 Next(method)            // Represent the set of next method calls of the specific
 140                                         // method call
 141
 142 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 143 ForEach(item, container) { ... }        // Useful iteration primitive
 144
 145 NewMethodSet            // Create a new method set (set<MethodCall*>*)
 146
 147 MakeSet(NewSetType, oldset, newset, field);             //
 148
 149 Subset(set, condition)  // A generic subset operation that takes a condition and
 150                                                 // returns all the item for which the boolean guard
 151                                                 // holds. The condition can be specified with GUARD or
 152                                                 // GeneralGuard shown as follow.
 153
 154 ITEM    // Used to refer to the set item in the GeneralGuard shown below
 155
 156 GeneralGuard(type, expression)  // Used as a condition for any set<T> type. We
 157                                                                 // use "ITEM" to refer to a set item. For
 158                                                                 // example, a subset of positive elements on an
 159                                                                 // IntSet "s" would be
 160                                                                 // "Subset(s, GeneralGuard(int, ITEM > 0))"
 161
 162 Guard(expression)       // Used specifically for MethodSet(set<MethodCall*>). An
 163                                         // example to extract the subset of method calls in the PREV
 164                                         // set that is a "Store" method and whose state "x" is equal
 165                                         // to "val" and the return value is 5 would be
 166                                         // "Subset(PREV, Guard(STATE(x) == val && NAME == "Store" &&
 167                                         // RET(Store) == 5))"
 168
 169 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 170 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 171
 172
 173 To make things easier, we have the following helper macros.
 174 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 175 /**
 176         This is a common macro that is used as a constant for the name of specific
 177         variables. We basically have two usage scenario:
 178         1. In Subset operation, we allow users to specify a condition to extract a
 179         subset. In that condition expression, we provide NAME, RET(type), ARG(type,
 180         field) and LOCAL(field) to access each item's (method call) information.
 181         2. In general specification (in pre- & post- conditions and side effects),
 182         we would automatically generate an assignment that assign the current
 183         MethodCall* pointer to a variable namedd _M. With this, when we access the
 184         local state of the current method, we use LOCAL(field) (this is a reference
 185         for read/write).
 186 */
 187 #define ITEM _M
 188 #define _M ME
 189
 190 #define CAT(a, b) CAT_HELPER(a, b) /* Concatenate two symbols for macros! */
 191 #define CAT_HELPER(a, b) a ## b
 192 #define X(name) CAT(__##name, __LINE__) /* unique variable */
 193
 194 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 195
 196 *****************************************************************************
 197 typedef struct MethodCall {
 198         string name; // The interface label name
 199         void *value; // The pointer that points to the struct that have the return
 200                                  // value and the arguments
 201         void *state; // The pointer that points to the struct that represents
 202                                           // the state
 203         set<MethodCall*> *prev; // Method calls that are hb right before me
 204         set<MethodCall*> *concurrent; // Method calls that are concurrent with me
 205         set<MethodCall*> *next; // Method calls that are hb right after me
 206 } MethodCall;
 207
 208 typedef MethodCall *Method;
 209 typedef set<Method> *MethodSet;
 210
 211 typedef vector<int> IntVector;
 212 typedef list<int> IntList;
 213 typedef set<int> IntSet;
 214 *****************************************************************************
 215
 216 We will automatically generate two types of struct. One is the state struct, and
 217 we will call it StateStruct. The other one is a per interface struct, and it
 218 wraps the return value (RET) and the arguments as its field. We will name those
 219 struct with the interface label name.
 220
 221 -----------------------------------------------------------------------------
 222 For example, if we declare an ordered integer list in the specification state ,
 223 we will generate a struct as follow.
 224
 225 @State: IntList *queue;
 226 @Initial: queue = new IntList;
 227 @...
 228
 229 ===>
 230 typedef struct StateStruct {
 231         IntList *queue;
 232 } StateStruct;
 233
 234 -----------------------------------------------------------------------------
 235 If we have two interface whose specifications are as follow, we would generate
 236 two struct accordingly.
 237
 238 @Interface: Store
 239 @...
 240 void store(atomic_int *loc, int val) ...
 241
 242 @Interface: Load
 243 @...
 244 int load(atomic_int *loc) ...
 245
 246 ===>
 247 typedef struct Store {
 248         atomic_int *loc;
 249         int val;
 250 } Store;
 251
 252 typedef struct Load {
 253         int RET;
 254         atomic_int *loc;
 255 } Load;
 256
 257 -----------------------------------------------------------------------------
 258 We will put all these generated struct in a automatically-generated header file
 259 called "generated.h". This file also includes header files that have commonly
 260 used data types that interface (return value and arguments) accesses. In order
 261 to make things work, users should include "generated.h" file in the end for
 262 using our specification checker.
 263
 264 *****************************************************************************
 265
 266
 267 // Some nice C/C++ macro definition to make specifications easier to write
 268
 269 -----------------------------------------------------------------------------
 270 #define CAT(a, b) CAT_HELPER(a, b) /* Concatenate two symbols for macros! */
 271 #define CAT_HELPER(a, b) a ## b
 272 #define X(name) CAT(__##name, __LINE__) /* unique variable */
 273
 274
 275
 276 -----------------------------------------------------------------------------
 277 1. ForEach  --> to iterate a set, list or vector (containers that support using
 278 iterator to iterate)
 279
 280 #define ForEach(item, container) \
 281         auto X(_container) = (container); \
 282         auto X(iter) = X(_container)->begin(); \
 283         for (auto item = *X(iter); X(iter) != X(_container)->end(); item = ((++X(iter)) != \
 284                 X(_container)->end()) ? *X(iter) : 0)
 285
 286 -----------------------------------------------------------------------------
 287
 288 /**
 289         This is a common macro that is used as a constant for the name of specific
 290         variables. We basically have two usage scenario:
 291         1. In Subset operation, we allow users to specify a condition to extract a
 292         subset. In that condition expression, we provide NAME, RET(type), ARG(type,
 293         field) and LOCAL(field) to access each item's (method call) information.
 294         2. In general specification (in pre- & post- conditions and side effects),
 295         we would automatically generate an assignment that assign the current
 296         MethodCall* pointer to a variable namedd _M. With this, when we access the
 297         local state of the current method, we use LOCAL(field) (this is a reference
 298         for read/write).
 299 */
 300 #define ITEM _M
 301 #define _M ME
 302
 303
 304 -----------------------------------------------------------------------------
 305
 306 // 1.1 Subset(set, guard)  --> to get a subset of method calls by a boolean
 307 // expression; This takes advantage of C++11 std::function features and C macros.
 308
 309 // 1.2 Size(set) --> to get the size of a method set
 310 #define Size(set) set->size()
 311
 312 // 1.3 Belong(set, method) --> whether method belongs to set
 313 #define Belong(set, method) std::find(set->begin(), set->end(), method) != set->end()
 314
 315 // 1.4 Intersect(set1, set2) --> the intersection of two method sets
 316 inline MethodSet Intersect(MethodSet set1, MethodSet set2) {
 317         MethodSet res = NewSet;
 318         ForEach (m, set1) {
 319                 if (Belong(set2, m))
 320                         res->push_back(m);
 321         }
 322         return res;
 323 }
 324
 325 // 1.5 Union(set1, set2) --> the union of two method sets
 326 inline MethodSet Union(MethodSet set1, MethodSet set2) {
 327         MethodSet res = NewSet(set1);
 328         ForEach (m, set2) {
 329                 if (!Belong(set1, m))
 330                         res->push_back(m);
 331         }
 332         return res;
 333 }
 334
 335 // 1.6 Insert(set, method) --> add a method to the set
 336 inline bool Insert(MethodSet set, Method m) {
 337         if (Belong(set, m))
 338                 return false;
 339         else {
 340                 set->push_back(m);
 341                 return true;
 342         }
 343 }
 344
 345 // 1.7 Subtract(set1, set2) --> subtract set2 from set1
 346 inline MethodSet Subtract(MethodSet set1, MethodSet set2) {
 347         MethodSet res = NewSet;
 348         ForEach (m, set1) {
 349                 if (!Belong(set2, m))
 350                         res->push_back(m);
 351         }
 352         return res;
 353 }
 354
 355 // 1.8 MakeSet(count, ...) --> Make a set from the arguments
 356
 357
 358 // 2. Local(method, field)
 359 #define Local(method, field) ((StateStruct*) method->localState)->field
 360
 361 // 3. Value(method, type, field)
 362 #define Value(method, type, field) ((type*) method->value)->field
 363 3.1 Return
 364 #define Ret(method, type) Value(method, type, RET)
 365 3.2 Arguments
 366 #define Arg(method, type, arg) Value(method, type, arg)
 367
 368
 369 // 4. Name(mehtod)
 370 #defien Lable(method) method->interfaceName
 371
 372 // 5. Prev(method)
 373 #define Prev(method) mehtod->prev
 374
 375 // 6. Next(method)
 376 #define Next(method) mehtod->next
 377
 378 // 7. Concurrent(method)
 379 #define Concurrent(method) mehtod->concurrent
 380
 381
 382 /// Ordering point specification
 383 @OPDefine: condition    // If the specified condition satisfied, the atomic
 384                                                 // operation right before is an ordering point
 385
 386 @PotentialOP(Label): condition  // If the specified condition satisfied, the
 387                                                                 // atomic operation right before is a potential
 388                                                                 // ordering point, and we label it with a tag
 389
 390 @OPCheck(Label): condition      // If the specified condition satisfied, the
 391                                                         // potential ordering point defined earlier with the
 392                                                         // same tag becomes an ordering point
 393
 394 @OPClear: condition             // If the specified condition satisfied, all the
 395                                                 // ordering points and potential ordering points will be
 396                                                 // cleared
 397
 398 @OPClearDefine: condition       // If the specified condition satisfied, all the
 399                                                         // ordering points and potential ordering points will
 400                                                         // be cleared, and the atomic operation right before
 401                                                         // becomes an ordering point. This is a syntax sugar
 402                                                         // as the combination of an "OPClear" and "OPDefine"
 403                                                         // statement
 404
 405
 406 1. The register examples: Basically, we can think of registers as the cache on a
 407 memory system. The effect of a load or store basically tells us what the current
 408 value in the cache line is, and a load can read from values that can be
 409 potentially in the cache --- either one of the concurrent store update the cache
 410 or it inherites one of the the previous local state in the execution graph.
 411
 412 ----------   Interfae   ----------
 413 void init(atomic_int &loc, int initial);
 414 int load(atomic_int &loc);
 415 void store(atomic_int &loc, int val);
 416 ----------   Interfae   ----------
 417
 418 a. The SC atomics --- the classic linearizability approach
 419
 420 b. The RA (release/acquire) C/C++ atomics
 421 // For RA atomics, a load must read its value from a store that happens before
 422 // it.
 423 ----------   Specification   ----------
 424 @DeclareVar: int x;
 425 @InitVar: x = 0;
 426
 427 @Interface: Store
 428 @SideEffect: LOCAL(x) = val;
 429 void store(int *loc, int val);
 430
 431 @Interface: Load
 432 @PreCondition:
 433         Size(Subset(PREV, LOCAL(x) == RET)) > 0;
 434 @SideEffect: LOCAL(x) = RET;
 435 int load(int *loc);
 436
 437 c. The C/C++ atomics (a slightly loose specification)
 438 // Actually, any concurrent data structures that rely modification-order to be
 439 // correct would not have a precicely tight specification under our model, and
 440 // C/C++ relaxed atomics is an example. See the following read-read coherence
 441 // example.
 442
 443 // T1                           // T2
 444 x = 1;                          x = 2;
 445
 446 // T3
 447 r1 = x; // r1 == 1
 448 r2 = x; // r2 == 2
 449 r3 = x; // r3 == 1
 450
 451 Our model cannot prevent such a case from happening. However, we can still have
 452 a slightly loose specification which basically states that a load can read from
 453 any store that either immediately happens before it or concurrently executes.
 454
 455
 456 ----------   Specification   ----------
 457 @DeclareVar: int x;
 458 @InitVar: x = 0;
 459
 460 @Interface: Store
 461 @SideEffect: LOCAL(x) = val;
 462 void store(int *loc, int val);
 463
 464
 465 @Interface: Load
 466 @PreCondition:
 467         // Auto generated code
 468         // MethodCall *ME = ((SomeTypeConversion) info)->method;
 469
 470         int count = Size(Subset(Prev, LOCAL(x) == RET))
 471                 + Size(Subset(CONCURRENT, NAME == "Store" && ARG(Store, val) == RET))
 472         return count > 0;
 473 @SideEffect: LOCAL(x) = RET;
 474 int load(int *loc);
 475
 476 d. The C/C++ normal memory accesses
 477 - Use the admissibility requirement, then the classic linearizability approach
 478 on the admissible executions
 479
 480 2. The FIFO queue example.
 481 ----------   Specification   ----------
 482 // A FIFO queue should have the following properties held:
 483 // 1. The enq() methods should conflict
 484 // 2. The deq() methods that succeed should conflict
 485 // 3. Corresponding enq() and deq() methods should conflict
 486 // 4. An enqueued item can be dequeued by at most once
 487 // 5. A dequeued item must have a corresponding enqueued item
 488 // 6. When a queue is NOT "empty" (users can tightly or loosely define
 489 // emptiness), and there comes a deq() method, the deq() method should succeed
 490
 491
 492 @DeclareVar: vector<int> *q;
 493 @InitVar: q = new voctor<int>;
 494 @Copy: New.q = new vector<int>(Old.q);
 495 // Fails to dequeue
 496 @Commutativity: Deq <-> Deq (!_M1.RET || !_M2.RET)
 497 // The dequeuer doesn't dequeue from that enqueuer
 498 @Commutativity: Enq <-> Deq (!_M2.RET || (_M2.RET && Enq.val != *Deq.res))
 499
 500 @Interface: Enq
 501 @SideEffect: q->push_back(val);
 502 void enq(queue_t *q, int val);
 503
 504 @Interface: Deq
 505 @PreCondition:
 506         // Check whether the queue is really empty
 507         // Either the local state is an empty queue, or for all those remaining
 508         // elements in the local queue, there should be some concurrent dequeuers to
 509         // dequeue them
 510         if (!RET) {
 511                 // Local state is empty
 512                 if (Local(q)->size() == 0) return true;
 513                 // Otherwise check there must be other concurrent dequeuers
 514                 ForEach (item, Local(q)) {
 515                         // Check there's some concurrent dequeuer for this item
 516                         if (Size(Subset(CONCURRENT, NAME == "Deq" && RET(Deq) &&
 517                                 *ARG(Deq, res) == item)) == 0) return false;
 518                 }
 519                 return true;
 520         } else { // Check the global queue state
 521                 return q->back() == *res;
 522         }
 523 @SideEffect:
 524         if (RET) q->pop_back();
 525 bool deq(queue_t *q, int *res);
 526
 527
 528
 529 A good example to simulate a queue data structure is as follows.
 530 Suppose we have a special structure
 531 typedef struct Q {
 532         atomic_int x;
 533         atomic_int y;
 534 } Q;
 535
 536 , and we have two interface on Q, read() and write(), where the write and read
 537 method calls are synchronized by themselves, and they have to read and write the
 538 x and y fields in turn.
 539
 540
 541
 542
 543
 544
 545
 546
 547
 548
 549
 550
 551
 552
 553
 554
 555
 556
 557
 558 ----------------------------------------------------------------------------------------
 559 We also need to think about the non-ordered queue.
 560
 561 ####
 562 Combiming Data Structures ---
 563 For example, a queue, a -hb->c, b -hb-> e.
 564
 565 // T1
 566 enq(1) -> {1} - {$}    // a
 567 enq(2) -> {1, 2} - {$}   // b
 568
 569 // T2
 570 deq(1) -> {$} - {1}   // c
 571 deq($) -> {$} - {1}   // d
 572
 573 // State before this method
 574 JOIN ({1, 2} - {$}, {$} - {1}) = {2} - {1}
 575 deq(2) -> {$} - {1, 2}
 576
 577
 578
 579
 580