folly/docs/Synchronized.md

   1 `folly/Synchronized.h`
   2 ----------------------
   3
   4 `folly/Synchronized.h` introduces a simple abstraction for mutex-
   5 based concurrency. It replaces convoluted, unwieldy, and just
   6 plain wrong code with simple constructs that are easy to get
   7 right and difficult to get wrong.
   8
   9 ### Motivation
  10
  11 Many of our multithreaded Thrift services (not to mention general
  12 concurrent C++ code) use shared data structures associated with
  13 locks. This follows the time-honored adage of mutex-based
  14 concurrency control "associate mutexes with data, not code".
  15 Examples are abundant and easy to find. For example:
  16
  17 ``` Cpp
  18
  19     class AdPublisherHandler : public AdPopulatorIf,
  20                                public fb303::FacebookBase,
  21                                public ZkBaseApplication {
  22       ...
  23       OnDemandUpdateIdMap adsToBeUpdated_;
  24       ReadWriteMutex adsToBeUpdatedLock_;
  25
  26       OnDemandUpdateIdMap limitsToBeUpdated_;
  27       ReadWriteMutex limitsToBeUpdatedLock_;
  28
  29       OnDemandUpdateIdMap campaignsToBeUpdated_;
  30       ReadWriteMutex campaignsToBeUpdatedLock_;
  31       ...
  32     };
  33 ```
  34
  35 Whenever the code needs to read or write some of the protected
  36 data, it acquires the mutex for reading or for reading and
  37 writing. For example:
  38
  39 ``` Cpp
  40     void AdPublisherHandler::requestUpdateAdId(const int64_t adId,
  41                                                const int32_t dbId) {
  42       checkDbHandlingStatus(dbId);
  43       RWGuard g(adsToBeUpdatedLock_, RW_WRITE);
  44       adsToBeUpdated_[dbId][adId] = 1;
  45       adPublisherMonitor_->addStatValue("request_adId_update", 1, dbId);
  46       LOG(INFO) << "received request to update ad id " << adId;
  47     }
  48 ```
  49
  50 The pattern is an absolute classic and present everywhere.
  51 However, it is inefficient, makes incorrect code easy to
  52 write, is prone to deadlocking, and is bulkier than it could
  53 otherwise be. To expand:
  54
  55 * In the code above, for example, the critical section is only
  56   the line right after `RWGuard`'s definition; it is frivolous
  57   that everything else (including a splurging `LOG(INFO)`) keeps
  58   the lock acquired for no good reason. This is because the
  59   locked regions are not visible; the guard's construction
  60   introduces a critical section as long as the remainder of the
  61   current scope.
  62 * The correctness of the technique is entirely predicated on
  63   convention. There is no ostensible error for code that:
  64
  65     * manipulates a piece of data without acquiring its lock first
  66     * acquires a different lock instead of the intended one
  67     * acquires a lock in read mode but modifies the guarded data structure
  68     * acquires a lock in read-write mode although it only has `const`
  69       access to the guarded data
  70     * acquires one lock when another lock is already held, which may
  71       lead to deadlocks if another thread acquires locks in the
  72       inverse order
  73
  74 ### Introduction to `folly/Synchronized.h`
  75
  76 The same code sample could be rewritten with `Synchronized`
  77 as follows:
  78
  79 ``` Cpp
  80     class AdPublisherHandler : public AdPopulatorIf,
  81                                public fb303::FacebookBase,
  82                                public ZkBaseApplication {
  83       ...
  84       Synchronized<OnDemandUpdateIdMap>
  85         adsToBeUpdated_,
  86         limitsToBeUpdated_,
  87         campaignsToBeUpdated_;
  88       ...
  89     };
  90
  91     void AdPublisherHandler::requestUpdateAdId(const int64_t adId,
  92                                                const int32_t dbId) {
  93       checkDbHandlingStatus(dbId);
  94       SYNCHRONIZED (adsToBeUpdated_) {
  95         adsToBeUpdated_[dbId][adId] = 1;
  96       }
  97       adPublisherMonitor_->addStatValue("request_adId_update", 1, dbId);
  98       LOG(INFO) << "received request to update ad id " << adId;
  99     }
 100 ```
 101
 102 The rewrite does at maximum efficiency what needs to be done:
 103 acquires the lock associated with the `OnDemandUpdateIdMap`
 104 object, writes to the map, and releases the lock immediately
 105 thereafter.
 106
 107 On the face of it, that's not much to write home about, and not
 108 an obvious improvement over the previous state of affairs. But
 109 the features at work invisible in the code above are as important
 110 as those that are visible:
 111
 112 * Unlike before, the data and the mutex protecting it are
 113   inextricably encapsulated together.
 114 * Critical sections are readily visible and emphasize code that
 115   needs to do minimal work and be subject to extra scrutiny.
 116 * Dangerous nested `SYNCHRONIZED` statements are more visible
 117   than sequenced declarations of guards at the same level. (This
 118   is not foolproof because a method call issued inside a
 119   `SYNCHRONIZED` scope may open its own `SYNCHRONIZED` block.) A
 120   construct `SYNCHRONIZED_DUAL`, discussed later in this
 121   document, allows locking two objects quasi-simultaneously in
 122   the same order in all threads, thus avoiding deadlocks.
 123 * If you tried to use `adsToBeUpdated_` outside the
 124   `SYNCHRONIZED` scope, you wouldn't be able to; it is virtually
 125   impossible to tease the map object without acquiring the
 126   correct lock. However, inside the `SYNCHRONIZED` scope, the
 127   *same* name serves as the actual underlying object of type
 128   `OnDemandUpdateIdMap` (which is a map of maps).
 129 * Outside `SYNCHRONIZED`, if you just want to call one
 130   method, you can do so by using `adsToBeUpdated_` as a
 131   pointer like this:
 132
 133     `adsToBeUpdated_->clear();`
 134
 135 This acquires the mutex, calls `clear()` against the underlying
 136 map object, and releases the mutex immediately thereafter.
 137
 138 `Synchronized` offers several other methods, which are described
 139 in detail below.
 140
 141 ### Template class `Synchronized<T>`
 142
 143 ##### Constructors
 144
 145 The default constructor default-initializes the data and its
 146 associated mutex.
 147
 148
 149 The copy constructor locks the source for reading and copies its
 150 data into the target. (The target is not locked as an object
 151 under construction is only accessed by one thread.)
 152
 153 Finally, `Synchronized<T>` defines an explicit constructor that
 154 takes an object of type `T` and copies it. For example:
 155
 156 ``` Cpp
 157     // Default constructed
 158     Synchronized< map<string, int> > syncMap1;
 159
 160     // Copy constructed
 161     Synchronized< map<string, int> > syncMap2(syncMap1);
 162
 163     // Initializing from an existing map
 164     map<string, int> init;
 165     init["world"] = 42;
 166     Synchronized< map<string, int> > syncMap3(init);
 167     EXPECT_EQ(syncMap3->size(), 1);
 168 ```
 169
 170 #### Assignment, swap, and copying
 171
 172 The canonical assignment operator locks both objects involved and
 173 then copies the underlying data objects. The mutexes are not
 174 copied. The locks are acquired in increasing address order, so
 175 deadlock is avoided. For example, there is no problem if one
 176 thread assigns `a = b` and the other assigns `b = a` (other than
 177 that design probably deserving a Razzie award). Similarly, the
 178 `swap` method takes a reference to another `Synchronized<T>`
 179 object and swaps the data. Again, locks are acquired in a well-
 180 defined order. The mutexes are not swapped.
 181
 182 An additional assignment operator accepts a `const T&` on the
 183 right-hand side. The operator copies the datum inside a
 184 critical section.
 185
 186 In addition to assignment operators, `Synchronized<T>` has move
 187 assignment operators.
 188
 189 An additional `swap` method accepts a `T&` and swaps the data
 190 inside a critical section. This is by far the preferred method of
 191 changing the guarded datum wholesale because it keeps the lock
 192 only for a short time, thus lowering the pressure on the mutex.
 193
 194 To get a copy of the guarded data, there are two methods
 195 available: `void copy(T*)` and `T copy()`. The first copies data
 196 to a provided target and the second returns a copy by value. Both
 197 operations are done under a read lock. Example:
 198
 199 ``` Cpp
 200     Synchronized< fbvector<fbstring> > syncVec1, syncVec2;
 201     fbvector<fbstring> vec;
 202
 203     // Assign
 204     syncVec1 = syncVec2;
 205     // Assign straight from vector
 206     syncVec1 = vec;
 207
 208     // Swap
 209     syncVec1.swap(syncVec2);
 210     // Swap with vector
 211     syncVec1.swap(vec);
 212
 213     // Copy to given target
 214     syncVec1.copy(&vec);
 215     // Get a copy by value
 216     auto copy = syncVec1.copy();
 217 ```
 218
 219 #### `LockedPtr operator->()` and `ConstLockedPtr operator->() const`
 220
 221 We've already seen `operator->` at work. Essentially calling a
 222 method `obj->foo(x, y, z)` calls the method `foo(x, y, z)` inside
 223 a critical section as long-lived as the call itself. For example:
 224
 225 ``` Cpp
 226     void fun(Synchronized< fbvector<fbstring> > & vec) {
 227       vec->push_back("hello");
 228       vec->push_back("world");
 229     }
 230 ```
 231
 232 The code above appends two elements to `vec`, but the elements
 233 won't appear necessarily one after another. This is because in
 234 between the two calls the mutex is released, and another thread
 235 may modify the vector. At the cost of anticipating a little, if
 236 you want to make sure you insert "world" right after "hello", you
 237 should do this:
 238
 239 ``` Cpp
 240     void fun(Synchronized< fbvector<fbstring> > & vec) {
 241       SYNCHRONIZED (vec) {
 242         vec.push_back("hello");
 243         vec.push_back("world");
 244       }
 245     }
 246 ```
 247
 248 This brings us to a cautionary discussion. The way `operator->`
 249 works is rather ingenious with creating an unnamed temporary that
 250 enforces locking and all, but it's not a panacea. Between two
 251 uses of `operator->`, other threads may change the synchronized
 252 object in arbitrary ways, so you shouldn't assume any sort of
 253 sequential consistency. For example, the innocent-looking code
 254 below may be patently wrong.
 255
 256 If another thread clears the vector in between the call to
 257 `empty` and the call to `pop_back`, this code ends up attempting
 258 to extract an element from an empty vector. Needless to say,
 259 iteration a la:
 260
 261 ``` Cpp
 262     // No. NO. NO!
 263     FOR_EACH_RANGE (i, vec->begin(), vec->end()) {
 264       ...
 265     }
 266 ```
 267
 268 is a crime punishable by long debugging nights.
 269
 270 If the `Synchronized<T>` object involved is `const`-qualified,
 271 then you'll only be able to call `const` methods through `operator->`.
 272 So, for example, `vec->push_back("xyz")` won't work if `vec`
 273 were `const`-qualified. The locking mechanism capitalizes on the
 274 assumption that `const` methods don't modify their underlying
 275 data and only acquires a read lock (as opposed to a read and
 276 write lock), which is cheaper but works only if the immutability
 277 assumption holds. Note that this is strictly not the case because
 278 `const`-ness can always be undone via `mutable` members, casts,
 279 and surreptitious access to shared data. Our code is seldom
 280 guilty of such, and we also assume the STL uses no shenanigans.
 281 But be warned.
 282
 283 #### `asConst()`
 284
 285 Consider:
 286
 287 ``` Cpp
 288     void fun(Synchronized<fbvector<fbstring>> & vec) {
 289       if (vec->size() > 1000000) {
 290         LOG(WARNING) << "The blinkenlights are overloaded.";
 291       }
 292       vec->push_back("another blinkenlight");
 293     }
 294 ```
 295
 296 This code is correct (at least according to a trivial intent),
 297 but less efficient than it could otherwise be. This is because
 298 the call `vec->size()` acquires a full read-write lock, but only
 299 needs a read lock. We need to help the type system here by
 300 telling it "even though `vec` is a mutable object, consider it a
 301 constant for this call". This should be easy enough because
 302 conversion to const is trivial - just issue `const_cast<const
 303 Synchronized<fbvector<fbstring>>&>(vec)`. Ouch. To make that
 304 operation simpler - a lot simpler - `Synchronized<T>` defines the
 305 method `asConst()`, which is a glorious one-liner. With `asConst`
 306 in tow, it's very easy to achieve what we wanted:
 307
 308 ``` Cpp
 309     void fun(Synchronized<fbvector<fbstring>> & vec) {
 310       if (vec.asConst()->size() > 1000000) {
 311         LOG(WARNING) << "The blinkenlights are overloaded.";
 312       }
 313       vec->push_back("another blinkenlight");
 314     }
 315 ```
 316
 317 QED (Quite Easy Done). This concludes the documentation for
 318 `Synchronized<T>`.
 319
 320 ### `SYNCHRONIZED`
 321
 322 The `SYNCHRONIZED` macro introduces a pseudo-statement that adds
 323 a whole new level of usability to `Synchronized<T>`. As
 324 discussed, `operator->` can only lock over the duration of a
 325 call, so it is insufficient for complex operations. With
 326 `SYNCHRONIZED` you get to lock the object in a scoped manner (not
 327 unlike Java's `synchronized` statement) and to directly access
 328 the object inside that scope.
 329
 330 `SYNCHRONIZED` has two forms. We've seen the first one a couple
 331 of times already:
 332
 333 ``` Cpp
 334     void fun(Synchronized<fbvector<int>> & vec) {
 335       SYNCHRONIZED (vec) {
 336         vec.push_back(42);
 337         CHECK(vec.back() == 42);
 338         ...
 339       }
 340     }
 341 ```
 342
 343 The scope introduced by `SYNCHRONIZED` is a critical section
 344 guarded by `vec`'s mutex. In addition to doing that,
 345 `SYNCHRONIZED` also does an interesting sleight of hand: it binds
 346 the name `vec` inside the scope to the underlying `fbvector<int>`
 347 object - as opposed to `vec`'s normal type, which is
 348 `Synchronized<fbvector<int>>`. This fits very nice the "form
 349 follow function" - inside the critical section you have earned
 350 access to the actual data, and the name bindings reflect that as
 351 well. `SYNCHRONIZED(xyz)` essentially cracks `xyz` temporarily
 352 and gives you access to its innards.
 353
 354 Now, what if `fun` wants to take a pointer to
 355 `Synchronized<fbvector<int>>` - let's call it `pvec`? Generally,
 356 what if we want to synchronize on an expression as opposed to a
 357 symbolic variable? In that case `SYNCHRONIZED(*pvec)` would not
 358 work because "`*pvec`" is not a name. That's where the second
 359 form of `SYNCHRONIZED` kicks in:
 360
 361 ``` Cpp
 362     void fun(Synchronized<fbvector<int>> * pvec) {
 363       SYNCHRONIZED (vec, *pvec) {
 364         vec.push_back(42);
 365         CHECK(vec.back() == 42);
 366         ...
 367       }
 368     }
 369 ```
 370
 371 Ha, so now we pass two arguments to `SYNCHRONIZED`. The first
 372 argument is the name bound to the data, and the second argument
 373 is the expression referring to the `Synchronized<T>` object. So
 374 all cases are covered.
 375
 376 ### `SYNCHRONIZED_CONST`
 377
 378 Recall from the discussion about `asConst()` that we
 379 sometimes want to voluntarily restrict access to an otherwise
 380 mutable object. The `SYNCHRONIZED_CONST` pseudo-statement
 381 makes that intent easily realizable and visible to
 382 maintainers. For example:
 383
 384 ``` Cpp
 385     void fun(Synchronized<fbvector<int>> & vec) {
 386       fbvector<int> local;
 387       SYNCHRONIZED_CONST (vec) {
 388         CHECK(vec.size() > 42);
 389         local = vec;
 390       }
 391       local.resize(42000);
 392       SYNCHRONIZED (vec) {
 393         local.swap(vec);
 394       }
 395     }
 396 ```
 397
 398 Inside a `SYNCHRONIZED_CONST(xyz)` scope, `xyz` is bound to a `const`-
 399 qualified datum. The corresponding lock is a read lock.
 400
 401 `SYNCHRONIZED_CONST` also has a two-arguments version, just like
 402 `SYNCHRONIZED`. In fact, `SYNCHRONIZED_CONST(a)` simply expands
 403 to `SYNCHRONIZED(a, a.asConst())` and `SYNCHRONIZED_CONST(a, b)`
 404 expands to `SYNCHRONIZED(a, (b).asConst())`. The type system and
 405 `SYNCHRONIZED` take care of the rest.
 406
 407 ### `TIMED_SYNCHRONIZED` and `TIMED_SYNCHRONIZED_CONST`
 408
 409 These pseudo-statements allow you to acquire the mutex with a
 410 timeout. Example:
 411
 412 ``` Cpp
 413     void fun(Synchronized<fbvector<int>> & vec) {
 414       TIMED_SYNCHRONIZED (10, vec) {
 415         if (vec) {
 416           vec->push_back(42);
 417           CHECK(vec->back() == 42);
 418         } else {
 419             LOG(INFO) << "Dognabbit, I've been waiting over here for 10 milliseconds and couldn't get through!";
 420         }
 421       }
 422     }
 423 ```
 424
 425 If the mutex acquisition was successful within a number of
 426 milliseconds dictated by its first argument, `TIMED_SYNCHRONIZED`
 427 binds its second argument to a pointer to the protected object.
 428 Otherwise, the pointer will be `NULL`. (Contrast that with
 429 `SYNCHRONIZED`), which always succeeds so it binds the protected
 430 object to a reference.) Inside the `TIMED_SYNCHRONIZED` statement
 431 you must, of course, make sure the pointer is not null to make
 432 sure the operation didn't time out.
 433
 434 `TIMED_SYNCHRONIZED` takes two or three parameters. The first is
 435 always the timeout, and the remaining one or two are just like
 436 the parameters of `SYNCHRONIZED`.
 437
 438 Issuing `TIMED_SYNCHRONIZED` with a zero timeout is an
 439 opportunistic attempt to acquire the mutex.
 440
 441 ### `UNSYNCHRONIZED`
 442
 443 `SYNCHRONIZED` is a good mechanism for enforcing scoped
 444 synchronization, but it has the inherent limitation that it
 445 requires the critical section to be, well, scoped. Sometimes the
 446 code structure requires a fleeting "escape" from the iron fist of
 447 synchronization. Clearly, simple cases are handled with sequenced
 448 `SYNCHRONIZED` scopes:
 449
 450 ``` Cpp
 451     Synchronized<map<int, string>> dic;
 452     ...
 453     SYNCHRONIZED (dic) {
 454       if (dic.find(0) != dic.end()) {
 455         return;
 456       }
 457     }
 458     LOG(INFO) << "Key 0 not found, inserting it."
 459     SYNCHRONIZED (dic) {
 460       dic[0] = "zero";
 461     }
 462 ```
 463
 464 For more complex, nested flow control, you may want to use the
 465 `UNSYNCHRONIZED` macro. It (only) works inside a `SYNCHRONIZED`
 466 pseudo-statement and temporarily unlocks the mutex:
 467
 468 ``` Cpp
 469
 470     Synchronized<map<int, string>> dic;
 471     ...
 472     SYNCHRONIZED (dic) {
 473       auto i = dic.find(0);
 474       if (i != dic.end()) {
 475         UNSYNCHRONIZED (dic) {
 476           LOG(INFO) << "Key 0 not found, inserting it."
 477         }
 478         dic[0] = "zero";
 479       } else {
 480         *i = "zero";
 481       }
 482     }
 483     LOG(INFO) << "Key 0 not found, inserting it."
 484     SYNCHRONIZED (dic) {
 485       dic[0] = "zero";
 486     }
 487 ```
 488
 489 Clearly `UNSYNCHRONIZED` comes with specific caveats and
 490 liabilities. You must assume that during the `UNSYNCHRONIZED`
 491 section, other threads might have changed the protected structure
 492 in arbitrary ways. In the example above, you cannot use the
 493 iterator `i` and you cannot assume that the key `0` is not in the
 494 map; another thread might have inserted it while you were
 495 bragging on `LOG(INFO)`.
 496
 497 ### `SYNCHRONIZED_DUAL`
 498
 499 Sometimes locking just one object won't be able to cut the mustard. Consider a
 500 function that needs to lock two `Synchronized` objects at the
 501 same time - for example, to copy some data from one to the other.
 502 At first sight, it looks like nested `SYNCHRONIZED` statements
 503 will work just fine:
 504
 505 ``` Cpp
 506     void fun(Synchronized<fbvector<int>> & a, Synchronized<fbvector<int>> & b) {
 507       SYNCHRONIZED (a) {
 508         SYNCHRONIZED (b) {
 509           ... use a and b ...
 510         }
 511       }
 512     }
 513 ```
 514
 515 This code compiles and may even run most of the time, but embeds
 516 a deadly peril: if one threads call `fun(x, y)` and another
 517 thread calls `fun(y, x)`, then the two threads are liable to
 518 deadlocking as each thread will be waiting for a lock the other
 519 is holding. This issue is a classic that applies regardless of
 520 the fact the objects involved have the same type.
 521
 522 This classic problem has a classic solution: all threads must
 523 acquire locks in the same order. The actual order is not
 524 important, just the fact that the order is the same in all
 525 threads. Many libraries simply acquire mutexes in increasing
 526 order of their address, which is what we'll do, too. The pseudo-
 527 statement `SYNCHRONIZED_DUAL` takes care of all details of proper
 528 locking of two objects and offering their innards:
 529
 530 ``` Cpp
 531     void fun(Synchronized<fbvector<int>> & a, Synchronized<fbvector<int>> & b) {
 532       SYNCHRONIZED_DUAL (myA, a, myB, b) {
 533         ... use myA and myB ...
 534       }
 535     }
 536 ```
 537
 538 To avoid potential confusions, `SYNCHRONIZED_DUAL` only defines a
 539 four-arguments version. The code above locks `a` and `b` in
 540 increasing order of their address and offers their data under the
 541 names `myA` and `myB`, respectively.
 542
 543 ### Synchronizing several data items with one mutex
 544
 545 The library is geared at protecting one object of a given type
 546 with a mutex. However, sometimes we'd like to protect two or more
 547 members with the same mutex. Consider for example a bidirectional
 548 map, i.e. a map that holds an `int` to `string` mapping and also
 549 the converse `string` to `int` mapping. The two maps would need
 550 to be manipulated simultaneously. There are at least two designs
 551 that come to mind.
 552
 553 #### Using a nested `struct`
 554
 555 You can easily pack the needed data items in a little struct.
 556 For example:
 557
 558 ``` Cpp
 559     class Server {
 560       struct BiMap {
 561         map<int, string> direct;
 562         map<string, int> inverse;
 563       };
 564       Synchronized<BiMap> bimap_;
 565       ...
 566     };
 567     ...
 568     SYNCHRONIZED (bymap_) {
 569       bymap_.direct[0] = "zero";
 570       bymap_.inverse["zero"] = 0;
 571     }
 572 ```
 573
 574 With this code in tow you get to use `bimap_` just like any other
 575 `Synchronized` object, without much effort.
 576
 577 #### Using `std::tuple`
 578
 579 If you won't stop short of using a spaceship-era approach,
 580 `std::tuple` is there for you. The example above could be
 581 rewritten for the same functionality like this:
 582
 583 ``` Cpp
 584     class Server {
 585       Synchronized<tuple<map<int, string>, map<string, int>>> bimap_;
 586       ...
 587     };
 588     ...
 589     SYNCHRONIZED (bymap_) {
 590       get<0>(bymap_)[0] = "zero";
 591       get<1>(bymap_)["zero"] = 0;
 592     }
 593 ```
 594
 595 The code uses `std::get` with compile-time integers to access the
 596 fields in the tuple. The relative advantages and disadvantages of
 597 using a local struct vs. `std::tuple` are quite obvious - in the
 598 first case you need to invest in the definition, in the second
 599 case you need to put up with slightly more verbose and less clear
 600 access syntax.
 601
 602 ### Summary
 603
 604 `Synchronized` and its supporting tools offer you a simple,
 605 robust paradigm for mutual exclusion-based concurrency. Instead
 606 of manually pairing data with the mutexes that protect it and
 607 relying on convention to use them appropriately, you can benefit
 608 of encapsulation and typechecking to offload a large part of that
 609 task and to provide good guarantees.