X-Git-Url: http://plrg.eecs.uci.edu/git/?a=blobdiff_plain;f=folly%2Fdocs%2FFBVector.md;h=792aa0a02424849ecdcf48714a5ca08d2ae4a809;hb=78443c5a20e395eb6933540c72e05ad1ea77ea0d;hp=9a238a9081ee561a429385b8f7ae6a1f19dc060c;hpb=40a24cb614edf84907d9971462268b5518dec52e;p=folly.git diff --git a/folly/docs/FBVector.md b/folly/docs/FBVector.md index 9a238a90..792aa0a0 100644 --- a/folly/docs/FBVector.md +++ b/folly/docs/FBVector.md @@ -1,4 +1,4 @@ -`folly/FBvector.h` +`folly/FBVector.h` ------------------ Simply replacing `std::vector` with `folly::fbvector` (after @@ -35,31 +35,29 @@ folly/test/FBVectorTest.cpp for a few benchmarks. It is well known that `std::vector` grows exponentially (at a constant factor) in order to avoid quadratic growth performance. -The trick is choosing a good factor (any factor greater than 1 -ensures O(1) amortized append complexity towards infinity). A -factor that's too small causes frequent vector reallocation; one -that's too large forces the vector to consume much more memory -than needed. The initial HP implementation by Stepanov used a -growth factor of 2, i.e. whenever you'd `push_back` into a vector -without there being room, it would double the current capacity. - -With time, other compilers reduced the growth factor to 1.5, but -gcc has staunchly used a growth factor of 2. In fact it can be -mathematically proven that a growth factor of 2 is rigorously the -worst possible because it never allows the vector to reuse -any of its previously-allocated memory. That makes the vector cache- -unfriendly and memory manager unfriendly. - -To see why that's the case, consider a large vector of capacity C -residing somewhere at the beginning of an initially unoccupied -chunk. When the request for growth comes about, the vector +The trick is choosing a good factor. Any factor greater than 1 +ensures O(1) amortized append complexity towards infinity. But a +factor that's too small (say, 1.1) causes frequent vector reallocation, and +one that's too large (say, 3 or 4) forces the vector to consume much more +memory than needed. + +The initial HP implementation by Stepanov used a +growth factor of 2; i.e., whenever you'd `push_back` into a vector +without there being room, it would double the current capacity. This +was not a good choice: it can be mathematically proven that a growth factor of +2 is rigorously the worst possible because it never allows the vector +to reuse any of its previously-allocated memory. Despite other compilers +reducing the growth factor to 1.5, gcc has staunchly maintained its factor of +2. This makes `std::vector` cache- unfriendly and memory manager unfriendly. + +To see why that's the case, consider a large vector of capacity C. +When there's a request to grow the vector, the vector (assuming no in-place resizing, see the appropriate section in -this document) will allocate a chunk next to its current chunk, -copy its existing data, and then deallocate the old chunk. So now -we have a chunk of size C followed by a chunk of size k * C. -Continuing this process we'll then have a chunk of size k * k * C -to the right and so on. That leads to a series of the form (using -^^ for power): +this document) will allocate a chunk of memory next to its current chunk, +copy its existing data to the new chunk, and then deallocate the old chunk. +So now we have a chunk of size C followed by a chunk of size k * C. Continuing +this process we'll then have a chunk of size k * k * C to the right and so on. +That leads to a series of the form (using ^^ for power): C, C*k, C*k^^2, C*k^^3, ... @@ -69,26 +67,13 @@ the remarkable equality: 1 + 2^^1 + 2^^2 + 2^^3... + 2^^n = 2^^(n+1) - 1 -What that really means is that the new request for a chunk will -be never satisfiable by coalescing all previously-used chunks. -This is not quite what you'd want. - -We would of course want the vector to not crawl forward in -memory, but instead to move back to its previously-allocated -chunks. Any number smaller than 2 guarantees that you'll be able -at some point to reuse the previous chunks. Going through the -math reveals the equation: - - k^^n <= 1 + k + k^^2 + ... + k^^(n-2) - -If some number n satisfies that equation, it means you can reuse -memory after n reallocations. The graphical solver below reveals -that choosing k = 1.5 (blue line) allows memory reuse after 4 -reallocations, choosing k = 1.45 (red line) allows memory reuse -after 3 reallocations, and choosing k = 1.3 (black line) allows -reuse after only 2 reallocations. - -![graphical solutions](./Fbvector--graphical_solutions.png) +This means that any new chunk requested will be larger +than all previously used chunks combined, so the vector must +crawl forward in memory; it can't move back to its deallocated chunks. +But any number smaller than 2 guarantees that you'll at some point be +able to reuse the previous chunks. For instance, choosing 1.5 as the factor +allows memory reuse after 4 reallocations; 1.45 allows memory reuse after 3 +reallocations; and 1.3 allows reuse after only 2 reallocations. Of course, the above makes a number of simplifying assumptions about how the memory allocator works, but definitely you don't @@ -128,7 +113,7 @@ the now notorious design of `realloc()` to opaquely perform either in-place reallocation or an allocate-memcpy-deallocate cycle. Such lack of control subsequently forced all clib-based allocator designs to avoid in-place reallocation, and that -includes C++'s `new` and `std:allocator`. This is a major loss of +includes C++'s `new` and `std::allocator`. This is a major loss of efficiency because an in-place reallocation, being very cheap, may mean a much less aggressive growth strategy. In turn that means less slack memory and faster reallocations. @@ -170,6 +155,7 @@ Only a tiny minority of objects are genuinely non-relocatable: public: Ew() : pointerInsideBuffer(buffer) {} ... + } * Objects that need to update "observers" that store pointers to them. @@ -201,32 +187,7 @@ after `Widget`'s definition and write this: } If you don't do this, `fbvector` will fail to compile -with a `BOOST_STATIC_ASSERT`. - -#### Additional Constraints - -Similar improvements are possible in presence of a "simple" type -- more specifically, one that has a trivial assignment (i.e. -assignment is the same as bitblitting the bits over) or a nothrow -default constructor. These traits are used gainfully by -`fbvector` in a variety of places. Fortunately, these traits are -already present in the C++ standard (well, currently in Boost). -To summarize, in order to work with `fbvector`, a type `Widget` -must pass: - - BOOST_STATIC_ASSERT( - IsRelocatable::value && - (boost::has_trivial_assign::value || boost::has_nothrow_constructor::value)); - -These traits go hand in hand; for example, it would be very -difficult to design a class that satisfies one branch of the -conjunction above but not the other. `fbvector` uses these simple -constraints to minimize the number of copies made on many common -operations such as `push_back`, `insert`, or `resize`. - -To make it easy for you to state assumptions about a given type -or family of parameterized types, check Traits.h and in -particular handy family of macros FOLLY_ASSUME_FBVECTOR_COMPATIBLE*. +with a `static_assert`. ### Miscellaneous *** @@ -237,5 +198,3 @@ directions may be in improving raw memory copying (`memcpy` is not an intrinsic in gcc and does not work terribly well for large chunks) and in furthering the collaboration with jemalloc. Have fun! - -