3 #ifndef CDSLIB_CONTAINER_IMPL_MULTILEVEL_HASHMAP_H
4 #define CDSLIB_CONTAINER_IMPL_MULTILEVEL_HASHMAP_H
6 #include <cds/intrusive/impl/multilevel_hashset.h>
7 #include <cds/container/details/multilevel_hashmap_base.h>
8 #include <cds/container/details/guarded_ptr_cast.h>
10 namespace cds { namespace container {
12 /// Hash map based on multi-level array
13 /** @ingroup cds_nonintrusive_map
14 @anchor cds_container_MultilevelHashMap_hp
17 - [2013] Steven Feldman, Pierre LaBorde, Damian Dechev "Concurrent Multi-level Arrays:
18 Wait-free Extensible Hash Maps"
20 [From the paper] The hardest problem encountered while developing a parallel hash map is how to perform
21 a global resize, the process of redistributing the elements in a hash map that occurs when adding new
22 buckets. The negative impact of blocking synchronization is multiplied during a global resize, because all
23 threads will be forced to wait on the thread that is performing the involved process of resizing the hash map
24 and redistributing the elements. \p %MultilevelHashSet implementation avoids global resizes through new array
25 allocation. By allowing concurrent expansion this structure is free from the overhead of an explicit resize,
26 which facilitates concurrent operations.
28 The presented design includes dynamic hashing, the use of sub-arrays within the hash map data structure;
29 which, in combination with <b>perfect hashing</b>, means that each element has a unique final, as well as current, position.
30 It is important to note that the perfect hash function required by our hash map is trivial to realize as
31 any hash function that permutes the bits of the key is suitable. This is possible because of our approach
32 to the hash function; we require that it produces hash values that are equal in size to that of the key.
33 We know that if we expand the hash map a fixed number of times there can be no collision as duplicate keys
34 are not provided for in the standard semantics of a hash map.
36 \p %MultiLevelHashMap is a multi-level array which has an internal structure similar to a tree:
37 @image html multilevel_hashset.png
38 The multi-level array differs from a tree in that each position on the tree could hold an array of nodes or a single node.
39 A position that holds a single node is a \p dataNode which holds the hash value of a key and the value that is associated
\r
40 with that key; it is a simple struct holding two variables. A \p dataNode in the multi-level array could be marked.
\r
41 A \p markedDataNode refers to a pointer to a \p dataNode that has been bitmarked at the least significant bit (LSB)
\r
42 of the pointer to the node. This signifies that this \p dataNode is contended. An expansion must occur at this node;
\r
43 any thread that sees this \p markedDataNode will try to replace it with an \p arrayNode; which is a position that holds
\r
44 an array of nodes. The pointer to an \p arrayNode is differentiated from that of a pointer to a \p dataNode by a bitmark
\r
45 on the second-least significant bit.
\r
47 \p %MultiLevelHashMap multi-level array is similar to a tree in that we keep a pointer to the root, which is a memory array
\r
48 called \p head. The length of the \p head memory array is unique, whereas every other \p arrayNode has a uniform length;
\r
49 a normal \p arrayNode has a fixed power-of-two length equal to the binary logarithm of a variable called \p arrayLength.
\r
50 The maximum depth of the tree, \p maxDepth, is the maximum number of pointers that must be followed to reach any node.
\r
51 We define \p currentDepth as the number of memory arrays that we need to traverse to reach the \p arrayNode on which
\r
52 we need to operate; this is initially one, because of \p head.
\r
54 That approach to the structure of the hash map uses an extensible hashing scheme; <b> the hash value is treated as a bit
\r
55 string</b> and rehash incrementally.
\r
57 @note Two important things you should keep in mind when you're using \p %MultiLevelHashMap:
\r
58 - all keys is converted to fixed-size bit-string by hash functor provided.
\r
59 You can use variable-length keys, for example, \p std::string as a key for \p %MultiLevelHashMap,
\r
60 but real key in the map will be fixed-ize hash values of your keys.
\r
61 For the strings you may use well-known hashing algorithms like <a href="https://en.wikipedia.org/wiki/Secure_Hash_Algorithm">SHA1, SHA2</a>,
\r
62 <a href="https://en.wikipedia.org/wiki/MurmurHash">MurmurHash</a>, <a href="https://en.wikipedia.org/wiki/CityHash">CityHash</a>
\r
63 or its successor <a href="https://code.google.com/p/farmhash/">FarmHash</a> and so on, which
\r
64 converts variable-length strings to fixed-length bit-strings, and such hash values will be the keys in \p %MultiLevelHashMap.
\r
65 - \p %MultiLevelHashMap uses a perfect hashing. It means that if two different keys, for example, of type \p std::string,
\r
66 have identical hash then you cannot insert both that keys in the map. \p %MultiLevelHashMap does not maintain the key,
\r
67 it maintains its fixed-size hash value.
\r
69 The map supports @ref cds_container_MultilevelHashMap_iterators "bidirectional thread-safe iterators".
\r
71 Template parameters:
\r
72 - \p GC - safe memory reclamation schema. Can be \p gc::HP, \p gc::DHP or one of \ref cds_urcu_type "RCU type"
\r
73 - \p Key - a key type to be stored in the map
\r
74 - \p T - a value type to be stored in the map
\r
75 - \p Traits - type traits, the structure based on \p multilevel_hashmap::traits or result of \p multilevel_hashmap::make_traits metafunction.
\r
77 There are several specializations of \p %MultiLevelHashMap for each \p GC. You should include:
78 - <tt><cds/container/multilevel_hashmap_hp.h></tt> for \p gc::HP garbage collector
79 - <tt><cds/container/multilevel_hashmap_dhp.h></tt> for \p gc::DHP garbage collector
80 - <tt><cds/container/multilevel_hashmap_rcu.h></tt> for \ref cds_intrusive_MultiLevelHashSet_rcu "RCU type". RCU specialization
81 has a slightly different interface.
87 #ifdef CDS_DOXYGEN_INVOKED
88 ,class Traits = multilevel_hashmap::traits
93 class MultiLevelHashMap
94 #ifdef CDS_DOXYGEN_INVOKED
95 : protected cds::intrusive::MultiLevelHashSet< GC, std::pair<Key const, T>, Traits >
97 : protected cds::container::details::make_multilevel_hashmap< GC, Key, T, Hasher, Traits >::type
101 typedef cds::container::details::make_multilevel_hashmap< GC, Key, T, Hasher, Traits > maker;
102 typedef typename maker::type base_class;
105 typedef GC gc; ///< Garbage collector
106 typedef Key key_type; ///< Key type
107 typedef T mapped_type; ///< Mapped type
108 typedef std::pair< key_type const, mapped_type> value_type; ///< Key-value pair to be stored in the map
109 typedef Traits traits; ///< Map traits
110 #ifdef CDS_DOXYGEN_INVOKED
111 typedef typename traits::hash hasher; ///< Hash functor, see \p multilevel_hashmap::traits::hash
113 typedef typename maker::hasher hasher;
116 typedef typename maker::hash_type hash_type; ///< Hash type deduced from \p hasher return type
117 typedef typename base_class::hash_comparator hash_comparator; ///< hash compare functor based on \p Traits::compare and \p Traits::less
119 typedef typename traits::item_counter item_counter; ///< Item counter type
120 typedef typename traits::allocator allocator; ///< Element allocator
121 typedef typename traits::node_allocator node_allocator; ///< Array node allocator
122 typedef typename traits::memory_model memory_model; ///< Memory model
123 typedef typename traits::back_off back_off; ///< Backoff strategy
124 typedef typename traits::stat stat; ///< Internal statistics type
126 /// Count of hazard pointers required
127 static CDS_CONSTEXPR size_t const c_nHazardPtrCount = base_class::c_nHazardPtrCount;
131 typedef typename maker::node_type node_type;
132 typedef typename maker::cxx_node_allocator cxx_node_allocator;
133 typedef std::unique_ptr< node_type, typename maker::node_disposer > scoped_node_ptr;
135 template <class Iterator>
136 class bidirectional_iterator : protected Iterator
138 friend class MultiLevelHashMap;
139 typedef Iterator base_class;
141 typedef typename std::conditional< base_class::c_bConstantIterator, value_type const*, value_type*>::type value_ptr; ///< Value pointer
142 typedef typename std::conditional< base_class::c_bConstantIterator, value_type const&, value_type&>::type value_ref; ///< Value reference
145 bidirectional_iterator() CDS_NOEXCEPT
149 bidirectional_iterator( bidirectional_iterator const& rhs ) CDS_NOEXCEPT
153 bidirectional_iterator& operator=(bidirectional_iterator const& rhs) CDS_NOEXCEPT
155 base_class::operator=( rhs );
159 bidirectional_iterator& operator++()
161 base_class::operator++();
165 bidirectional_iterator& operator--()
167 base_class::operator--();
171 value_ptr operator ->() const CDS_NOEXCEPT
176 value_ref operator *() const CDS_NOEXCEPT
178 value_ptr p = pointer();
185 base_class::release();
189 bool operator ==(It const& rhs) const CDS_NOEXCEPT
191 return base_class::operator==( rhs );
195 bool operator !=(It const& rhs) const CDS_NOEXCEPT
197 return !( *this == rhs );
201 bidirectional_iterator( base_class&& it ) CDS_NOEXCEPT
205 value_ptr pointer() const CDS_NOEXCEPT
207 typename base_class::value_ptr p = base_class::pointer();
208 return p ? &p->m_Value : nullptr;
211 node_type * node_pointer() const CDS_NOEXCEPT
213 return base_class::pointer();
219 #ifdef CDS_DOXYGEN_INVOKED
221 typedef typename gc::template guarded_ptr< value_type > guarded_ptr;
223 typedef typename gc::template guarded_ptr< node_type, value_type, cds::details::guarded_ptr_cast_set<node_type, value_type> > guarded_ptr;
226 #ifdef CDS_DOXYGEN_INVOKED
227 typedef implementation_defined iterator; ///< @ref cds_container_MultilevelHashMap_iterators "bidirectional iterator" type
228 typedef implementation_defined const_iterator; ///< @ref cds_container_MultilevelHashMap_iterators "bidirectional const iterator" type
229 typedef implementation_defined reverse_iterator; ///< @ref cds_container_MultilevelHashMap_iterators "bidirectional reverse iterator" type
230 typedef implementation_defined const_reverse_iterator; ///< @ref cds_container_MultilevelHashMap_iterators "bidirectional reverse const iterator" type
232 typedef bidirectional_iterator<typename base_class::iterator> iterator;
233 typedef bidirectional_iterator<typename base_class::const_iterator> const_iterator;
234 typedef bidirectional_iterator<typename base_class::reverse_iterator> reverse_iterator;
235 typedef bidirectional_iterator<typename base_class::const_reverse_iterator> const_reverse_iterator;
244 /// Creates empty map
246 @param head_bits: 2<sup>head_bits</sup> specifies the size of head array, minimum is 4.
247 @param array_bits: 2<sup>array_bits</sup> specifies the size of array node, minimum is 2.
249 Equation for \p head_bits and \p array_bits:
251 sizeof(hash_type) * 8 == head_bits + N * array_bits
253 where \p N is multi-level array depth.
255 MultiLevelHashMap( size_t head_bits = 8, size_t array_bits = 4 )
256 : base_class( head_bits, array_bits )
259 /// Destructs the map and frees all data
263 /// Inserts new element with key and default value
265 The function creates an element with \p key and default value, and then inserts the node created into the map.
268 - The \p key_type should be constructible from a value of type \p K.
269 In trivial case, \p K is equal to \p key_type.
270 - The \p mapped_type should be default-constructible.
272 Returns \p true if inserting successful, \p false otherwise.
274 template <typename K>
275 bool insert( K const& key )
277 scoped_node_ptr sp( cxx_node_allocator().New( m_Hasher, key ));
278 if ( base_class::insert( *sp )) {
285 /// Inserts new element
287 The function creates a node with copy of \p val value
288 and then inserts the node created into the map.
291 - The \p key_type should be constructible from \p key of type \p K.
292 - The \p value_type should be constructible from \p val of type \p V.
294 Returns \p true if \p val is inserted into the map, \p false otherwise.
296 template <typename K, typename V>
297 bool insert( K const& key, V const& val )
299 scoped_node_ptr sp( cxx_node_allocator().New( m_Hasher, key, val ));
300 if ( base_class::insert( *sp )) {
307 /// Inserts new element and initialize it by a functor
309 This function inserts new element with key \p key and if inserting is successful then it calls
310 \p func functor with signature
313 void operator()( value_type& item );
317 The argument \p item of user-defined functor \p func is the reference
318 to the map's item inserted:
319 - <tt>item.first</tt> is a const reference to item's key that cannot be changed.
320 - <tt>item.second</tt> is a reference to item's value that may be changed.
322 \p key_type should be constructible from value of type \p K.
324 The function allows to split creating of new item into two part:
325 - create item from \p key;
326 - insert new item into the map;
327 - if inserting is successful, initialize the value of item by calling \p func functor
329 This can be useful if complete initialization of object of \p value_type is heavyweight and
330 it is preferable that the initialization should be completed only if inserting is successful.
332 template <typename K, typename Func>
333 bool insert_with( K const& key, Func func )
335 scoped_node_ptr sp( cxx_node_allocator().New( m_Hasher, key ));
336 if ( base_class::insert( *sp, [&func]( node_type& item ) { func( item.m_Value ); } )) {
343 /// For key \p key inserts data of type \p value_type created in-place from <tt>std::forward<Args>(args)...</tt>
345 Returns \p true if inserting successful, \p false otherwise.
347 template <typename K, typename... Args>
348 bool emplace( K&& key, Args&&... args )
350 scoped_node_ptr sp( cxx_node_allocator().New( m_Hasher, std::forward<K>(key), std::forward<Args>(args)... ));
351 if ( base_class::insert( *sp )) {
358 /// Updates data by \p key
360 The operation performs inserting or changing data with lock-free manner.
362 If the \p key not found in the map, then the new item created from \p key
363 will be inserted into the map iff \p bInsert is \p true
364 (note that in this case the \ref key_type should be constructible from type \p K).
365 Otherwise, if \p key is found, the functor \p func is called with item found.
366 The functor \p Func signature:
369 void operator()( bool bNew, value_type& item );
373 - \p bNew - \p true if the item has been inserted, \p false otherwise
374 - \p item - item of the map
376 The functor may change any fields of the \p item.second that is \ref value_type.
378 Returns <tt> std::pair<bool, bool> </tt> where \p first is \p true if operation is successfull,
379 \p second is \p true if new item has been added or \p false if \p key already exists.
381 @warning See \ref cds_intrusive_item_creating "insert item troubleshooting"
383 template <typename K, typename Func>
384 std::pair<bool, bool> update( K const& key, Func func, bool bInsert = true )
386 scoped_node_ptr sp( cxx_node_allocator().New( m_Hasher, key ));
387 std::pair<bool, bool> result = base_class::do_update( *sp, [&func]( bool bNew, node_type& node ) { func( bNew, node.m_Value );}, bInsert );
393 /// Delete \p key from the map
395 \p key_type must be constructible from value of type \p K.
396 The function deeltes the element with hash value equal to <tt>hash( key_type( key ))</tt>
398 Return \p true if \p key is found and deleted, \p false otherwise.
400 template <typename K>
401 bool erase( K const& key )
403 hash_type h = m_Hasher( key_type( key ));
404 return base_class::erase( h );
407 /// Delete \p key from the map
409 The function searches an item with hash value equal to <tt>hash( key_type( key ))</tt>,
410 calls \p f functor and deletes the item. If \p key is not found, the functor is not called.
412 The functor \p Func interface:
415 void operator()(value_type& item) { ... }
418 where \p item is the element found.
420 \p key_type must be constructible from value of type \p K.
422 Return \p true if key is found and deleted, \p false otherwise
424 template <typename K, typename Func>
425 bool erase( K const& key, Func f )
427 hash_type h = m_Hasher( key_type( key ));
428 return base_class::erase( h, [&f]( node_type& node) { f( node.m_Value ); } );
431 /// Deletes the element pointed by iterator \p iter
433 Returns \p true if the operation is successful, \p false otherwise.
435 The function does not invalidate the iterator, it remains valid and can be used for further traversing.
437 bool erase_at( iterator const& iter )
439 return base_class::erase_at( iter );
442 /// Extracts the item from the map with specified \p key
444 The function searches an item with key equal to <tt>hash( key_type( key ))</tt> in the map,
445 unlinks it from the map, and returns a guarded pointer to the item found.
446 If \p key is not found the function returns an empty guarded pointer.
448 The item extracted is freed automatically by garbage collector \p GC
449 when returned \p guarded_ptr object will be destroyed or released.
450 @note Each \p guarded_ptr object uses the GC's guard that can be limited resource.
454 typedef cds::container::MultiLevelHashMap< cds::gc::HP, int, foo, my_traits > map_type;
458 map_type::guarded_ptr gp( theMap.extract( 5 ));
463 // Destructor of gp releases internal HP guard and frees the pointer
467 template <typename K>
468 guarded_ptr extract( K const& key )
471 typename gc::Guard guard;
472 node_type * p = base_class::do_erase( m_Hasher( key_type( key )), guard, []( node_type const&) -> bool {return true;} );
474 // p is guarded by HP
480 /// Extracts the item pointed by the iterator \p iter
482 The item extracted is freed automatically by garbage collector \p GC
483 when returned \p guarded_ptr object will be destroyed or released.
485 @note Each \p guarded_ptr object uses the GC's guard that can be limited resource.
487 Due to concurrent nature of the map the returned guarded pointer may be empty.
488 Check it before dereferencing.
490 guarded_ptr extract_at( iterator const& iter )
493 if ( base_class::erase_at( iter )) {
494 // The element erased is guarded by iter so it is still alive
495 gp.reset( iter.node_pointer());
500 /// Checks whether the map contains \p key
502 The function searches the item by its hash that is equal to <tt>hash( key_type( key ))</tt>
503 and returns \p true if it is found, or \p false otherwise.
505 template <typename K>
506 bool contains( K const& key )
508 return base_class::contains( m_Hasher( key_type( key )) );
511 /// Find the key \p key
514 The function searches the item by its hash that is equal to <tt>hash( key_type( key ))</tt>
515 and calls the functor \p f for item found.
516 The interface of \p Func functor is:
519 void operator()( value_type& item );
522 where \p item is the item found.
524 The functor may change \p item.second.
526 The function returns \p true if \p key is found, \p false otherwise.
528 template <typename K, typename Func>
529 bool find( K const& key, Func f )
531 return base_class::find( m_Hasher( key_type( key )), [&f](node_type& node) { f( node.m_Value );});
534 /// Finds the key \p key and return the item found
536 The function searches the item with a hash equal to <tt>hash( key_type( key ))</tt>
537 and returns a guarded pointer to the item found.
538 If \p key is not found the function returns an empty guarded pointer.
540 It is safe when a concurrent thread erases the item returned as \p guarded_ptr.
541 In this case the item will be freed later by garbage collector \p GC automatically
542 when \p guarded_ptr object will be destroyed or released.
543 @note Each \p guarded_ptr object uses one GC's guard which can be limited resource.
547 typedef cds::container::MultiLevelHashMap< cds::gc::HP, int, foo, my_traits > map_type;
551 map_type::guarded_ptr gp( theMap.get( 5 ));
556 // Destructor of guarded_ptr releases internal HP guard
560 template <typename K>
561 guarded_ptr get( K const& key )
565 typename gc::Guard guard;
566 gp.reset( base_class::search( m_Hasher( key_type( key )), guard ));
571 /// Clears the map (non-atomic)
573 The function unlink all data node from the map.
574 The function is not atomic but is thread-safe.
575 After \p %clear() the map may not be empty because another threads may insert items.
582 /// Checks if the map is empty
584 Emptiness is checked by item counting: if item count is zero then the map is empty.
585 Thus, the correct item counting feature is an important part of the map implementation.
589 return base_class::empty();
592 /// Returns item count in the map
595 return base_class::size();
598 /// Returns const reference to internal statistics
599 stat const& statistics() const
601 return base_class::statistics();
604 /// Returns the size of head node
605 size_t head_size() const
607 return base_class::head_size();
610 /// Returns the size of the array node
611 size_t array_node_size() const
613 return base_class::array_node_size();
617 ///@name Thread-safe iterators
618 /** @anchor cds_container_MultilevelHashMap_iterators
619 The map supports thread-safe iterators: you may iterate over the map in multi-threaded environment.
\r
620 It is guaranteed that the iterators will remain valid even if another thread deletes the node the iterator points to:
\r
621 Hazard Pointer embedded into the iterator object protects the node from physical reclamation.
\r
623 @note Since the iterator object contains hazard pointer that is a thread-local resource,
\r
624 the iterator should not be passed to another thread.
\r
626 Each iterator object supports the common interface:
\r
627 - dereference operators:
\r
629 value_type [const] * operator ->() noexcept
\r
630 value_type [const] & operator *() noexcept
\r
632 - pre-increment and pre-decrement. Post-operators is not supported
\r
633 - equality operators <tt>==</tt> and <tt>!=</tt>.
\r
634 Iterators are equal iff they point to the same cell of the same array node.
635 Note that for two iterators \p it1 and \p it2, the conditon <tt> it1 == it2 </tt>
636 does not entail <tt> &(*it1) == &(*it2) </tt>
637 - helper member function \p release() that clears internal hazard pointer.
\r
638 After \p release() call the iterator points to \p nullptr but it still remain valid: further iterating is possible.
641 /// Returns an iterator to the beginning of the map
644 return iterator( base_class::begin() );
647 /// Returns an const iterator to the beginning of the map
648 const_iterator begin() const
650 return const_iterator( base_class::begin());
653 /// Returns an const iterator to the beginning of the map
654 const_iterator cbegin()
656 return const_iterator( base_class::cbegin());
659 /// Returns an iterator to the element following the last element of the map. This element acts as a placeholder; attempting to access it results in undefined behavior.
662 return iterator( base_class::end());
665 /// Returns a const iterator to the element following the last element of the map. This element acts as a placeholder; attempting to access it results in undefined behavior.
666 const_iterator end() const
668 return const_iterator( base_class::end());
671 /// Returns a const iterator to the element following the last element of the map. This element acts as a placeholder; attempting to access it results in undefined behavior.
672 const_iterator cend()
674 return const_iterator( base_class::cend());
677 /// Returns a reverse iterator to the first element of the reversed map
678 reverse_iterator rbegin()
680 return reverse_iterator( base_class::rbegin());
683 /// Returns a const reverse iterator to the first element of the reversed map
684 const_reverse_iterator rbegin() const
686 return const_reverse_iterator( base_class::rbegin());
689 /// Returns a const reverse iterator to the first element of the reversed map
690 const_reverse_iterator crbegin()
692 return const_reverse_iterator( base_class::crbegin());
695 /// Returns a reverse iterator to the element following the last element of the reversed map
697 It corresponds to the element preceding the first element of the non-reversed container.
698 This element acts as a placeholder, attempting to access it results in undefined behavior.
700 reverse_iterator rend()
702 return reverse_iterator( base_class::rend());
705 /// Returns a const reverse iterator to the element following the last element of the reversed map
707 It corresponds to the element preceding the first element of the non-reversed container.
708 This element acts as a placeholder, attempting to access it results in undefined behavior.
710 const_reverse_iterator rend() const
712 return const_reverse_iterator( base_class::rend());
715 /// Returns a const reverse iterator to the element following the last element of the reversed map
717 It corresponds to the element preceding the first element of the non-reversed container.
718 This element acts as a placeholder, attempting to access it results in undefined behavior.
720 const_reverse_iterator crend()
722 return const_reverse_iterator( base_class::crend());
727 }} // namespace cds::container
729 #endif // #ifndef CDSLIB_CONTAINER_IMPL_MULTILEVEL_HASHMAP_H