#1: The new bug we found in this data structure is as the following: In dequeue(), the load of Tail has to be acquire. If not, the load of the next field won't get synchronize with the enqueuer that inserted the element right after this dequeue() operation. We can catch this bug using the two independent dequeuers and 1 (or 2) enqueuer. #2: When writing the speicification for this data structure, we have been considering the following invariance of the enqueue() and dequeue() operations. For enqueue() to succeed, it finally loads the Tail, inserts the new element by updating the next field right after the Tail load. The load of Tail actually orders the enqueuers, and the CAS of the next field will order the enqueuer and its dequeuer. For dequeue() to succeed, it either loads the Head and Tail, and then bails; or it loads the Head and then loads the next field. Similarly, the load of Head orders dequeuers. We have one key "additional_ordering_point" in enqueue() method for ordering the dequeuer and its immediately next enqueuer. For example, when we have two threads, 1 dequeuer and 1 enqueuer, when dequeue() returns NULL, we can order dequeue() before enqueue() by using the additional ordering point of enqueue() where the swing of Tail (or just the load of Tail when someone else swings it). Since dequeue() will load the Tail, and enqueue() will CAS or load the Tail later, deuqeue() can be ordered() before enqueue(). Notably here, additional ordering points are only used when methods has 'same-location' commit points and they can't order each other.