Add a fast path to folly::ThreadLocal
Summary:
Currently a get() on folly::ThreadLocal[Ptr] is fairly heavyweight:
1) call instance(), take a static init guard, branch
2) call getThreadEntry, check if thread_local is not null, branch
3) check if id < threadEntry->capacity, branch
4) Finally, return threadEntry->elements[id]
If we have real thread_locals, we can do better by caching the capacity directly,
combining all three checks into one:
1) check if id < threadLocalCapacityCheck, branch. If not, take the slow path.
2) return threadEntry->elements[id]. threadEntry is never null if capacity > 0, and
the instance() setup work is done during the first getThreadEntry call, when threadLocalCapacityCheck == 0.
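The fast path above can be sketched roughly as follows. This is a minimal
illustration, not the actual folly code: ThreadEntry, getThreadEntrySlow,
getSlow, get, and slowPathCalls are hypothetical stand-ins, and the slow path
here simply grows a vector rather than doing the real instance() setup.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-thread entry (not folly's real type).
struct ThreadEntry {
  std::vector<void*> elements;
};

// Cached per-thread capacity. It stays 0 until the slow path has run once on
// this thread, so "id < threadLocalCapacityCheck" also implies that
// threadEntry is non-null and that setup has already happened.
thread_local std::size_t threadLocalCapacityCheck = 0;
thread_local ThreadEntry* threadEntry = nullptr;

// Illustration only: counts how often this thread took the slow path.
thread_local int slowPathCalls = 0;

// Assumed slow lookup; the real code would also perform instance() setup here.
ThreadEntry* getThreadEntrySlow() {
  thread_local ThreadEntry entry;
  return &entry;
}

// Slow path: create the entry if needed, grow it, and refresh the cache.
void* getSlow(std::size_t id) {
  ++slowPathCalls;
  if (threadEntry == nullptr) {
    threadEntry = getThreadEntrySlow();
  }
  assert(threadEntry != nullptr);
  if (id >= threadEntry->elements.size()) {
    threadEntry->elements.resize(id + 1, nullptr);
  }
  threadLocalCapacityCheck = threadEntry->elements.size();
  return threadEntry->elements[id];
}

// Fast path: a single compare and branch replaces the three separate checks.
inline void* get(std::size_t id) {
  if (id < threadLocalCapacityCheck) {
    return threadEntry->elements[id];
  }
  return getSlow(id);
}
```

In this sketch the first get(id) on a thread falls through to getSlow(); later
calls with an id below the cached capacity return after one branch.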
Reviewed By: yfeldblum
Differential Revision: D6379878
fbshipit-source-id: 4fc7564bbb2f319d65875124026aef28d910ef06