[CaptureTracking] Avoid long compilation time on large basic blocks
authorBruno Cardoso Lopes <bruno.cardoso@gmail.com>
Wed, 24 Jun 2015 17:53:17 +0000 (17:53 +0000)
committerBruno Cardoso Lopes <bruno.cardoso@gmail.com>
Wed, 24 Jun 2015 17:53:17 +0000 (17:53 +0000)
commit1b6ca9d0ccb5bb8ac42d2a3bf94288e3b008faeb
tree2909c79df71ed2d6580f6e987052350f4b320134
parentd89b55e3097078fb05edb0336178294d507ce2a8
[CaptureTracking] Avoid long compilation time on large basic blocks

CaptureTracking becomes very expensive in large basic blocks while
calling PointerMayBeCaptured. PointerMayBeCaptured scans the BB the
number of times equal to the number of uses of 'BeforeHere', which is
currently capped at 20 and bails out with Tracker->tooManyUses().

The bottleneck here is the number of calls to PointerMayBeCaptured * the
basic block scan. In a testcase with a 82k instruction BB,
PointerMayBeCaptured is called 130k times, leading to 'shouldExplore'
taking 527k runs, this currently takes ~12min.

To fix this we locally (within PointerMayBeCaptured) number the
instructions in the basic block using a DenseMap to cache instruction
positions/numbers. We build the cache incrementally every time we need
to scan an unexplored part of the BB, improving compile time to only
take ~2min.

This triggers in the flow: DeadStoreElimination -> MepDepAnalysis ->
CaptureTracking.

Side note: after multiple runs in the test-suite I've seen no
performance nor compile time regressions, but could note a couple of
compile time improvements:

Performance Improvements - Compile Time Delta Previous  Current StdDev
SingleSource/Benchmarks/Misc-C++/bigfib -4.48%  0.8547  0.8164  0.0022
MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl -1.47% 1.3912  1.3707  0.0056

Differential Revision: http://reviews.llvm.org/D7010

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240560 91177308-0d34-0410-b5e6-96231b3b80d8
include/llvm/Analysis/CFG.h
lib/Analysis/CFG.cpp
lib/Analysis/CaptureTracking.cpp