Add support to the local allocator for fusing spill code into the instructions
authorChris Lattner <sabre@nondot.org>
Tue, 17 Feb 2004 08:09:40 +0000 (08:09 +0000)
committerChris Lattner <sabre@nondot.org>
Tue, 17 Feb 2004 08:09:40 +0000 (08:09 +0000)
commit11390e76e73d36e62c981069a67d1d33823262de
tree3a8e3085beea860e71cb73ff32f271aabc2b1520
parent18bd7bb4d453e014c39729c4bf2b43f1846b8a9a
Add support to the local allocator for fusing spill code into the instructions
that need them.  This is very useful on CISCy targets like the X86 because it
reduces the total spill pressure, and makes better use of it's (large)
instruction set.  Though the X86 backend doesn't know how to rewrite many
instructions yet, this already makes a substantial difference on 176.gcc for
example:

Before:
Time:
   8.0099 ( 31.2%)   0.0100 ( 12.5%)   8.0199 ( 31.2%)   7.7186 ( 30.0%)  Local Register Allocator

Code quality:
734559 asm-printer           - Number of machine instrs printed
111395 ra-local              - Number of registers reloaded
 79902 ra-local              - Number of registers spilled
231554 x86-peephole          - Number of peephole optimization performed

After:
Time:
   7.8700 ( 30.6%)   0.0099 ( 19.9%)   7.8800 ( 30.6%)   7.7892 ( 30.2%)  Local Register Allocator
Code quality:
733083 asm-printer           - Number of machine instrs printed
  2379 ra-local              - Number of reloads fused into instructions
109046 ra-local              - Number of registers reloaded
 79881 ra-local              - Number of registers spilled
230658 x86-peephole          - Number of peephole optimization performed

So by fusing 2300 instructions, we reduced the  static number of instructions
by 1500, and reduces the number of peepholes (and thus the work) by about 900.
This also clearly reduces the number of reload/spill instructions that are
emitted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@11542 91177308-0d34-0410-b5e6-96231b3b80d8
lib/CodeGen/RegAllocLocal.cpp