Programming Languages Research Group: Git

author	Evan Cheng <evan.cheng@apple.com>
	Wed, 19 Dec 2012 20:16:09 +0000 (20:16 +0000)
committer	Evan Cheng <evan.cheng@apple.com>
	Wed, 19 Dec 2012 20:16:09 +0000 (20:16 +0000)
commit	733c6b1db1a9a3f78da4fece933ccc7e509bfba0
tree	0fa73a9897a9efd007125f5a9e6d9fc105d305d4
parent	28d24c95ade0e1fe13d40a068179e61ab5622782

LLVM sdisel normalizes bit extraction of the form:
((x & 0xff00) >> 8) << 2
to
(x >> 6) & 0x3fc
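
The two forms compute the same value for any 32-bit x. A minimal standalone check, not part of this patch (the function names are hypothetical and chosen only for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Source-level form: extract bits [15:8], then scale the result by 4.
constexpr uint32_t extract_then_shift(uint32_t x) {
  return ((x & 0xff00u) >> 8) << 2;
}

// Normalized form: the left shift has been folded into the mask,
// which leaves two trailing zero bits in 0x3fc.
constexpr uint32_t normalized(uint32_t x) {
  return (x >> 6) & 0x3fcu;
}

// Spot-check the equivalence on a handful of bit patterns.
inline void check_normalization() {
  const uint32_t samples[] = {0u, 0xff00u, 0x12345678u, 0xdeadbeefu, 0xffffffffu};
  for (uint32_t x : samples)
    assert(extract_then_shift(x) == normalized(x));
}
```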

This is general goodness since it folds a left shift into the mask. However,
the trailing zeros in the mask prevent the ARM backend from using the bit
extraction instructions. Worse, materializing the mask may require an
additional instruction. This comes up fairly frequently when the result of
the bit twiddling is used as a memory address, e.g.

= ptr[(x & 0xFF0000) >> 16]

We want to generate:
  ubfx   r3, r1, #16, #8
  ldr.w  r3, [r0, r3, lsl #2]

vs.
  mov.w  r9, #1020
  and.w  r2, r9, r1, lsr #14
  ldr    r2, [r0, r2]

Add a late ARM-specific isel optimization to
ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift into the
'base + offset' address computation and changes the mask to one without
trailing zeros, which enables the use of ubfx.
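
The arithmetic behind the rewrite can be sketched as below. This is an illustrative model only, not the code in ARMISelDAGToDAG.cpp: the struct, the function names, and the exact legality checks are assumptions, and the real implementation operates on SelectionDAG nodes rather than plain integers.

```cpp
#include <cstdint>

// Hypothetical description of "((x >> srl) & mask) << shl".
struct BitExtract {
  unsigned srl;   // logical right shift applied to x
  uint32_t mask;  // mask applied after the shift
  unsigned shl;   // left shift left over for the addressing mode
};

// Count trailing zero bits of a non-zero value.
inline unsigned trailing_zeros(uint32_t v) {
  unsigned n = 0;
  while ((v & 1u) == 0) { v >>= 1; ++n; }
  return n;
}

// Given the normalized form (x >> srl) & mask, try to move the mask's
// trailing zeros into a separate left shift of 1 or 2. Returns the input
// unchanged (shl == 0) when the pattern does not qualify.
inline BitExtract fold_shift_into_address(unsigned srl, uint32_t mask) {
  BitExtract unchanged{srl, mask, 0};
  if (mask == 0)
    return unchanged;
  unsigned tz = trailing_zeros(mask);
  if (tz != 1 && tz != 2)            // only shifts of 1 or 2 are handled
    return unchanged;
  if (srl + tz >= 32)                // keep the new shift amount in range
    return unchanged;
  uint32_t narrow = mask >> tz;
  if ((narrow & (narrow + 1)) != 0)  // assumed: need a contiguous low-bit field
    return unchanged;
  return BitExtract{srl + tz, narrow, tz};
}

// Evaluate the rewritten form on a concrete value.
inline uint32_t evaluate(uint32_t x, const BitExtract &e) {
  return ((x >> e.srl) & e.mask) << e.shl;
}
```

For the example above, fold_shift_into_address(14, 1020) yields a right shift of 16, a mask of 0xff, and a left shift of 2, which corresponds to the ubfx at bit 16 with width 8 followed by the lsl #2 in the load's addressing mode.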

Note the optimization has to be done late since it's target specific and we
don't want to change the DAG normalization. It's also fairly restrictive,
as shifter operands are not always free. It's only done for left shifts of
1 or 2, which are known to be free on some CPUs and are the most common
shift amounts in address computation.

This is a slight win for blowfish, rijndael, etc.

rdar://12870177

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170581 91177308-0d34-0410-b5e6-96231b3b80d8

lib/Target/ARM/ARMISelDAGToDAG.cpp
test/CodeGen/ARM/bfx.ll