X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 ...
author Benjamin Kramer <benny.kra@googlemail.com>
Fri, 22 Apr 2011 15:30:40 +0000 (15:30 +0000)
committer Benjamin Kramer <benny.kra@googlemail.com>
Fri, 22 Apr 2011 15:30:40 +0000 (15:30 +0000)
commit b20a8fc8a6bf57dbde0e9238cf535abb4326dc80
tree 5ae0148f9eb1ca78f9934cdc6a0d618f89a8946c
parent eab631362d676c0113e052cc7e877eef4da544b8
X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) << C1. (Part of PR5039)
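
The rewrite is only valid because both forms compute the same value. The following is a minimal sketch in C, not code from the patch, that states both forms side by side; the function names are hypothetical, and it assumes 64-bit unsigned values with a logical shift by C1 < 64.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: shift first, then mask with the wide constant C2. */
static uint64_t shift_then_mask(uint64_t x, unsigned c1, uint64_t c2) {
    return (x << c1) & c2;
}

/* Rewritten form: mask with the narrowed constant (C2 >> C1), then shift. */
static uint64_t mask_then_shift(uint64_t x, unsigned c1, uint64_t c2) {
    return (x & (c2 >> c1)) << c1;
}

/* Returns 1 when both forms agree for the given inputs.
 * Shifting a 64-bit value by 64 or more is undefined in C,
 * so c1 is required to be below 64. */
static int forms_agree(uint64_t x, unsigned c1, uint64_t c2) {
    assert(c1 < 64);
    return shift_then_mask(x, c1, c2) == mask_then_shift(x, c1, c2);
}
```

The two forms agree because X << C1 always has its low C1 bits clear, so the bits of C2 that are lost by shifting it right by C1 could never have matched anything in the first place.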

This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is
uint64_t foo(uint64_t x) { return (x&1) << 42; }
which used to compile into bloated code:
shlq $42, %rdi               ## encoding: [0x48,0xc1,0xe7,0x2a]
movabsq $4398046511104, %rax    ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00]
andq %rdi, %rax              ## encoding: [0x48,0x21,0xf8]
ret                             ## encoding: [0xc3]

With this patch, we can fold the immediate into the and, shrinking the function from 18 bytes to 12:
andq $1, %rdi                ## encoding: [0x48,0x83,0xe7,0x01]
movq %rdi, %rax              ## encoding: [0x48,0x89,0xf8]
shlq $42, %rax               ## encoding: [0x48,0xc1,0xe0,0x2a]
ret                             ## encoding: [0xc3]
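
The saving comes from the width of the immediate. The 64-bit `and` accepts a sign-extended 8-bit or 32-bit immediate, while a constant such as 1 << 42 fits neither and has to be materialized separately with `movabsq`. The sketch below is a hypothetical helper, not the logic in X86ISelDAGToDAG.cpp, and classifies a mask by the immediate size it would need.

```c
#include <stdint.h>

/* Hypothetical helper: number of immediate bytes needed to use `mask`
 * as the constant operand of a 64-bit and.
 *   1 -> fits a sign-extended 8-bit immediate
 *   4 -> fits a sign-extended 32-bit immediate
 *   8 -> fits neither; needs a separate 64-bit constant load (movabsq)
 * The range checks are done on unsigned values to avoid
 * implementation-defined signed conversions. */
static unsigned and_imm_bytes(uint64_t mask) {
    if (mask <= 0x7fULL || mask >= 0xffffffffffffff80ULL)
        return 1;
    if (mask <= 0x7fffffffULL || mask >= 0xffffffff80000000ULL)
        return 4;
    return 8;
}

/* Immediate size after the rewrite, where the mask becomes C2 >> C1. */
static unsigned narrowed_imm_bytes(uint64_t c2, unsigned c1) {
    return and_imm_bytes(c2 >> c1);
}
```

For the example above, C2 = 1 << 42 needs the full 8-byte constant, while the narrowed mask C2 >> 42 = 1 fits in a single byte, which matches the `andq $1, %rdi` encoding shown.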

It's possible to save another byte by using 'andl' instead of 'andq', but I currently see no way of doing
that without making this code even more complicated. See the TODOs in the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129990 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/X86/X86ISelDAGToDAG.cpp
test/CodeGen/X86/narrow-shl-cst.ll [new file with mode: 0644]