[x86] Teach the decomposed shuffle/blend lowering to use an early blend
author Chandler Carruth <chandlerc@gmail.com>
Sun, 15 Feb 2015 12:42:15 +0000 (12:42 +0000)
committer Chandler Carruth <chandlerc@gmail.com>
Sun, 15 Feb 2015 12:42:15 +0000 (12:42 +0000)
commit fbde8bffba6b7ccd461fa82dc812bc3f3b609b1a
tree b4f27d56ee42ba3d97215b4137395421e204e1a4
parent 72753f87f2b80d66cfd7ca7c7b6c0db6737d4b24
[x86] Teach the decomposed shuffle/blend lowering to use an early blend
when that will allow it to lower with a single permute instead of
multiple permutes.

It tries to detect when either ordering would need only a single
permute, and in that case keeps the existing permute-then-blend order
to maximize folding of loads and similar optimizations.
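
As a rough illustration of the idea, here is a standalone sketch that
models shuffle masks as plain integer vectors. It is not the code in
X86ISelLowering.cpp: the names Mask, BlendThenPermute,
tryBlendThenPermute, isNoopMask and countPermutes are hypothetical, and
the mask encoding (-1 for undef, [0, N) for the first input, [N, 2N) for
the second) is an assumption chosen to match the usual shuffle-mask
convention. It only counts how many permutes each strategy would need.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Assumed encoding: -1 = undef, [0, N) = lane of V1, [N, 2N) = lane of V2.
using Mask = std::vector<int>;

struct BlendThenPermute {
  Mask Blend;   // Blend[j] is j (lane j of V1), j + N (lane j of V2), or -1.
  Mask Permute; // Single-input permute applied to the blended vector.
};

// A blend keeps every lane in place and picks one source per lane, so
// blending first only works if no lane index is needed from both inputs.
std::optional<BlendThenPermute> tryBlendThenPermute(const Mask &M) {
  const int N = static_cast<int>(M.size());
  BlendThenPermute R{Mask(N, -1), Mask(N, -1)};
  for (int i = 0; i < N; ++i) {
    if (M[i] < 0)
      continue; // Undef output lane: nothing to place.
    assert(M[i] < 2 * N && "shuffle input is out of bounds");
    const int Lane = M[i] % N;
    if (R.Blend[Lane] == -1)
      R.Blend[Lane] = M[i];
    else if (R.Blend[Lane] != M[i])
      return std::nullopt; // Same lane wanted from both V1 and V2.
    R.Permute[i] = Lane;
  }
  return R;
}

// True if a single-input mask leaves every defined lane where it already is.
bool isNoopMask(const Mask &M) {
  for (int i = 0, N = static_cast<int>(M.size()); i < N; ++i)
    if (M[i] >= 0 && M[i] != i)
      return false;
  return true;
}

// Number of permutes needed under the heuristic described above.
int countPermutes(const Mask &M) {
  const int N = static_cast<int>(M.size());
  Mask V1(N, -1), V2(N, -1);
  for (int i = 0; i < N; ++i) {
    if (M[i] >= 0 && M[i] < N)
      V1[i] = M[i];
    else if (M[i] >= N)
      V2[i] = M[i] - N;
  }
  const bool V1Noop = isNoopMask(V1), V2Noop = isNoopMask(V2);
  // Blend first only when permuting the inputs would cost two permutes;
  // otherwise keep permute-then-blend so a load can fold into the permute.
  if (!V1Noop && !V2Noop && tryBlendThenPermute(M).has_value())
    return 1;
  return (V1Noop ? 0 : 1) + (V2Noop ? 0 : 1);
}
```

Under these assumptions, the 4-lane mask {1, 4, 3, 6} needs one permute
after an early blend instead of two, while {0, 4, 1, 5} wants lane 0
from both inputs and so still needs a permute on each side.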

This cuts a *lot* of the avx2 shuffle permute counts in half. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229309 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/vector-shuffle-256-v16.ll
test/CodeGen/X86/vector-shuffle-256-v32.ll
test/CodeGen/X86/vector-shuffle-256-v4.ll
test/CodeGen/X86/vector-shuffle-256-v8.ll
test/CodeGen/X86/vector-shuffle-512-v8.ll