Added more insertps optimizations
author Filipe Cabecinhas <me@filcab.net>
Mon, 19 May 2014 19:45:57 +0000 (19:45 +0000)
committer Filipe Cabecinhas <me@filcab.net>
Mon, 19 May 2014 19:45:57 +0000 (19:45 +0000)
commit ca162faee2e8f9e9f8938b164a0e959307194baa
tree de63c1a509bacc0839608b29d2c83105172ad611
parent 861e2ef7b0fb312c9acab325c622c7758f3eac0a

Summary:
When inserting an element that comes from a vector load, or from a
broadcast of a vector (or scalar) load, fold the load into the insertps
instruction.
Added PerformINSERTPSCombine for the case where the load has to be
adjusted (a vector load feeding an insertps with a non-zero CountS). With
a memory operand, insertps reads a single 32-bit value and ignores CountS,
so the load address has to point at the selected element.
Added patterns for the broadcasts.
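
To illustrate why a non-zero CountS requires adjusting the load, here is a
minimal standalone sketch that models insertps in plain C++. It is not the
code from this commit: the names InsertPSImm, decodeImm, insertpsReg,
insertpsMem, and narrowLoadAddress are hypothetical and exist only for this
example. The imm8 layout (CountS in bits 7:6, CountD in bits 5:4, ZMask in
bits 3:0) follows the instruction's documented encoding.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec4 = std::array<float, 4>;

// Fields of the insertps imm8 (hypothetical struct for this sketch).
struct InsertPSImm {
  unsigned CountS; // bits 7:6, source element (register form only)
  unsigned CountD; // bits 5:4, destination element
  unsigned ZMask;  // bits 3:0, destination elements to zero
};

constexpr InsertPSImm decodeImm(uint8_t Imm) {
  return {(unsigned(Imm) >> 6) & 3u, (unsigned(Imm) >> 4) & 3u,
          unsigned(Imm) & 0xFu};
}

static_assert(decodeImm(0xA1).CountS == 2, "CountS is bits 7:6");
static_assert(decodeImm(0xA1).CountD == 2, "CountD is bits 5:4");
static_assert(decodeImm(0xA1).ZMask == 1, "ZMask is bits 3:0");

// Shared tail: write the scalar into Dst[CountD], then apply ZMask.
inline Vec4 insertScalar(Vec4 Dst, float Scalar, const InsertPSImm &F) {
  Dst[F.CountD] = Scalar;
  for (unsigned I = 0; I < 4; ++I)
    if (F.ZMask & (1u << I))
      Dst[I] = 0.0f;
  return Dst;
}

// Register form: the source element is selected by CountS.
inline Vec4 insertpsReg(Vec4 Dst, const Vec4 &Src, uint8_t Imm) {
  InsertPSImm F = decodeImm(Imm);
  return insertScalar(Dst, Src[F.CountS], F);
}

// Memory form: the operand is a single float and CountS is ignored.
inline Vec4 insertpsMem(Vec4 Dst, const float *Mem, uint8_t Imm) {
  assert(Mem != nullptr);
  return insertScalar(Dst, *Mem, decodeImm(Imm));
}

// Hypothetical helper: address of the element the register form would
// have read, had the source vector been loaded from Base.
inline const float *narrowLoadAddress(const float *Base, uint8_t Imm) {
  assert(Base != nullptr);
  return Base + decodeImm(Imm).CountS;
}
```

In this model, folding a vector load at address Base into the memory form
gives the same result as the register form only when the address is first
advanced by CountS elements, as narrowLoadAddress does. Passing Base
unchanged would insert element 0 instead.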

Also added tests for SSE4.1, AVX, and AVX2.

Reviewers: delena, nadav, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3581

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209156 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/X86/X86ISelLowering.cpp
lib/Target/X86/X86InstrSSE.td
test/CodeGen/X86/avx.ll
test/CodeGen/X86/fold-load-vec.ll
test/CodeGen/X86/sse41.ll