author Simon Pilgrim <llvm-dev@redking.me.uk>
Mon, 6 Apr 2015 18:39:00 +0000 (18:39 +0000)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Mon, 6 Apr 2015 18:39:00 +0000 (18:39 +0000)
commit 2ec72426009ce0462398fb7b370cd7ed58213db5
tree a5adfcd67dad29c7545b2fc13afd39dc9daafe05
parent 000ffacf53fbf332be37575cb908a0534c6a203b
[X86][SSE] Use (V)PINSRB for direct byte insertion in 16i8 buildvector on SSE4.1 targets

This patch allows SSE4.1 targets to use (V)PINSRB to create 16i8 vectors by inserting i8 scalars directly into an XMM register, instead of merging pairs of i8 scalars into an i16 and using the SSE2 PINSRW instruction.

Inserting bytes directly also allows byte loads to be folded into the insert and reduces scalar register usage.
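
The difference between the two lowering strategies can be sketched with a small scalar model. This is not the code from X86ISelLowering.cpp; it only imitates the lane semantics of PINSRW and PINSRB on a 16-byte array, and every name in it (Bytes16, insert_word, insert_byte, build_sse2_style, build_sse41_style) is hypothetical and chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an XMM register viewed as 16 bytes.
using Bytes16 = std::array<std::uint8_t, 16>;

// Models PINSRW: overwrite 16-bit lane `lane` (0..7), little-endian.
inline void insert_word(Bytes16 &v, std::uint16_t w, unsigned lane) {
  assert(lane < 8);
  v[2 * lane] = static_cast<std::uint8_t>(w & 0xFF);  // low byte
  v[2 * lane + 1] = static_cast<std::uint8_t>(w >> 8); // high byte
}

// Models PINSRB: overwrite 8-bit lane `lane` (0..15).
inline void insert_byte(Bytes16 &v, std::uint8_t b, unsigned lane) {
  assert(lane < 16);
  v[lane] = b;
}

// SSE2-style build: each pair of bytes is first merged into one i16 with
// scalar shift/or work, then inserted with a word insert (8 inserts total).
inline Bytes16 build_sse2_style(const std::uint8_t (&src)[16]) {
  Bytes16 v{};
  for (unsigned i = 0; i < 8; ++i) {
    const std::uint16_t merged =
        static_cast<std::uint16_t>(src[2 * i] | (src[2 * i + 1] << 8));
    insert_word(v, merged, i);
  }
  return v;
}

// SSE4.1-style build: each byte is inserted directly (16 inserts total),
// with no scalar merging step in between.
inline Bytes16 build_sse41_style(const std::uint8_t (&src)[16]) {
  Bytes16 v{};
  for (unsigned i = 0; i < 16; ++i)
    insert_byte(v, src[i], i);
  return v;
}
```

Both builders produce the same vector for any input. The model only shows where the work moves: the byte-wise path uses more inserts but drops the scalar merge, and that merge is the step that ties up scalar registers and keeps a byte load from feeding the insert directly.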

Differential Revision: http://reviews.llvm.org/D8839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234193 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/vec_cast2.ll
test/CodeGen/X86/vector-shuffle-128-v16.ll