Programming Languages Research Group: Git

author	Chandler Carruth <chandlerc@gmail.com>
	Fri, 3 Oct 2014 21:38:49 +0000 (21:38 +0000)
committer	Chandler Carruth <chandlerc@gmail.com>
	Fri, 3 Oct 2014 21:38:49 +0000 (21:38 +0000)
commit	91ea3e41ae46348d520e9cdf8123748d01b2a46a
tree	1c46a7f4385502e0f2873ed9d35b86e2f67b7b67
parent	69ee7cb4c3a7736574587d007b8002c5aa02914e

[x86] Adjust the patterns for lowering X86vzmovl nodes which don't
perform a load to use blendps rather than movss when it is available.

For non-loads, blendps is *much* faster. It can execute on two ports in
Sandy Bridge and Ivy Bridge, and *three* ports on Haswell. This fixes
one of the "regressions" from aggressively taking the "insertion" path
in the new vector shuffle lowering.
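
To illustrate why the two lowerings are interchangeable for the non-load case, here is a minimal scalar sketch in C. It is not taken from the patch: the `vec4` type and the `model_*` and `vzmovl_via_*` helpers are hypothetical names that only model lane semantics. It assumes X86vzmovl means "keep lane 0, zero lanes 1 to 3", that register-to-register movss replaces only the low lane of its destination, and that blendps selects each lane by one bit of its immediate.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 4-lane float vector standing in for an XMM register. */
typedef struct { float lane[4]; } vec4;

/* Register-to-register movss: low lane from src, upper lanes kept from dst. */
static vec4 model_movss(vec4 dst, vec4 src) {
    dst.lane[0] = src.lane[0];
    return dst;
}

/* blendps: lane i comes from b when bit i of imm is set, otherwise from a. */
static vec4 model_blendps(vec4 a, vec4 b, unsigned imm) {
    vec4 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = ((imm >> i) & 1u) ? b.lane[i] : a.lane[i];
    return r;
}

/* vzmovl via movss: move the low lane of x into a zeroed register. */
static vec4 vzmovl_via_movss(vec4 x) {
    vec4 zero = {{0.0f, 0.0f, 0.0f, 0.0f}};
    return model_movss(zero, x);
}

/* vzmovl via blendps: blend x into a zeroed register with mask 0b0001. */
static vec4 vzmovl_via_blendps(vec4 x) {
    vec4 zero = {{0.0f, 0.0f, 0.0f, 0.0f}};
    return model_blendps(zero, x, 0x1u);
}

/* Bitwise comparison of two modelled registers. */
static int vec4_equal(vec4 a, vec4 b) {
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Self-check: both lowerings agree on a sample input. */
static void check_vzmovl_models(void) {
    vec4 x = {{1.5f, 2.0f, 3.0f, 4.0f}};
    assert(vec4_equal(vzmovl_via_movss(x), vzmovl_via_blendps(x)));
}
```

Both helpers produce the same register contents, so the choice between them is purely about which instruction the hardware executes faster.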

This does highlight one problem with blendps -- it isn't commuted as
heavily as it should be. That's future work though.
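
As a rough sketch of what commuting a blend involves (again a scalar model, not code from LLVM; `vec4f`, `blendps_model`, and `blendps_commuted` are hypothetical names), swapping the two operands of a 4-lane blend gives the same result provided the 4-bit immediate is inverted:

```c
#include <string.h>

/* Hypothetical 4-lane float vector standing in for an XMM register. */
typedef struct { float lane[4]; } vec4f;

/* blendps: lane i comes from b when bit i of imm is set, otherwise from a. */
static vec4f blendps_model(vec4f a, vec4f b, unsigned imm) {
    vec4f r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = ((imm >> i) & 1u) ? b.lane[i] : a.lane[i];
    return r;
}

/* Commuted form: operands swapped, low four mask bits inverted. */
static vec4f blendps_commuted(vec4f a, vec4f b, unsigned imm) {
    return blendps_model(b, a, ~imm & 0xFu);
}

/* Returns 1 if the commuted form matches the original for all 16 masks. */
static int commute_holds_for_all_masks(vec4f a, vec4f b) {
    for (unsigned imm = 0; imm < 16; ++imm) {
        vec4f x = blendps_model(a, b, imm);
        vec4f y = blendps_commuted(a, b, imm);
        if (memcmp(&x, &y, sizeof x) != 0)
            return 0;
    }
    return 1;
}
```

Because the identity holds for every mask, a compiler is free to pick whichever operand order suits it, adjusting the immediate to match.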

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219022 91177308-0d34-0410-b5e6-96231b3b80d8

lib/Target/X86/X86InstrInfo.td
lib/Target/X86/X86InstrSSE.td
test/CodeGen/X86/combine-or.ll
test/CodeGen/X86/sse41.ll
test/CodeGen/X86/vec_set-3.ll
test/CodeGen/X86/vector-shuffle-128-v4.ll
test/CodeGen/X86/vector-shuffle-256-v4.ll