Programming Languages Research Group: Git

author	Sanjay Patel <spatel@rotateright.com>
	Thu, 5 Mar 2015 21:46:54 +0000 (21:46 +0000)
committer	Sanjay Patel <spatel@rotateright.com>
	Thu, 5 Mar 2015 21:46:54 +0000 (21:46 +0000)
commit	5f79fd2f020fe5f389b73ac8b3ea99d461c1985d
tree	07773a1a515e9a4adabcabebe6d7f7488945aa78
parent	7dc13e4d4b7ae6451587efb073116b3fe25a180b

[AVX] Lower / fast-isel scalar FP selects into VBLENDV instructions (PR22483)

This patch reduces code size for all AVX targets and increases speed for some chips.

SSE 4.1 introduced BLENDV only in a 2-register form, which is useless for this
purpose (see code comments), and only in the packed float/double flavors.

AVX subsequently made the instruction useful by adding a 4-register operand form.

So we just need to paper over the lack of scalar forms of this instruction, complicate
the code to choose float or double forms, and use blendv on scalars since all FP is in
xmm registers anyway.
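
To illustrate why a blend can stand in for the older logic sequence, here is a minimal portable sketch in C++. It is not the code from this patch: it only models the bit-level behavior on a single 32-bit float lane, and every name in it (to_bits, cmp_mask, select_logic, select_blendv, check_models) is hypothetical. It assumes the compare produces an all-ones or all-zeros mask, as SSE scalar FP compares do.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Bit-cast helpers; memcpy keeps this valid before C++20's std::bit_cast.
static std::uint32_t to_bits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

static float from_bits(std::uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Model of an SSE scalar compare result: all ones if true, all zeros if false.
static std::uint32_t cmp_mask(bool cond) { return cond ? 0xFFFFFFFFu : 0u; }

// Logic-style select: three bitwise operations (AND, ANDN, OR).
static float select_logic(std::uint32_t mask, float t, float f) {
  return from_bits((mask & to_bits(t)) | (~mask & to_bits(f)));
}

// Blend-style select: a single operation that inspects only the sign bit
// of the mask lane, which is how the BLENDV family chooses each element.
static float select_blendv(std::uint32_t mask, float t, float f) {
  return (mask & 0x80000000u) ? t : f;
}

// Self-check: for a compare-generated mask, both models pick the same value.
static void check_models() {
  const float t = 1.5f, f = -2.0f;
  assert(select_logic(cmp_mask(true), t, f) == select_blendv(cmp_mask(true), t, f));
  assert(select_logic(cmp_mask(false), t, f) == select_blendv(cmp_mask(false), t, f));
}
```

Because a compare mask is all ones or all zeros, its sign bit carries the whole condition, so the single blend gives the same result as the three logic operations. Only the low lane matters for a scalar value, so the contents of the upper lanes of the xmm register can be ignored.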

This gives us a speedup of roughly 45% (43.15 / 29.73 is about 1.45x) for a blendv
microbenchmark sequence on SandyBridge and Haswell:
blendv : 29.73 cycles/iter
logic : 43.15 cycles/iter

No new test cases with this patch because:

1. fast-isel-select-sse.ll tests the positive side for regular X86 lowering and fast-isel.
2. sse-minmax.ll and fp-select-cmp-and.ll confirm that we're not firing for scalar selects without AVX.
3. fp-select-cmp-and.ll and logical-load-fold.ll confirm that we're not firing for scalar selects with constants.

http://llvm.org/bugs/show_bug.cgi?id=22483

Differential Revision: http://reviews.llvm.org/D8063

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231408 91177308-0d34-0410-b5e6-96231b3b80d8

lib/Target/X86/X86FastISel.cpp
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/fast-isel-select-sse.ll