AMDGPU: Split x8 and x16 vector loads instead of scalarize
author Matt Arsenault <Matthew.Arsenault@amd.com>
Tue, 24 Nov 2015 12:05:03 +0000 (12:05 +0000)
committer Matt Arsenault <Matthew.Arsenault@amd.com>
Tue, 24 Nov 2015 12:05:03 +0000 (12:05 +0000)
commit 04abf1ee5f22e233c5779226091c1cc4e51f1c15
tree 14cb1e241cf50be6aa86a48f0f161a320d0433f6
parent 25203ad3b36ea05f16e49c1999940f1d872c866b

The one regression in the builtin tests is in the read2 test, which now
(again) has many extra copies; this should be resolved once the pass
is replaced with a DAG combine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253974 91177308-0d34-0410-b5e6-96231b3b80d8
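
To illustrate the difference between splitting and scalarizing a wide vector load, here is a minimal standalone sketch. It is not the code from this commit and does not use any LLVM API. The names scalarizeLoad, splitLoad, and kMaxLegalElts are hypothetical, and the assumed widest legal load of 4 elements is an illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: a load is described only by its element count.
// kMaxLegalElts is an assumed widest legal vector load (not from the commit).
constexpr std::size_t kMaxLegalElts = 4;

// Scalarizing: emit one single-element load per vector element.
std::vector<std::size_t> scalarizeLoad(std::size_t numElts) {
    return std::vector<std::size_t>(numElts, 1);
}

// Splitting: halve the load until every piece fits the assumed legal width.
std::vector<std::size_t> splitLoad(std::size_t numElts) {
    if (numElts <= kMaxLegalElts)
        return {numElts};
    std::size_t lo = numElts / 2;
    std::vector<std::size_t> parts = splitLoad(lo);
    std::vector<std::size_t> hi = splitLoad(numElts - lo);
    parts.insert(parts.end(), hi.begin(), hi.end());
    return parts;
}
```

Under these assumptions, an x8 load splits into two x4 pieces and an x16 load into four, whereas scalarizing yields 8 and 16 single-element loads respectively.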
lib/Target/AMDGPU/AMDGPUISelLowering.cpp
lib/Target/AMDGPU/SIISelLowering.cpp
test/CodeGen/AMDGPU/cvt_f32_ubyte.ll
test/CodeGen/AMDGPU/ds_read2_superreg.ll
test/CodeGen/AMDGPU/global-extload-i32.ll
test/CodeGen/AMDGPU/half.ll
test/CodeGen/AMDGPU/load.ll
test/CodeGen/AMDGPU/merge-stores.ll
test/CodeGen/AMDGPU/reorder-stores.ll
test/CodeGen/AMDGPU/salu-to-valu.ll