[X86][SSE] Vectorize CTTZ + CTTZ_ZERO_UNDEF
author Simon Pilgrim <llvm-dev@redking.me.uk>
Sat, 19 Sep 2015 13:22:57 +0000 (13:22 +0000)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Sat, 19 Sep 2015 13:22:57 +0000 (13:22 +0000)
commit afa71f40bf15a903f52a34109896adfb0b4cd873
tree 9fe5b343e35bec8596dcbfaa81f2379a52c6f70f
parent 02dd54df02bf8b3c18a623a27fd2cc2053b7122a

Now that we have fast vector CTPOP implementations, we can use them to speed up vector CTTZ with the pattern (cttz(x) = ctpop((x & -x) - 1)).
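
As a sanity check on the identity, here is a minimal scalar sketch of what one 32-bit lane computes. The helper names popcount32 and cttz32 are hypothetical and are not taken from X86ISelLowering.cpp; the actual lowering builds vector DAG nodes rather than running a loop.

```cpp
#include <cstdint>

// Portable bit count, standing in for a per-lane vector CTPOP.
// (Hypothetical helper for illustration only.)
constexpr unsigned popcount32(std::uint32_t v) {
  unsigned n = 0;
  for (; v != 0; v &= v - 1) // clear the lowest set bit each iteration
    ++n;
  return n;
}

// cttz(x) = ctpop((x & -x) - 1)
constexpr unsigned cttz32(std::uint32_t x) {
  std::uint32_t lowest = x & (std::uint32_t{0} - x); // isolate lowest set bit
  return popcount32(lowest - 1u); // count the bits strictly below it
}
```

For x == 0 the isolated bit is 0, the subtraction wraps to all ones, and the count equals the lane width, so the same pattern also covers the zero-defined CTTZ case.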

Additionally, on AVX512CD targets, which provide vector lzcnt instructions, we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x)). This form is only valid for nonzero x, so it applies to CTTZ_ZERO_UNDEF.
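
The leading-zero form can be modelled the same way. This is a scalar sketch for a single 32-bit lane; ctlz32 and cttz32_zero_undef are hypothetical names, and the loop merely stands in for a per-lane vector lzcnt.

```cpp
#include <cstdint>

// Portable leading-zero count, standing in for a per-lane vector lzcnt.
// (Hypothetical helper for illustration only.) Returns 32 for v == 0.
constexpr unsigned ctlz32(std::uint32_t v) {
  unsigned n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// cttz_undef(x) = (width - 1) - ctlz(x & -x), with width == 32 here.
// Only meaningful for x != 0: for x == 0 the subtraction wraps around.
constexpr unsigned cttz32_zero_undef(std::uint32_t x) {
  std::uint32_t lowest = x & (std::uint32_t{0} - x); // isolate lowest set bit
  return 31u - ctlz32(lowest);
}
```

Because x & -x has exactly one bit set when x is nonzero, its leading-zero count pins down that bit's position from the top, and subtracting from width - 1 converts it to a position from the bottom.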

Differential Revision: http://reviews.llvm.org/D12663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248091 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/vector-tzcnt-128.ll
test/CodeGen/X86/vector-tzcnt-256.ll
test/CodeGen/X86/vector-tzcnt-512.ll