Remove the type legality check from the SelectionDAGBuilder when it lowers @llvm...
authorOwen Anderson <resistor@mac.com>
Sat, 23 Mar 2013 08:26:53 +0000 (08:26 +0000)
committerOwen Anderson <resistor@mac.com>
Sat, 23 Mar 2013 08:26:53 +0000 (08:26 +0000)
Performing this check unilaterally prevented us from generating FMAs when the incoming IR contained illegal vector types which would eventually be legalized to underlying types that *did* support FMA.
For example, an @llvm.fmuladd on an OpenCL float16 should become a sequence of float4 FMAs, not float4 fmul+fadd's.

NOTE: Because we still call the target-specific profitability hook, individual targets can reinstate the old behavior, if desired, by simply performing the legality check inside their callback hook.  They can also perform more sophisticated legality checks, if, for example, some illegal vector types can be productively implemented as FMAs, but not others.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177820 91177308-0d34-0410-b5e6-96231b3b80d8

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
test/CodeGen/X86/wide-fma-contraction.ll [new file with mode: 0644]

index 33d100eb3a063721eccd4a82d6ea1ab8d8dfcd79..ce40cd6a0c9c6790af31c987e9ce7cdc3a87ab1f 100644 (file)
@@ -4914,7 +4914,6 @@ SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I, unsigned Intrinsic) {
   case Intrinsic::fmuladd: {
     EVT VT = TLI.getValueType(I.getType());
     if (TM.Options.AllowFPOpFusion != FPOpFusion::Strict &&
-        TLI.isOperationLegalOrCustom(ISD::FMA, VT) &&
         TLI.isFMAFasterThanMulAndAdd(VT)){
       setValue(&I, DAG.getNode(ISD::FMA, dl,
                                getValue(I.getArgOperand(0)).getValueType(),
diff --git a/test/CodeGen/X86/wide-fma-contraction.ll b/test/CodeGen/X86/wide-fma-contraction.ll
new file mode 100644 (file)
index 0000000..d93f33b
--- /dev/null
@@ -0,0 +1,20 @@
+; RUN: llc -march=x86 -mattr=+fma4 -mtriple=x86_64-apple-darwin < %s | FileCheck %s
+
+; CHECK: fmafunc
+define <16 x float> @fmafunc(<16 x float> %a, <16 x float> %b, <16 x float> %c) {
+; CHECK-NOT: vmulps
+; CHECK-NOT: vaddps
+; CHECK: vfmaddps
+; CHECK-NOT: vmulps
+; CHECK-NOT: vaddps
+; CHECK: vfmaddps
+; CHECK-NOT: vmulps
+; CHECK-NOT: vaddps
+  %ret = tail call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %a, <16 x float> %b, <16 x float> %c)
+  ret <16 x float> %ret
+}
+
+declare <16 x float> @llvm.fmuladd.v16f32(<16 x float>, <16 x float>, <16 x float>) nounwind readnone
+
+
+