This should use fiadd on chips where it is profitable:
double foo(double P, int *I) { return P+*I; }
+We have fiadd patterns now but the followings have the same cost and
+complexity. We need a way to specify the later is more profitable.
+
+def FpADD32m : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW,
+ [(set RFP:$dst, (fadd RFP:$src1,
+ (extloadf64f32 addr:$src2)))]>;
+ // ST(0) = ST(0) + [mem32]
+
+def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW,
+ [(set RFP:$dst, (fadd RFP:$src1,
+ (X86fild addr:$src2, i32)))]>;
+ // ST(0) = ST(0) + [mem32int]
+
//===---------------------------------------------------------------------===//
The FP stackifier needs to be global. Also, it should handle simple permutates
//===---------------------------------------------------------------------===//
-The x86 backend currently supports dynamic-no-pic. Need to add asm
-printer support for static and PIC.
-
-//===---------------------------------------------------------------------===//
-
We should generate bts/btr/etc instructions on targets where they are cheap or
when codesize is important. e.g., for:
//===---------------------------------------------------------------------===//
-Use fisttp to do FP to integer conversion whenever it is available.
-
-//===---------------------------------------------------------------------===//
-
Instead of the following for memset char*, 1, 10:
movl $16843009, 4(%edx)
which is probably slower, but it's interesting at least :)
+//===---------------------------------------------------------------------===//
+
+Currently the x86 codegen isn't very good at mixing SSE and FPStack
+code:
+
+unsigned int foo(double x) { return x; }
+
+foo:
+ subl $20, %esp
+ movsd 24(%esp), %xmm0
+ movsd %xmm0, 8(%esp)
+ fldl 8(%esp)
+ fisttpll (%esp)
+ movl (%esp), %eax
+ addl $20, %esp
+ ret
+
+This will be solved when we go to a dynamic programming based isel.