//===---------------------------------------------------------------------===// // Random ideas for the X86 backend: FP stack related stuff //===---------------------------------------------------------------------===// //===---------------------------------------------------------------------===// Some targets (e.g. athlons) prefer freep to fstp ST(0): http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html //===---------------------------------------------------------------------===// This should use fiadd on chips where it is profitable: double foo(double P, int *I) { return P+*I; } We have fiadd patterns now but the followings have the same cost and complexity. We need a way to specify the later is more profitable. def FpADD32m : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW, [(set RFP:$dst, (fadd RFP:$src1, (extloadf64f32 addr:$src2)))]>; // ST(0) = ST(0) + [mem32] def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW, [(set RFP:$dst, (fadd RFP:$src1, (X86fild addr:$src2, i32)))]>; // ST(0) = ST(0) + [mem32int] //===---------------------------------------------------------------------===// The FP stackifier should handle simple permutates to reduce number of shuffle instructions, e.g. turning: fld P -> fld Q fld Q fld P fxch or: fxch -> fucomi fucomi jl X jg X Ideas: http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html //===---------------------------------------------------------------------===// Add a target specific hook to DAG combiner to handle SINT_TO_FP and FP_TO_SINT when the source operand is already in memory. //===---------------------------------------------------------------------===// Open code rint,floor,ceil,trunc: http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html Opencode the sincos[f] libcall. //===---------------------------------------------------------------------===// None of the FPStack instructions are handled in X86RegisterInfo::foldMemoryOperand, which prevents the spiller from folding spill code into the instructions. //===---------------------------------------------------------------------===// Currently the x86 codegen isn't very good at mixing SSE and FPStack code: unsigned int foo(double x) { return x; } foo: subl $20, %esp movsd 24(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) fisttpll (%esp) movl (%esp), %eax addl $20, %esp ret This just requires being smarter when custom expanding fptoui. //===---------------------------------------------------------------------===//