===-------------------------------------------------------------------------===
-On PPC64, this:
-
-long f2 (long x) { return 0xfffffff000000000UL; }
-
-could compile into:
-
-_f2:
- li r3,-1
- rldicr r3,r3,0,27
- blr
-
-we produce:
-
-_f2:
- lis r2, 4095
- ori r2, r2, 65535
- sldi r3, r2, 36
- blr
-
-===-------------------------------------------------------------------------===
-
This code:
unsigned add32carry(unsigned sum, unsigned x) {
===-------------------------------------------------------------------------===
-Support 'update' load/store instructions. These are cracked on the G5, but are
-still a codesize win.
-
-With preinc enabled, this:
-
-long *%test4(long *%X, long *%dest) {
- %Y = getelementptr long* %X, int 4
- %A = load long* %Y
- store long %A, long* %dest
- ret long* %Y
-}
-
-compiles to:
-
-_test4:
- mr r2, r3
- lwzu r5, 32(r2)
- lwz r3, 36(r3)
- stw r5, 0(r4)
- stw r3, 4(r4)
- mr r3, r2
- blr
-
-with -sched=list-burr, I get:
-
-_test4:
- lwz r2, 36(r3)
- lwzu r5, 32(r3)
- stw r2, 4(r4)
- stw r5, 0(r4)
- blr
-
-===-------------------------------------------------------------------------===
-
We compile the hottest inner loop of viterbi to:
li r6, 0
===-------------------------------------------------------------------------===
-No loads or stores of the constants should be needed:
-
-struct foo { double X, Y; };
-void xxx(struct foo F);
-void bar() { struct foo R = { 1.0, 2.0 }; xxx(R); }
-
-===-------------------------------------------------------------------------===
-
Darwin Stub removal:
We still generate calls to foo$stub, and stubs, on Darwin. This is not
===-------------------------------------------------------------------------===
-Compile this:
-
-int foo(int a) {
- int b = (a < 8);
- if (b) {
- return b * 3; // ignore the fact that this is always 3.
- } else {
- return 2;
- }
-}
-
-into something not this:
-
-_foo:
-1) cmpwi cr7, r3, 8
- mfcr r2, 1
- rlwinm r2, r2, 29, 31, 31
-1) cmpwi cr0, r3, 7
- bgt cr0, LBB1_2 ; UnifiedReturnBlock
-LBB1_1: ; then
- rlwinm r2, r2, 0, 31, 31
- mulli r3, r2, 3
- blr
-LBB1_2: ; UnifiedReturnBlock
- li r3, 2
- blr
-
-In particular, the two compares (marked 1) could be shared by reversing one.
-This could be done in the dag combiner, by swapping a BR_CC when a SETCC of the
-same operands (but backwards) exists. In this case, this wouldn't save us
-anything though, because the compares still wouldn't be shared.
-
-===-------------------------------------------------------------------------===
-
-We should custom expand setcc instead of pretending that we have it. That
-would allow us to expose the access of the crbit after the mfcr, allowing
-that access to be trivially folded into other ops. A simple example:
-
-int foo(int a, int b) { return (a < b) << 4; }
-
-compiles into:
-
-_foo:
- cmpw cr7, r3, r4
- mfcr r2, 1
- rlwinm r2, r2, 29, 31, 31
- slwi r3, r2, 4
- blr
-
-===-------------------------------------------------------------------------===
-
Fold add and sub with constant into non-extern, non-weak addresses so this:
static int a;
===-------------------------------------------------------------------------===
-We generate really bad code for this:
-
-int f(signed char *a, _Bool b, _Bool c) {
- signed char t = 0;
- if (b) t = *a;
- if (c) *a = t;
-}
-
-===-------------------------------------------------------------------------===
-
-This:
-int test(unsigned *P) { return *P >> 24; }
-
-Should compile to:
-
-_test:
- lbz r3,0(r3)
- blr
-
-not:
-
-_test:
- lwz r2, 0(r3)
- srwi r3, r2, 24
- blr
-
-===-------------------------------------------------------------------------===
-
-On the G5, logical CR operations are more expensive in their three
-address form: ops that read/write the same register are half as expensive as
-those that read from two registers that are different from their destination.
-
-We should model this with two separate instructions. The isel should generate
-the "two address" form of the instructions. When the register allocator
-detects that it needs to insert a copy due to the two-addresness of the CR
-logical op, it will invoke PPCInstrInfo::convertToThreeAddress. At this point
-we can convert to the "three address" instruction, to save code space.
-
-This only matters when we start generating cr logical ops.
-
-===-------------------------------------------------------------------------===
-
We should compile these two functions to the same thing:
#include <stdlib.h>
===-------------------------------------------------------------------------===
-float foo(float X) { return (int)(X); }
-
-Currently produces:
-
-_foo:
- fctiwz f0, f1
- stfd f0, -8(r1)
- lwz r2, -4(r1)
- extsw r2, r2
- std r2, -16(r1)
- lfd f0, -16(r1)
- fcfid f0, f0
- frsp f1, f0
- blr
-
-We could use a target dag combine to turn the lwz/extsw into an lwa when the
-lwz has a single use. Since LWA is cracked anyway, this would be a codesize
-win only.
-
-===-------------------------------------------------------------------------===
-
We generate ugly code for this:
void func(unsigned int *ret, float dx, float dy, float dz, float dw) {
===-------------------------------------------------------------------------===
-test/CodeGen/PowerPC/2007-03-24-cntlzd.ll compiles to:
-
-__ZNK4llvm5APInt17countLeadingZerosEv:
- ld r2, 0(r3)
- cntlzd r2, r2
- or r2, r2, r2 <<-- silly.
- addi r3, r2, -64
- blr
-
-The dead or is a 'truncate' from 64- to 32-bits.
-
-===-------------------------------------------------------------------------===
-
We generate horrible ppc code for this:
#define N 2000000