//===---------------------------------------------------------------------===//
+Crazy idea: Consider code that uses lots of 8-bit or 16-bit values. By the
+time regalloc happens, these values are now in a 32-bit register, usually with
+the top-bits known to be sign or zero extended. If spilled, we should be able
+to spill these to a 8-bit or 16-bit stack slot, zero or sign extending as part
+of the reload.
+
+Doing this reduces the size of the stack frame (important for thumb etc), and
+also increases the likelihood that we will be able to reload multiple values
+from the stack with a single load.
+
+//===---------------------------------------------------------------------===//
+
The constant island pass is in good shape. Some cleanups might be desirable,
but there is unlikely to be much improvement in the generated code.
3. There might be some compile-time efficiency to be had by representing
consecutive islands as a single block rather than multiple blocks.
+4. Use a priority queue to sort constant pool users in inverse order of
+ position so we always process the one closed to the end of functions
+ first. This may simply CreateNewWater.
+
//===---------------------------------------------------------------------===//
We need to start generating predicated instructions. The .td files have a way
resulting live interval is not assigned a physical register. It may be
possible (with the help of the scavenger) to turn some spill / restore
pairs into register copies.
+
+//===---------------------------------------------------------------------===//
+
+More LSR enhancements possible:
+
+1. Teach LSR about pre- and post- indexed ops to allow iv increment be merged
+ in a load / store.
+2. Allow iv reuse even when a type conversion is required. For example, i8
+ and i32 load / store addressing modes are identical.
+
+
+//===---------------------------------------------------------------------===//
+
+This:
+
+int foo(int a, int b, int c, int d) {
+ long long acc = (long long)a * (long long)b;
+ acc += (long long)c * (long long)d;
+ return (int)(acc >> 32);
+}
+
+Should compile to use SMLAL (Signed Multiply Accumulate Long) which multiplies
+two signed 32-bit values to produce a 64-bit value, and accumulates this with
+a 64-bit value.
+
+We currently get this with v6:
+
+_foo:
+ mul r12, r1, r0
+ smmul r1, r1, r0
+ smmul r0, r3, r2
+ mul r3, r3, r2
+ adds r3, r3, r12
+ adc r0, r0, r1
+ bx lr
+
+and this with v4:
+
+_foo:
+ stmfd sp!, {r7, lr}
+ mov r7, sp
+ mul r12, r1, r0
+ smull r0, r1, r1, r0
+ smull lr, r0, r3, r2
+ mul r3, r3, r2
+ adds r3, r3, r12
+ adc r0, r0, r1
+ ldmfd sp!, {r7, pc}
+
+This apparently occurs in real code.
+
+//===---------------------------------------------------------------------===//