sxth r3, r3
mla r4, r3, lr, r4
-It also increase the likelyhood the store may become dead.
-
-//===---------------------------------------------------------------------===//
-
-I think we should have a "hasSideEffects" flag (which is automatically set for
-stuff that "isLoad" "isCall" etc), and the remat pass should eventually be able
-to remat any instruction that has no side effects, if it can handle it and if
-profitable.
-
-For now, I'd suggest having the remat stuff work like this:
-
-1. I need to spill/reload this thing.
-2. Check to see if it has side effects.
-3. Check to see if it is simple enough: e.g. it only has one register
-destination and no register input.
-4. If so, clone the instruction, do the xform, etc.
-
-Advantages of this are:
-
-1. the .td file describes the behavior of the instructions, not the way the
- algorithm should work.
-2. as remat gets smarter in the future, we shouldn't have to be changing the .td
- files.
-3. it is easier to explain what the flag means in the .td file, because you
- don't have to pull in the explanation of how the current remat algo works.
-
-Some potential added complexities:
-
-1. Some instructions have to be glued to it's predecessor or successor. All of
- the PC relative instructions and condition code setting instruction. We could
- mark them as hasSideEffects, but that's not quite right. PC relative loads
- from constantpools can be remat'ed, for example. But it requires more than
- just cloning the instruction. Some instructions can be remat'ed but it
- expands to more than one instruction. But allocator will have to make a
- decision.
-
-4. As stated in 3, not as simple as cloning in some cases. The target will have
- to decide how to remat it. For example, an ARM 2-piece constant generation
- instruction is remat'ed as a load from constantpool.
+It also increase the likelihood the store may become dead.
//===---------------------------------------------------------------------===//
Use local info (i.e. register scavenger) to assign it a free register to allow
reuse:
- ldr r3, [sp, #+4]
- add r3, r3, #3
- ldr r2, [sp, #+8]
- add r2, r2, #2
- ldr r1, [sp, #+4] <==
- add r1, r1, #1
- ldr r0, [sp, #+4]
- add r0, r0, #2
+ ldr r3, [sp, #+4]
+ add r3, r3, #3
+ ldr r2, [sp, #+8]
+ add r2, r2, #2
+ ldr r1, [sp, #+4] <==
+ add r1, r1, #1
+ ldr r0, [sp, #+4]
+ add r0, r0, #2
//===---------------------------------------------------------------------===//
//===---------------------------------------------------------------------===//
-Instead of unconditionally inserting a null initializer for every GC root when
-Collector::InitRoots is set, the collector infrastructure should get a little
-bit smarter and perform a trivial DSE of the initial basic block up to the
-first safe point.
-
-//===---------------------------------------------------------------------===//
-
With a copying garbage collector, derived pointers must not be retained across
collector safe points; the collector could move the objects and invalidate the
derived pointer. This is bad enough in the first place, but safe points can
The ocaml frametable structure supports liveness information. It would be good
to support it.
+
+//===---------------------------------------------------------------------===//
+
+The FIXME in ComputeCommonTailLength in BranchFolding.cpp needs to be
+revisited. The check is there to work around a misuse of directives in inline
+assembly.
+
+//===---------------------------------------------------------------------===//
+
+It would be good to detect collector/target compatibility instead of silently
+doing the wrong thing.
+
+//===---------------------------------------------------------------------===//
+
+It would be really nice to be able to write patterns in .td files for copies,
+which would eliminate a bunch of explicit predicates on them (e.g. no side
+effects). Once this is in place, it would be even better to have tblgen
+synthesize the various copy insertion/inspection methods in TargetInstrInfo.
+
+//===---------------------------------------------------------------------===//
+
+Stack coloring improvements:
+
+1. Do proper LiveStackAnalysis on all stack objects including those which are
+ not spill slots.
+2. Reorder objects to fill in gaps between objects.
+ e.g. 4, 1, <gap>, 4, 1, 1, 1, <gap>, 4 => 4, 1, 1, 1, 1, 4, 4
+
+//===---------------------------------------------------------------------===//
+
+The scheduler should be able to sort nearby instructions by their address. For
+example, in an expanded memset sequence it's not uncommon to see code like this:
+
+ movl $0, 4(%rdi)
+ movl $0, 8(%rdi)
+ movl $0, 12(%rdi)
+ movl $0, 0(%rdi)
+
+Each of the stores is independent, and the scheduler is currently making an
+arbitrary decision about the order.
+
+//===---------------------------------------------------------------------===//
+
+Another opportunitiy in this code is that the $0 could be moved to a register:
+
+ movl $0, 4(%rdi)
+ movl $0, 8(%rdi)
+ movl $0, 12(%rdi)
+ movl $0, 0(%rdi)
+
+This would save substantial code size, especially for longer sequences like
+this. It would be easy to have a rule telling isel to avoid matching MOV32mi
+if the immediate has more than some fixed number of uses. It's more involved
+to teach the register allocator how to do late folding to recover from
+excessive register pressure.
+