Document an inefficiency in tail merging.

author Dale Johannesen <dalej@apple.com>

Fri, 18 May 2007 18:46:40 +0000 (18:46 +0000)

committer Dale Johannesen <dalej@apple.com>

Fri, 18 May 2007 18:46:40 +0000 (18:46 +0000)
diff --git a/lib/CodeGen/README.txt b/lib/CodeGen/README.txt

index 30fc27b0150ec4d1be914362103ccd8e93f2faf0..419f885feed67198f73b9161a22f7142e5a98957 100644 (file)
--- a/lib/CodeGen/README.txt
+++ b/lib/CodeGen/README.txt
@@ -142,3 +142,22 @@ load [T + 4]
  load [T + 7]
  ...
  load [T + 15]
+//===---------------------------------------------------------------------===//
+Tail merging issue:
+When we're trying to merge the tails of predecessors of a block I, and there
+are more than 2 predecessors, we don't do it optimally.  Suppose predecessors
+are A,B,C where B and C have 5 instructions in common, and A has 2 in common
+with B or C.  We want to get:
+A:
+  jmp C3
+B:
+  jmp C2
+C:
+C2:  3 common to B and C but not A
+C3:  2 common to all 3
+You get this if B and C are merged first, but currently the pass may happen
+to merge A and B first, in which case the C2 instructions are not shared.  We
+could examine all N*(N-1)/2 pairs of predecessors and merge the pair with
+the most instructions in common first.  Usually that will be fast, but it
+could get slow on big graphs (e.g. large switches tend to have blocks with many
+predecessors).
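
The pair selection suggested in the note could be sketched roughly as follows.
This is only an illustration, not code from the tree: Block,
commonTailLength and bestPairToMerge are hypothetical names, and each block is
modeled as a plain list of instruction strings rather than a real machine basic
block.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a basic block: just its instructions, in order.
using Block = std::vector<std::string>;

// Number of trailing instructions that two blocks have in common.
std::size_t commonTailLength(const Block &X, const Block &Y) {
  std::size_t N = 0;
  while (N < X.size() && N < Y.size() &&
         X[X.size() - 1 - N] == Y[Y.size() - 1 - N])
    ++N;
  return N;
}

// Examine every unordered pair of predecessors and return the indices of the
// pair sharing the longest tail.  Ties go to the first pair found.
std::pair<std::size_t, std::size_t>
bestPairToMerge(const std::vector<Block> &Preds) {
  assert(Preds.size() >= 2 && "need at least two predecessors");
  std::pair<std::size_t, std::size_t> Best(0, 1);
  std::size_t BestLen = commonTailLength(Preds[0], Preds[1]);
  for (std::size_t I = 0; I < Preds.size(); ++I)
    for (std::size_t J = I + 1; J < Preds.size(); ++J) {
      std::size_t Len = commonTailLength(Preds[I], Preds[J]);
      if (Len > BestLen) {
        BestLen = Len;
        Best = std::make_pair(I, J);
      }
    }
  return Best;
}
```

With predecessors A, B, C as in the note (B and C sharing 5 trailing
instructions, A sharing 2 with either), this picks B and C first.  The nested
loop performs one tail comparison per pair, which is the cost the note warns
about for blocks with many predecessors.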