Avoid quadratic behavior in pathological label-alias case in MachBuffer.

Fixes #3468. If a program has many instances of the pattern "goto next; next:" in a row (i.e., no-op branches to the fallthrough address), the branch simplification in `MachBuffer` would remove them all, as expected. However, in order to work correctly, the algorithm needs to track all labels that alias the current buffer tail, so that they can be adjusted later if another branch chomp occurs. When many thousands of this branch-to-next pattern occur, many thousands of labels will reference the current buffer tail, and this list of thousands of labels will be shuffled between the branch metadata struct and the "labels at tail" struct as branches are appended and then chomped immediately. It's possible that with smarter data structure design, we could somehow share the list of labels -- e.g., a single array of all labels, in order they are bound, with ranges of indices in this array used to represent lists of labels (actually, that seems like a better design in general); but let's leave that to future optimization work. For now, we can avoid the quadratic behavior by just "giving up" if the list is too long; it's always valid to not optimize a branch. It is very unlikely that the "normal" case will have more than 100 "goto next" branches in a row, so this should not have any perf impact; if it does, we will leave 1 out of every 100 such branches un-optimized in a long sequence of thousands. This takes total compilation time down on my machine from ~300ms to ~72ms for the `foo.wasm` case in #3441. For reference, the old backend (now removed), built from arbitrarily-chosen-1-year-old commit `c7fcc344`, takes 158ms, so we're ~twice as fast, which is what I would expect.
3 years ago · 472b1b2e8a
1 changed files with 24 additions and 2 deletions
--- a/cranelift/codegen/src/machinst/buffer.rs
+++ b/cranelift/codegen/src/machinst/buffer.rs
@ -250,8 +250,12 @@ pub struct MachBufferFinalized {
    pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
 }

-static UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
-static UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
+const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
+const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
+
+/// Threshold on max length of `labels_at_this_branch` list to avoid
+/// unbounded quadratic behavior (see comment below at use-site).
+const LABEL_LIST_THRESHOLD: usize = 100;

 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at
 /// the point at which it is used by emitted code; the buffer records "fixups"
@ -791,6 +795,24 @@ impl<I: VCodeInst> MachBuffer<I> {
                break;
            }

+            // If the "labels at this branch" list on this branch is
+            // longer than a threshold, don't do any simplification,
+            // and let the branch remain to separate those labels from
+            // the current tail. This avoids quadratic behavior (see
+            // #3468): otherwise, if a long string of "goto next;
+            // next:" patterns are emitted, all of the labels will
+            // coalesce into a long list of aliases for the current
+            // buffer tail. We must track all aliases of the current
+            // tail for correctness, but we are also allowed to skip
+            // optimization (removal) of any branch, so we take the
+            // escape hatch here and let it stand. In effect this
+            // "spreads" the many thousands of labels in the
+            // pathological case among an actual (harmless but
+            // suboptimal) instruction once per N labels.
+            if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
+                break;
+            }
+
            // Invariant: we are looking at a branch that ends at the tail of
            // the buffer.