From 8ac4bd1d0d8228b97a88b1841cfc0247e9ef4306 Mon Sep 17 00:00:00 2001 From: Julian Seward Date: Thu, 27 Aug 2020 10:03:08 +0200 Subject: [PATCH] CL/newBE/x64: Lowering of scalar shifts: fix shift-by-imm generation The logic for generation of shifts-by-immediate was not quite right. The result was that even shifts by an amount known at compile time were being done by moving the shift immediate into %cl and then doing a variable shift by %cl. The effect is worse than it sounds, because all of those shift constants are small and often used in multiple places, so they were GVN'd up and often ended up at the entry block of the function. Hence these were connected to the use points by long live ranges which got spilled. So all in all, most of the win here comes from avoiding spilling. The problem was caused by this line, in the `Opcode::Ishl | Opcode::Ushr ..` case: ``` let (count, rhs) = if let Some(cst) = ctx.get_constant(inputs[1].insn) { ``` `inputs[]` appears to refer to this CLIF instruction's inputs, and bizarrely `inputs[].insn` all refer to the instruction (the shift) itself. Hence `ctx.get_constant(inputs[1].insn)` asks "does this shift instruction produce a constant" to which the answer is always "no", so the shift-by-unknown amount code is always generated. The fix here is to change that expression to ``` let (count, rhs) = if let Some(cst) = ctx.get_input(insn, 1).constant { ``` `get_input`'s result conveniently includes a `constant` field of type `Option`, so we just use that instead. --- cranelift/codegen/src/isa/x64/lower.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cranelift/codegen/src/isa/x64/lower.rs b/cranelift/codegen/src/isa/x64/lower.rs index c57cf40057..f3e3903c80 100644 --- a/cranelift/codegen/src/isa/x64/lower.rs +++ b/cranelift/codegen/src/isa/x64/lower.rs @@ -621,7 +621,7 @@ fn lower_insn_to_regs>( let lhs = input_to_reg(ctx, inputs[0]); - let (count, rhs) = if let Some(cst) = ctx.get_constant(inputs[1].insn) { + let (count, rhs) = if let Some(cst) = ctx.get_input(insn, 1).constant { let cst = if op == Opcode::Rotl || op == Opcode::Rotr { // Mask rotation count, according to Cranelift's semantics. (cst as u8) & (dst_ty.bits() as u8 - 1)