cranelift

Commit Graph

Author	SHA1	Message	Date
Saúl Cabrera	bac512aaac	winch: Rework `br_table` jumps (#7628 ) This commit reworks the `br_table` logic so that it correctly handles all the jumps involved to each of the targets. Even though it is safe to use the default branch for type information, it is not safe to use it to derive the base stack pointer and base value stack length. This change ensures that each target offset is taken into account to balance the value stack prior to each jump.	11 months ago
wasmtime-publish	cc816ff728	Bump Wasmtime to 17.0.0 (#7631 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	11 months ago
Saúl Cabrera	48517dee21	Revert "winch: Ensure stack pointer for br_table (#7602 )" (#7616 ) This reverts commit `8dea31c496`.	11 months ago
Saúl Cabrera	8dea31c496	winch: Ensure stack pointer for br_table (#7602 ) Follow-up to https://github.com/bytecodealliance/wasmtime/pull/7547, in which I overlooked this change and the fuzzer found an issue with the following program: ```wat (module (func (export "") (result i32) block (result i32) i32.const 0 end i32.const 0 i32.const 0 br_table 0 ) ) ``` This commit ensures that the stack pointer is correctly positioned when emitting br_table. We can't know for sure which branch will be taken, but since all branches must share the same type information, we can be certain that the expectations regarding the stack pointer are the same and thus we can use the default target to ensure the correct placement.	11 months ago
Jeffrey Charles	4d2244608d	Winch: cleanup stack in br_if in non-fallthrough case (#7590 ) * Winch: cleanup stack in br_if in non-fallthrough case * Remove unnecessary refetch of sp_offsets * Refactoring based on PR feedback * Have SPOffset implement Ord	11 months ago
Alex Crichton	ef07f40fe2	Update the wasm-tools family of crates (#7587 ) This commit updates to the latest wasm-tools and `wit-bindgen` to bring the family of crates forward. This update notably includes Nick's work on packed indices in the `wasmparser` crate for validation for the upcoming implementation of GC types. This meant that translation from `wasmparser` types to Wasmtime types now may work with a "type id" instead of just a type index which required plumbing not only Wasmtime's own type information but additionally `wasmparser`'s type information throughout translation. This required a fair bit of refactoring to get this working but no change in functionality is intended, only a different way of doing everything prior.	12 months ago
Jeffrey Charles	55f9a4bdcd	Winch: fix bug by spilling when emitting func call (#7573 ) * Winch: fix bug by spilling when calling a func * Forgot to commit new filetest * Only support WasmHeapType::Func * Elaborate on call_indirect jump details * Update docs for call * Verify stack is only consts and memory entries	12 months ago
Alex Crichton	5856590fae	Configure workspace lints, enable running some Clippy lints on CI (#7561 ) * Configure Rust lints at the workspace level This commit adds necessary configuration knobs to have lints configured at the workspace level in Wasmtime rather than the crate level. This uses a feature of Cargo first released with 1.74.0 (last week) of the `[workspace.lints]` table. This should help create a more consistent set of lints applied across all crates in our workspace in addition to possibly running select clippy lints on CI as well. * Move `unused_extern_crates` to the workspace level This commit configures a `deny` lint level for the `unused_extern_crates` lint to the workspace level rather than the previous configuration at the individual crate level. * Move `trivial_numeric_casts` to workspace level * Change workspace lint levels to `warn` CI will ensure that these don't get checked into the codebase and otherwise provide fewer speed bumps for in-process development. * Move `unstable_features` lint to workspace level * Move `unused_import_braces` lint to workspace level * Start running Clippy on CI This commit configures our CI to run `cargo clippy --workspace` for all merged PRs. Historically this hasn't been all that feasible due to the amount of configuration required to control the number of warnings on CI, but with Cargo's new `[lint]` table it's possible to have a one-liner to silence all lints from Clippy by default. This commit sets the `all` lint in Clippy to `allow` to disable warnings from Clippy by default. The goal of this PR is to enable selective access to Clippy lints for Wasmtime on CI. * Selectively enable `clippy::cast_sign_loss` This would have fixed #7558 so try to head off future issues with that by warning against this situation in a few crates. This lint is still quite noisy though for Cranelift for example so it's not worthwhile at this time to enable it for the whole workspace. * Fix CI error prtest:full	12 months ago
Saúl Cabrera	50733725a0	winch: Solidify unreachable code handling (#7547 ) This commit solidifies the approach for unreachable code handling in control flow. Prior to this change, at unconditional jump sites, the compiler would reset the machine stack as well as the value stack. Even though this approach might seem natural at first, it actually broke several of the invariants that must be met at the end of each control block. This was especially noticeable with programs that conditionally entered an unreachable state, for example: ```wat (module (func (;0;) (param i32) (result i32) local.get 0 local.get 0 if (result i32) i32.const 1 return else i32.const 2 end i32.sub ) (export "main" (func 0)) ) ``` The approach followed in this commit ensures that all the invariants are met and introduces more guardrails around those invariants. In short, instead of resetting the value stack at unconditional jump sites, the value stack handling is deferred until the reachability analysis restores the reachability of the code generation process, ensuring that the value stack contains the exact number of values expected by the frame where reachability is restored. Given that unconditional jumps reset the machine stack, when the reachability of the code generation process is restored, the SP offset is also restored, which should match the size of the value stack.	12 months ago
Saúl Cabrera	f0162a40e7	winch: Multi-Value Part 1 (#7535 ) * winch: Introduce `ABIParams` and `ABIResults` This commit prepares Winch to support WebAssembly Multi-Value. The most notorious piece of this change is the introduction of the `ABIParams` and `ABIResults` structs which are type wrappers around the concept of an `ABIOperand`, which is the underlying main representation of a param or result. This change also consolidates how the size for WebAssembly types is derived by introducing `ABI::sizeof`, as well as introducing `ABI::stack_slot_size` to concretely indicate the stack slot size in bytes for stack params, which is ABI dependent. * winch: Add the necessary ABI building blocks for multi-value This change adds the necessary changes at the ABI level in order to handle multi-value. The most notable modifications in this change are: * Modifying Winch's default ABI to reverse the order of results, ensuring that results that go in the stack should always come first; this makes it easier to respect the following two stack invariants: * Spilled memory values always precede register values * Spilled values are stored from oldest to newest, matching their respective locations on the machine stack. * Modify all calling conventions supported by Winch so that only one result, the first one is stored in registers. This differs from their vanilla counterparts in that these ABIs can handle multiple results in registers. Given that Winch is not a generic code generator, keeping the ABI close to what Wasmtime expects makes it easier to pass multiple results at trampolines. * Add more multi-value tests This commit adds more tests for multi-value and improves documentation. prtest:full * Address review feedback	12 months ago
Jeffrey Charles	321294a5d2	winch: Materialize latent locals when setting them (#7531 ) * winch: Materialize latent locals when setting them * Put method on stack and stop when encountering mem entry	12 months ago
Saúl Cabrera	9e0c650393	winch: Do not use `unconditional_jump` with `br_table` (#7525 ) This patch fixes how jumps are handled in `br_table`. Prior to this change, `br_table` was implemented using `CodeGenContext::unconditional_jump`; this function ensures, among other invariants, that the value stack and stack pointer are balanced according to the expectation of the target branch. Even though in `br_table` there's a branch to a potentially known location, it's impossible to be certain at compile time which branch will be taken; in that regard, `br_table` behaves more like `br_if`. Using `unconditional_jump` resulted in the stack being manipulated multiple times and breaking the other existing invariants around stack balancing. This commit makes it so that `br_table` doesn't rely on `unconditional_jump` anymore and instead delegates control flow to the target branch, which will ensure that the value stack and stack pointer are correctly balanced when restoring reachability, very similar to what happens with `br_if`. This issue was discovered while fuzzing and a file test is included with the test case.	1 year ago
Saúl Cabrera	fced2b70cb	winch: Properly handle unconditional jumps (#7499 ) This commit improves unconditional jumps by balancing the stack pointer as well as the value stack when the current stack pointer and value stack are greater than the target stack pointer and value stack. The invariant that this change maintains is that the value stack should always reflect the state of the machine stack. The value stack might have excess stack values in the presence of a fallthrough (`br_if` or `br_table`) in which the target branch is not known at compile time; in this situation instructions like `return` or `br` discard any excess values.	1 year ago
Jeffrey Charles	f1cb847c0d	Get addr of local after popping from reg (#7517 ) * Get addr of local after popping from reg * Update comment wording	1 year ago
Saúl Cabrera	b745132308	winch: Properly derive a scratch register for arg assignment (#7501 ) This commit properly derives a scratch register for a particular WebAssembly type. The included spec test uncovered that the previous implementation used a int scratch register to assign float stack arguments, which resulted in a panic.	1 year ago
Jeffrey Charles	a2d5b53062	Use scratch XMM register for spilling floats (#7494 )	1 year ago
wasmtime-publish	a32fa1b38d	Bump Wasmtime to 16.0.0 (#7482 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	1 year ago
Jeffrey Charles	0ac6e17437	Reset sp_offset when resetting stack in Winch (#7478 ) * Reset sp_offset when resetting stack in Winch * Verify we're freeing the right number of bytes	1 year ago
Saúl Cabrera	97f6a8b3e3	winch: Add tests for local_{get,set} (#7462 ) This change is a follow-up to https://github.com/bytecodealliance/wasmtime/pull/7443; after it landed I realized that Winch doesn't include spec tests for local.get and local.set. Those tests uncovered a bug in the handling of the constant pool: given Winch's single-pass nature, there's very little room to know all the constants ahead of time and to register them all at once at emission time; instead they are emitted when they are needed by an instruction. Even though Cranelift's machinery is capable of deduplicating constants in the pool, `register_constant` assumes and checks that each constant is only pushed once. In Winch's case, since we emit as we go, we need to carefully check whether the constant was emitted before and register it only if it was not. Otherwise we break the invariant that each constant should only be registered once.	1 year ago
Jeffrey Charles	9ab2e0a65f	popcnt should check for sse4.2 support in Winch (#7449 )	1 year ago
Jeffrey Charles	db946cd51f	Fix Winch bug for funcs with params and locals (#7443 )	1 year ago
Alex Crichton	5062e3480f	Update wasm-tools crates (#7407 ) * Update wasm-tools crates This commit updates the wasm-tools family of crates for a number of notable updates: * bytecodealliance/wasm-tools#1257 - wasmparser's ID-based infrastructure has been refactored to have more precise types for each ID rather than one all-purpose `TypeId`. * bytecodealliance/wasm-tools#1262 - the implementation of "implementation imports" for the component model which both updates the binary format in addition to adding more syntactic forms of imports. * bytecodealliance/wasm-tools#1260 - a new encoding scheme for component information for `wit-component` in objects (not used by Wasmtime but used by bindings generators). Translation for components needed to be updated to account for the first change, but otherwise this was a straightforward update. * Remove a TODO	1 year ago
Jeffrey Charles	0d797f7f77	Add float comparison operators to Winch (#7379 ) * Add float comparison operators to Winch * Simplify gt and gte ops and add comments	1 year ago
Alex Crichton	962318ebea	Gate some `clap` features behind the `default` feature (#7317 ) While this is not a large amount of binary size, if the purpose of the `--no-default-features` build is to showcase "minimal Wasmtime" then we may as well try to make `clap` as small as possible.	1 year ago
Jeffrey Charles	dd42290e9a	Add support for binary float operators to Winch (#7290 )	1 year ago
Jeffrey Charles	d0b053e160	Refactor x64 asm method names in Winch (#7269 )	1 year ago
Saúl Cabrera	4f47f3ecaf	winch: Add a subset of known libcalls and improve call emission (#7228 ) * winch: Add a subset of known libcalls and improve call emission This change is a follow-up to: - https://github.com/bytecodealliance/wasmtime/pull/7155 - https://github.com/bytecodealliance/wasmtime/pull/7035 One of the objectives of this change is to make it easy to emit function calls at the MacroAssembler layer, for cases in which it's challenging to know ahead of time if a particular functionality can be achieved natively (e.g. rounding and SSE4.2). The original implementation of function call emission made this objective difficult to achieve and it was also difficult to reason about. I decided to simplify the overall approach to function calls as part of this PR; in essence, the `call` module now exposes a single function `FnCall::emit` which is responsible for gathering the dependencies and orchestrating the emission of the call. This new approach deliberately avoids holding any state regarding the function call for simplicity. This change also standardizes the usage of `Callee` as the main entrypoint for function call emission. As of this change 4 `Callee` types exist (`Local`, `Builtin`, `Import`, `FuncRef`); each callee kind is mappable to a `CalleeKind`, which is the materialized version of a callee which Cranelift understands. This change also moves the creation of the `BuiltinFunctions` to the `ISA` level given that they can be safely used across multiple function compilations. Finally, this change also introduces support for some of the "well-known" libcalls and hooks those libcalls at the `MacroAssembler::float_round` callsite. -- prtest:full * Review comments * Remove unnecessary `into_iter` * Fix remaining lifetime parameter names	1 year ago
Jeffrey Charles	654d9f5ea4	Add support for float sqrt operators to Winch (#7230 )	1 year ago
Saúl Cabrera	a109d2abe5	winch(x64): Add support for table instructions (#7155 ) * winch(x64): Add support for table instructions This change adds support for the following table instructions: `elem.drop`, `table.copy`, `table.set`, `table.get`, `table.fill`, `table.grow`, `table.size`, `table.init`. This change also introduces partial support for the `Ref` WebAssembly type, more concretely the `Func` heap type, which means that all the table instructions above only work with this WebAssembly type as of this change. Finally, this change is also a small follow-up to the primitives introduced in https://github.com/bytecodealliance/wasmtime/pull/7100, more concretely: * `FnCall::with_lib`: tracks the presence of a libcall and ensures that any result registers are freed right when the call is emitted. * `MacroAssembler::table_elem_addr` returns an address rather than the value of the address, making it convenient for other use cases like `table.set`. -- prtest:full * chore: Make stack functions take impl IntoIterator<..> * Update winch/codegen/src/codegen/call.rs Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com> * Remove a dangling `dbg!` * Add comment on branching --------- Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>	1 year ago
wasmtime-publish	157b4318df	Bump Wasmtime to 15.0.0 (#7154 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	1 year ago
Alex Crichton	4a037fc06d	Handle `lower_branch` consistently amongst backends (#7133 ) * Handle `lower_branch` consistently amongst backends This commit is a refactoring to consistently implement `lower_branch` among Cranelift's backends. Previously each backend had its own means of extracting labels and shuffling along information, and now there's prelude methods for all backends to access and use. This changes a few display impls but the actual meat of what's emitted shouldn't change amongst the backends. * Fix compile	1 year ago
Saúl Cabrera	4b288ba88d	winch(x64): Call indirect (#7100 ) * winch(x64): Call indirect This change adds support for the `call_indirect` instruction to Winch. Libcalls are a prerequisite for supporting `call_indirect` in order to lazily initialize funcrefs. This change adds support for libcalls to Winch by introducing a `BuiltinFunctions` struct similar to Cranelift's `BuiltinFunctionSignatures` struct. In general, libcalls are handled like any other function call, with the only difference that given that not all the information to fulfill the function call might be known up-front, control is given to the caller for finalizing the call. The introduction of function references also involves dealing with pointer-sized loads and stores, so this change also adds the required functionality to `FuncEnv` and `MacroAssembler` to be pointer aware, making it straightforward to derive an `OperandSize` or `WasmType` from the target's pointer size. Finally, given the complexity of the call_indirect instruction, this change bundles an improvement to the register allocator, allowing it to track the allocatable vs non-allocatable registers; this is done to avoid any mistakes when allocating/de-allocating registers that are not allocatable. -- prtest:full * Address review comments * Fix typos * Better documentation for `new_unchecked` * Introduce `max` for `BitSet` * Make allocatable property `u64` * winch(calls): Overhaul `FnCall` This commit simplifies `FnCall`'s interface making its usage more uniform throughout the compiler. In summary, this change: * Avoids side effects in the `FnCall::new` constructor, and also makes it the only constructor. * Exposes `FnCall::save_live_registers` and `FnCall::calculate_call_stack_space` to calculate the stack space consumed by the call and so that the caller can decide which one to use at callsites depending on their use-case. * tests: Fix regset tests	1 year ago
Trevor Elliott	230eec9eff	Add floating point rounding instructions (#7035 )	1 year ago
Ulrich Weigand	86652959a4	Refactor prolog/epilog generation code (#6970 ) This patch refactors all of the ISA/ABI specific prolog/epilog generation code around the following two ideas: 1. Separate planning of the function's frame layout from the actual implementation within prolog / epilog code. 2. No longer overload different purposes (middle-end register tracking, platform-specific details like authorization modes, and pop-stack-on-return) into a single return instruction. As to 1., the new approach is based around a FrameLayout data structure, which collects all information needed to emit prolog and epilog code, specifically the list of clobbered registers, and the sizes of all areas of the function's stack frame. This data structure is now computed once, before any code is emitted, and stored in the Callee data structure. ABIs need to implement this via a new compute_frame_layout callback, which gets all data from common code needed to make all decisions around stack layout in one place. The FrameLayout is then used going forward to answer all questions about frame sizes, and it is passed to all ABI routines involved in prolog / epilog code generation. [ This removes a lot of duplicated calculation, e.g. the list of clobbered registers is now only computed once and re-used everywhere. ] This in turn allows to reduce the number of distinct callbacks ABIs need to implement, and simplifies common code logic around how and when to call them. 
In particular, we now only have the following four routines, which are always called in this order: gen_prologue_frame_setup gen_clobber_save gen_clobber_restore gen_epilogue_frame_restore The main differences to before are: - frame_setup/restore are now called unconditionally (the target ABI can look in the FrameLayout to detect the case where no frame setup is required and skip whatever it thinks appropriate in that case) - there is no separate gen_prologue_start; if the target needs to do anything here, it can now just do it instead in gen_prologue_frame_setup - common code no longer attempts to emit a return instruction; instead the target can do whatever is necessary/optimal in gen_epilogue_frame_restore [ In principle we could also just have a single gen_prologue and gen_epilogue callback - I didn't implement this because then all the stack checking / probing logic would have to be moved to target code as well. ] As to 2., currently targets are required to implement a single "Ret" return instruction. This is initially used during register allocation to hold a list of return preg/vreg pairs. During epilog emission, this is replaced by another copy of the same "Ret" instruction that now carries various platform specific data (e.g. authorization modes on aarch64), and is also overloaded to handle the case where the ABI requires that a number of bytes are popped during return. This is a bit unfortunate in that it blows up the size of the instruction data, and also forces targets (that do not have a "ret N" instruction like Intel) into duplicated and possible sub-optimal implementations of stack adjustment during low-level emission of the return instruction. The new approach separates these concerns. Initially, common code emits a new "Rets" instruction that is completely parallel to the existing "Args", and is used only during register allocation holding the preg/vreg pairs. 
That instruction -like now- is replaced during epilog emission - but unlike now the replacement is now completely up to the target, which can do whatever it needs in gen_epilogue_frame_restore. This would typically emit some platform-specific low-level "Ret" instruction instead of the regalloc "Rets". It also allows non-Intel targets to just create a normal (or even optimized) stack adjustment sequence before its low-level "Ret". [ In particular, on riscv64 pop-stack-before-return currently emits two distinct stack adjustment instructions immediately after one another. These could now be easily merged, but that's not yet done in this patch. ] No functional change intended on any target.	1 year ago
Saúl Cabrera	1a1fc9d3c5	winch: Use `Reg` where appropriate in the Masm (#7002 ) This change is a small refactoring to some of the MacroAssembler functions to use `Reg` instead of `RegImm` where appropriate (e.g. when the operand is a destination). @elliottt pointed this out while working on https://github.com/bytecodealliance/wasmtime/pull/6982 This change also changes the signature of `float_abs` and `float_neg`, which can be simplified to take a single register.	1 year ago
Michael Chesser	2186668f52	Cranelift: Improve codegen of store_imm on x64 (#6979 ) * Improve lowering of store_imm on x64 Adds a new x64 rule for directly lowering stores of immediates with a MOV instruction. * Ensure that the MovImmM operand fits in an i32 and add tests. * Update winch to handle MovImmM change	1 year ago
Trevor Elliott	9f00198611	winch: Support abs and neg for f32 and f64 on x64 (#6982 ) * winch: Support f32.abs and f64.abs on x64 Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> * Add an implementation of f32.neg and f64.neg * Enable spec tests for winch with f{32,64}.{neg,abs} * Enable differential fuzzing for f{32,64}.{neg,abs} for winch * Comments from code review --------- Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>	1 year ago
wasmtime-publish	e95c8556d6	Bump Wasmtime to 14.0.0 (#6964 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	1 year ago
Saúl Cabrera	350410ac9c	winch: Add support for `br_table` (#6951 ) * winch: Add support for `br_table` This change adds support for the `br_table` instruction, including several modifications to the existing control flow implementation: * Improved handling of jumps to loops: Previously, the compiler erroneously treated the result of loop blocks as the definitive result of the jump. This change fixes this bug. * Streamlined result handling and stack pointer balancing: In the past, these operations were executed in two distinct steps, complicating the process of ensuring the correct invariants when emitting unconditional jumps. To simplify this, `CodeGenContext::unconditional_jump` is introduced. This function guarantees all necessary invariants are met, encapsulating the entire operation within a single function for easier understanding and maintenance. * Handling of unreachable state at the end of a function: when reaching the end of a function in an unreachable state, clear the stack and ensure that the machine stack pointer is correctly placed according to the expectations of the outermost block. In addition to the above refactoring, the main implementation of the `br_table` instruction involves emitting labels for each target. Within each label, an unconditional jump is emitted to the frame's label, ensuring correct stack pointer balancing when the jump is emitted. While it is possible to optimize this process by avoiding intermediate labels when balancing isn't required, I've opted to maintain the current implementation until such optimization becomes necessary. * chore: Rust fmt * fuzzing: Add `BrTable` to list of support instructions * docs: Improve documentation for `unconditional_jump`	1 year ago
Christopher Serr	9ec02f9d91	Decouple `serde` from its `derive` crate (#6917 ) By not activating the `derive` feature on `serde`, the compilation speed can be improved by a lot. This is because `serde` can then compile in parallel to `serde_derive`, allowing it to finish compilation possibly even before `serde_derive`, unblocking all the crates waiting for `serde` to start compiling much sooner. As it turns out, the compile time of a project is primarily determined by the depth of dependencies rather than the width. In other words, a crate's compile times aren't affected by how many crates it depends on, but rather by the longest chain of dependencies that it needs to wait on. In many cases `serde` is part of that long chain, as it is part of a long chain if the `derive` feature is active: `proc-macro2` compile build script > `proc-macro2` run build script > `proc-macro2` > `quote` > `syn` > `serde_derive` > `serde` > `serde_json` (or any crate that depends on serde) By decoupling it from `serde_derive`, the chain is shortened and compile times get much better. Check this issue for a deeper elaboration: https://github.com/serde-rs/serde/issues/2584 For `wasmtime` I'm seeing a reduction from 24.75s to 22.45s when compiling in `release` mode. This is because wasmtime through `gimli` has a dependency on `indexmap` which can only start compiling when `serde` is finished, which you want to happen as early as possible so some of wasmtime's dependencies can start compiling. To measure the full effect, the dependencies can't by themselves activate the `derive` feature. I've upstreamed a patch for `fxprof-processed-profile` which was the only dependency that activated it for `wasmtime` (not yet published to crates.io). `wasmtime-cli` and co. may need patches for their dependencies to see a similar improvement.	1 year ago
Saúl Cabrera	2da108df40	winch: Add support for parametric instructions (#6912 ) * winch: Add support for parametric instructions This commit introduces support for the drop and select instructions. Additionally, it refactors the CodeGenContext::drop_last implementation, enhancing flexibility for callers to determine the handling of elements to be dropped. This refactoring simplifies scenarios where a Memory entry is at the top of the stack. * refactor: Use `cmov` instead of local control flow	1 year ago
Saúl Cabrera	8c34599425	winch: Use type information to derive operand sizes (#6891 ) * winch: Derive `OperandSize` from the value type This change is a small refactor to how we've been handling the operand size parameter passed to some of the `CodeGenContext` operations, namely, `pop_to_reg` and `move_val_to_reg`. Given the more precise value tagging introduced in: https://github.com/bytecodealliance/wasmtime/pull/6860, it's now possible to derive the operand size from the type associated to a value stack entry, which: * Makes the usage of the functions mentioned above less error prone. * Allows a simplification of the two function definitions mentioned above. * Results in better instruction selection in some cases. * chore: Update filetests	1 year ago
Saúl Cabrera	14b39bc234	winch: Initial support for floats (#6860 ) * winch: Initial support for floats This change introduces the necessary building blocks to support floats in Winch as well as support for both `f32.const` and `f64.const` instructions. To achieve support for floats, this change adds several key enhancements to the compiler: * Constant pool: A constant pool is implemented, at the Assembler level, using the machinery exposed by Cranelift's `VCode` and `MachBuffer`. Float immediates are stored using their bit representation in the value stack, and whenever they are used at the MacroAssembler level they are added to the constant pool; from that point on, they are referenced through a `Constant` addressing mode, which gets translated to a RIP-relative addressing mode during emission. * More precise value tagging: aside from immediates, from which the type can be easily inferred, all the other value stack entries (`Memory`, `Reg`, and `Local`) are modified to explicitly contain a WebAssembly type. This allows for better instruction selection. -- prtest:full * fix: Account for relative sp position when pushing float regs This was an oversight of the initial implementation. When pushing float registers, always return an address that is relative to the current position of the stack pointer, essentially storing to (%rsp). The previous implementation accounted for static addresses, which is not correct. * fix: Introduce `stack_arg_slot_size_for_type` To correctly calculate the stack argument slot sizes, instead of overallocating for `word_bytes`, since for `f32` floating points we only need to worry about loading/storing 4 bytes. * fix: Correctly type the result register. The previous version wrongly typed the register as a general purpose register. 
* refactor: Re-write `add_constants` through `add_constant` * docs: Replace old comment * chore: Rust fmt * refactor: Index regset per register class This commit implements `std::ops::{Index, IndexMut}` for `RegSet` to index each of the bitsets by class. This reduces boilerplate and repetition throughout the code generation context, register allocator and register set. * refactor: Correctly size callee saved registers To comply with the expectation of the underlying architecture: for example in Aarch64, only the low 64 bits of VRegs are callee saved (the D-view) and in the `fastcall` calling convention it's expected that the callee saves the entire 128 bits of the registers xmm6-xmm15. This change also fixes the stores/loads of callee saved float registers in the fastcall calling convention, as in the previous implementation only the low 64 bits were saved/restored. * docs: Add comment regarding typed-based spills	1 year ago
Saúl Cabrera	d58cf09cb7	winch: Simplify the MacroAssembler and Assembler interfaces (#6841 ) This commit prepares for the introduction of float support to Winch. Initially, I intended to include this change as part of the original change supporting floats, but that change is already sizable enough. This modification simplifies the Assembler and MacroAssembler interfaces, as well as the interaction and responsibilities between them, by: * Eliminating the `Operand` abstraction, which didn't offer a substantial benefit over simply using the MacroAssembler's `RegImm` and `Address` abstractions as operands where necessary. This approach also reduces the number of conversions required prior to emission. * Shifting the instruction dispatch responsibility solely to the MacroAssembler, rather than having this responsibility shared across both abstractions. This was always the original intention behind the MacroAssembler. As a result, function definitions at the Assembler layer become simpler. This change also introduces richer type information for immediates, which results in better instruction selection in some cases, and it's also needed to support floats.	1 year ago
Alex Crichton	f32993002b	aarch64: Move AMode computation into ISLE (#6805 ) * Use `Offset32` as `i32` in ISLE This commit updates the x64 and aarch64 backends to use the `i32` primitive type in ISLE when working with an `Offset32` instead of a `u32`. This matches the intended representation of `Offset32` as a type which is signed internally and represents how offsets on instructions are often negative too. This does not actually change any end results of compilation and instead is intended to be "just" an internal refactoring with fewer casts and more consistent handling of offsets. * aarch64: Define the `PairAMode` type in ISLE This commit moves the definition of the `PairAMode` enum into ISLE instead of its current Rust-defined location. This is in preparation for the next commit where all AMode calculations will be moved into ISLE. * aarch64: Fix a copy/paste typo loading vectors This commit fixes an assertion that can be tripped in the aarch64 backend where a 64-bit load was accidentally flagged as a 128-bit load. This was found in future work which ended up tripping the assertion a bit earlier. * aarch64: Move AMode computation into ISLE This commit moves the computation of the `AMode` enum for addressing from Rust into ISLE. This enables deleting a good deal of Rust code in favor of (hopefully) more readable ISLE code. This does not mirror the preexisting logic exactly but instead takes a different approach for generating the `AMode`. Previously the entire chain of `iadd`s input into an address were unfolded into 32-bit and 64-bit operations and then those were re-combined as possible into an `AMode` (possibly emitting `add` instructions). Instead now pattern matching is used to represent this. The net result is that amodes are emitted slightly differently here and there in a number of updated test cases. I've tried to verify in all test cases that the number of instructions has not increased and the same logical operation is happening. 
The exact `AMode` may differ but at least instruction-wise this shouldn't be a regression. My hope is that if anything needs changing that can be represented with updates to the rule precedence in ISLE or various other special cases. One part I found a little surprising was that the immediate with a load/store instruction is not actually used much of the time. I naively thought that the mid-end optimizations would move iadd immediates into the load/store immediate but that is not the case. This necessitated two extra ISLE rules to manually peel off immediates and fold them into the load/store immediate. * aarch64: Remove `NarrowValueMode` This is no longer needed after the prior commit	1 year ago
wasmtime-publish	4c4663e2f6	Bump Wasmtime to 13.0.0 (#6809 ) Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	1 year ago
Alex Crichton	6d7bb360bd	Dependency gardening for Wasmtime (#6731 ) * Remove deny.toml exception for wasm-coredump-builder This isn't used any more so no need to continue to list this. * Update Wasmtime's pretty_env_logger dependency This removes a `deny.toml` exception for that crate, but `openvino-sys` still depends on `pretty_env_logger 0.4.0` so a new exception is added for that. * Update criterion and clap dependencies This commit started out by updating the `criterion` dependency to remove an entry in `deny.toml`, but that ended up transitively requiring a `clap` dependency upgrade from 3.x to 4.x because `criterion` uses pieces of clap 4.x. Most of this commit is then dedicated to updating clap 3.x to 4.x which was relatively simple, mostly renaming attributes here and there. * Update gimli-related dependencies I originally wanted to remove the `indexmap` clause in `deny.toml` but enough dependencies haven't updated from 1.9 to 2.0 that it wasn't possible. In the meantime though this updates some various dependencies to bring them to the latest and a few of them now use `indexmap` 2.0. * Update deps to remove `windows-sys 0.45.0` This involved updating tokio/mio and then providing new audits for new crates. The tokio exemption was updated from its old version to the new version and tokio remains un-audited. * Update `syn` to 2.x.x This required a bit of rewriting for the component-macro related bits but otherwise was pretty straightforward. The `syn` 1.x.x track is still present in the wasi-crypto tree at this time. I've additionally added some trusted audits for my own publications of `wasm-bindgen` * Update bitflags to 2.x.x This updates Wasmtime's dependency on the `bitflags` crate to the 2.x.x track to keep it up-to-date. * Update the cap-std family of crates This bumps them all to the next major version to keep up with updates. I've additionally added trusted entries for publishes of cap-std crates from Dan. 
There's still lingering references to rustix 0.37.x which will need to get weeded out over time. * Update memoffset dependency to latest Avoids having two versions in our crate graph. * Fix tests * Update try_from for wiggle flags * Fix build on AArch64 Linux * Enable `event` for rustix on Windows too	1 year ago
Alex Crichton	80e68c336b	Update the wasm-tools family of crates (#6710 ) * Update wasm-tools dependencies * Get tests passing after wasm-tools update Mostly dealing with updates to `wasmparser`'s API. * Update `cargo vet` for new crates * Add `equivalent`, `hashbrown`, and `quote` to the list of trusted authors. We already trust these authors for other crates. * Pull in some upstream audits for various deps. * I've audited the `pulldown-cmark` dependency upgrade myself.	1 year ago
Saúl Cabrera	690dd116b2	winch(x64): Add support for global get and set (#6703 ) This change adds support for the `global.set` and `global.get` instructions.	1 year ago
Saúl Cabrera	3efd728480	winch(x64) Add support for local tee (#6700 ) This commit adds support for the `local.tee` instruction. This change also introduces a refactoring to the original implementation of `local.set` to be able to share most of the code for the implementation of `local.tee`.	1 year ago


223 Commits (main)