Right now this is only on some crates such as `wasmtime` itself and
`wasmtime-cli`, but applying it to all crates helps with version
selection for users of, for example, just Cranelift.
* Introduce the `pulley-interpreter` crate
This commit is the first step towards implementing
https://github.com/bytecodealliance/rfcs/pull/35
This commit introduces the `pulley-interpreter` crate which contains the Pulley
bytecode definition, encoder, decoder, disassembler, and interpreter.
This is still very much a work in progress! It is expected that we will tweak
encodings and bytecode definitions, that we will overhaul the interpreter (to,
for example, optionally support the unstable Rust `explicit_tail_calls`
feature), and otherwise make large changes. This is just a starting point to get
the ball rolling.
Subsequent commits and pull requests will do things like add the Cranelift
backend to produce Pulley bytecode from Wasm as well as the runtime integration
to run the Pulley interpreter inside Wasmtime.
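Purely as an illustration of the kind of pieces such a crate contains (this is emphatically not Pulley's actual instruction set, encoding, or API), a bytecode definition plus a register-machine interpreter loop in miniature looks something like:

```rust
/// Hypothetical, illustrative bytecode; the real Pulley instructions,
/// encodings, and register sets are defined in the `pulley-interpreter` crate.
enum Op {
    /// Load a constant into register `dst`.
    Const { dst: u8, imm: i64 },
    /// `dst += src`.
    Add { dst: u8, src: u8 },
    /// Stop interpreting; the result is in register 0.
    Ret,
}

/// A toy interpreter loop over already-decoded ops (the real crate also has an
/// encoder, a decoder over raw bytes, and a disassembler).
fn interpret(code: &[Op]) -> i64 {
    let mut regs = [0i64; 16];
    for op in code {
        match op {
            Op::Const { dst, imm } => regs[*dst as usize] = *imm,
            Op::Add { dst, src } => regs[*dst as usize] += regs[*src as usize],
            Op::Ret => break,
        }
    }
    regs[0]
}
```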
* remove stray fn main
* Add small tests for special x registers
* Remove now-unused import
* always generate 0 pc rel offsets in arbitrary
* Add doc_auto_cfg feature for docs.rs
* enable all optional features for docs.rs
* Consolidate `BytecodeStream::{advance,get1,get2,...}` into `BytecodeStream::read`
* fix fuzz targets build
* inherit workspace lints in pulley's fuzz crate
* Merge fuzz targets into one target; fix a couple small fuzz bugs
* Add Pulley to our cargo vet config
* Add pulley as a crate to publish
* Move Pulley fuzz target into top level fuzz directory
* Refactor the internals of `FunctionBuilder::insert_safepoint_spills` into a few smaller methods
* Initialize a logger for the `cranelift-fuzzgen` fuzz target
* Resolve aliases before inserting values into the live set
This fixes a fuzz bug found in the development of
https://github.com/bytecodealliance/wasmtime/pull/8941
These were originally a SpiderMonkey-ism and have been unused ever
since. The mechanism was introduced for GC integration, where the runtime could
do something to make Cranelift code hit a trap and pause for a GC and then
resume execution once GC completed. But it is unclear that, as implemented, this
is actually a useful mechanism for doing that (compared to, say, loading from
some well-known page and having the GC protect that page and catch signals to
interrupt the mutator, or simply branching and doing a libcall). And if someone
has that particular use case in the future (Wasmtime and its GC integration
doesn't need exactly this) then we can design something for what is actually
needed at that time, instead of carrying this cruft forward forever.
We can't meaningfully audit the other WebAssembly implementations that
we use for differential fuzzing, such as wasmi and especially v8. Let's
acknowledge that the effort to do so is not practical for us, and focus
our vetting efforts on crates that developers and users are more likely
to build.
This reduces our estimated audit backlog by over three million lines,
according to `cargo vet suggest`.
Note that our crates which depend on those engines, such as
wasmtime-fuzzing, are not published to crates.io, so if we fall victim
to a supply chain attack against dependencies of these crates, the set
of folks who might be impacted is limited.
Although there is value in also auditing code that might be run by
people who clone our git repository, in this case I propose that anyone
who is concerned about the risks of supply chain attacks against their
development systems should be running fuzzers inside a sandbox. After
all, it's a fuzzer: it's specifically designed to try to do anything.
* Add a fuzzer for async wasm
This commit revives a very old branch of mine to add a fuzzer for
Wasmtime in async mode. This work was originally blocked on
llvm/llvm-project#53891 and, while that's still an issue, this commit
now contains a workaround for it. Support for async fuzzing required a good
deal of refactorings and changes, and the highlights are:
* The main part is that new intrinsics,
`__sanitizer_{start,finish}_fiber_switch`, are now invoked around the
stack-switching routines of fibers. This only works on Unix and is set
to only compile when ASAN is enabled (otherwise everything is a noop).
This required refactoring of things to get it all in just the right
way for ASAN since it appears that these functions not only need to be
called but more-or-less need to be adjacent to each other in the code.
My guess is that while we're switching ASAN is in a "weird state" and
it's not ready to run arbitrary code.
* Stacks are a problem. The above issue in LLVM outlines how stacks
cannot be deallocated at this time because if the deallocated virtual
memory is later used for the heap then ASAN will have a false positive
about stack overflow. To deal with this, stacks in ASAN mode use a
special allocation path that never deallocates them. This logic
additionally applies to the pooling allocator, which uses a different
stack allocation strategy under ASAN.
With all of the above a new fuzzer is added. This fuzzer generates an
arbitrary module, selects an arbitrary means of async (e.g.
epochs/fuel), and then tries to execute the exports of the module with
various values. In general the fuzzer is looking for crashes/panics as
opposed to correct answers as there's no oracle here. This is also
intended to stress the code used to switch on and off stacks.
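For reference, the underlying hooks are part of compiler-rt's public sanitizer interface (declared there as `__sanitizer_start_switch_fiber` and `__sanitizer_finish_switch_fiber`); a rough, hypothetical sketch of how such hooks bracket a stack switch, not Wasmtime's actual fiber code, is:

```rust
use std::ffi::c_void;

extern "C" {
    // Declarations from compiler-rt's sanitizer interface; these symbols only
    // exist when the program is built and linked with ASAN enabled.
    fn __sanitizer_start_switch_fiber(
        fake_stack_save: *mut *mut c_void,
        bottom: *const c_void,
        size: usize,
    );
    fn __sanitizer_finish_switch_fiber(
        fake_stack_save: *mut c_void,
        bottom_old: *mut *const c_void,
        size_old: *mut usize,
    );
}

/// Hypothetical wrapper: announce the upcoming switch to ASAN, perform the
/// architecture-specific stack switch, then announce that it finished.
unsafe fn switch_with_asan(
    new_stack_bottom: *const c_void,
    new_stack_size: usize,
    do_switch: impl FnOnce(),
) {
    let mut fake_stack = std::ptr::null_mut();
    __sanitizer_start_switch_fiber(&mut fake_stack, new_stack_bottom, new_stack_size);
    do_switch(); // the real assembly stack switch happens in here
    let mut old_bottom = std::ptr::null();
    let mut old_size = 0;
    __sanitizer_finish_switch_fiber(fake_stack, &mut old_bottom, &mut old_size);
}
```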
* Fix non-async build
* Remove unused import
* Review comments
* Fix compile on MIRI
* Fix Windows build
### The `GcRuntime` and `GcCompiler` Traits
This commit factors out the details of the garbage collector away from the rest
of the runtime and the compiler. It does this by introducing two new traits,
very similar to a subset of [those proposed in the Wasm GC RFC], although not
all equivalent functionality has been added because Wasmtime doesn't
support, for example, GC structs yet:
[those proposed in the Wasm GC RFC]: https://github.com/bytecodealliance/rfcs/blob/main/accepted/wasm-gc.md#defining-the-pluggable-gc-interface
1. The `GcRuntime` trait: This trait defines how to create new GC heaps, run
collections within them, and execute the various GC barriers the collector
requires.
Rather than monomorphize all of Wasmtime on this trait, we use it
as a dynamic trait object. This does imply some virtual call overhead and
missing some inlining (and resulting post-inlining) optimization
opportunities. However, it is *much* less disruptive to the existing embedder
API, results in a cleaner embedder API anyways, and we don't believe that VM
runtime/embedder code is on the hot path for working with the GC at this time
anyways (that would be the actual Wasm code, which has inlined GC barriers
and direct calls and all of that). In the future, once we have optimized
enough of the GC that such code is ever hot, we have options we can
investigate at that time to avoid these dynamic virtual calls, like only
enabling one single collector at build time and then creating a static type
alias like `type TheOneGcImpl = ...;` based on the compile time
configuration, and using this type alias in the runtime rather than a dynamic
trait object.
The `GcRuntime` trait additionally defines a method to reset a GC heap, for
use by the pooling allocator. This allows reuse of GC heaps across different
stores. This integration is very rudimentary at the moment, and is missing
all kinds of configuration knobs that we should have before deploying Wasm GC
in production. This commit is large enough as it is already! Ideally, in the
future, I'd like to make it so that GC heaps receive their memory region,
rather than allocate/reserve it themselves, and let each slot in the pooling
allocator's memory pool be *either* a linear memory or a GC heap. This would
unask various capacity planning questions such as "what percent of memory
capacity should we dedicate to linear memories vs GC heaps?". It also seems
like basically all the same configuration knobs we have for linear memories
apply equally to GC heaps (see also the "Indexed Heaps" section below).
2. The `GcCompiler` trait: This trait defines how to emit CLIF that implements
GC barriers for various operations on GC-managed references. The Rust code
calls into this trait dynamically via a trait object, but since it is
customizing the CLIF that is generated for Wasm code, the Wasm code itself is
not making dynamic, indirect calls for GC barriers. The `GcCompiler`
implementation can inline the parts of GC barrier that it believes should be
inline, and leave out-of-line calls to rare slow paths.
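Purely as an illustration of the split (hypothetical names and signatures, not the actual Wasmtime trait definitions), the shape is roughly:

```rust
// Hypothetical shapes only; the real traits have more methods and different
// signatures.
trait GcRuntime {
    /// Create a new GC heap for a store.
    fn new_heap(&self) -> Box<dyn GcHeap>;
}

trait GcHeap {
    /// Run a collection within this heap.
    fn collect(&mut self);
    /// Reset the heap so the pooling allocator can reuse it in another store.
    fn reset(&mut self);
}

trait GcCompiler {
    /// Emit the (possibly partially inlined) write-barrier CLIF for storing a
    /// GC reference into a GC-managed location; `()` stands in for the
    /// Cranelift IR builder and value types the real method takes.
    fn emit_write_barrier(&mut self, clif_builder: &mut ());
}

// The runtime and compiler then hold `Box<dyn GcRuntime>` / `Box<dyn GcCompiler>`
// trait objects rather than monomorphizing all of Wasmtime over a collector type.
```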
All that said, there is still only a single implementation of each of these
traits: the existing deferred reference-counting (DRC) collector. So there is a
bunch of code motion in this commit as the DRC collector was further isolated
from the rest of the runtime and moved to its own submodule. That said, this was
not *purely* code motion (see "Indexed Heaps" below) so it is worth not simply
skipping over the DRC collector's code in review.
### Indexed Heaps
This commit does bake in a couple assumptions that must be shared across all
collector implementations, such as a shared `VMGcHeader` that all objects
allocated within a GC heap must begin with, but the most notable and
far-reaching of these assumptions is that all collectors will use "indexed
heaps".
What we are calling indexed heaps is basically the following three invariants:
1. All GC heaps will be a single contiguous region of memory, and all GC objects
will be allocated within this region of memory. The collector may ask the
system allocator for additional memory, e.g. to maintain its free lists, but
GC objects themselves will never be allocated via `malloc`.
2. A pointer to a GC-managed object (i.e. a `VMGcRef`) is a 32-bit offset into
the GC heap's contiguous region of memory. We never hold raw pointers to GC
objects (although, of course, we have to compute them and use them
temporarily when actually accessing objects). This means that deref'ing GC
pointers is equivalent to deref'ing linear memory pointers: we need to add a
base and we also check that the GC pointer/index is within the bounds of the
GC heap. Furthermore, compressing 64-bit pointers into 32 bits is a fairly
common technique among high-performance GC
implementations[^compressed-oops][^v8-ptr-compression] so we are in good
company.
3. Anything stored inside the GC heap is untrusted. Even each GC reference that
is an element of an `(array (ref any))` is untrusted, and bounds checked on
access. This means that, for example, we do not store the raw pointer to an
`externref`'s host object inside the GC heap. Instead an `externref` now
stores an ID that can be used to index into a side table in the store that
holds the actual `Box<dyn Any>` host object, and accessing that side table is
always checked.
[^compressed-oops]: See ["Compressed OOPs" in
OpenJDK.](https://wiki.openjdk.org/display/HotSpot/CompressedOops)
[^v8-ptr-compression]: See [V8's pointer
compression](https://v8.dev/blog/pointer-compression).
The good news with regards to all the bounds checking that this scheme implies
is that we can use all the same virtual memory tricks that linear memories use
to omit explicit bounds checks. Additionally, (2) means that the sizes of GC
objects are that much smaller (and therefore that much more cache friendly)
because they are only holding onto 32-bit, rather than 64-bit, references to
other GC objects. (We can, in the future, support GC heaps up to 16GiB in size
without losing 32-bit GC pointers by taking advantage of `VMGcHeader` alignment
and storing aligned indices rather than byte indices, while still leaving the
bottom bit available for tagging as an `i31ref` discriminant. Should we ever
need to support even larger GC heap capacities, we could go to full 64-bit
references, but we would need explicit bounds checks.)
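To make invariant (2) concrete, here is a minimal sketch, using hypothetical types rather than Wasmtime's actual `VMGcRef` API, of what dereferencing a 32-bit GC reference amounts to under this scheme:

```rust
/// Hypothetical 32-bit GC reference: an offset into the GC heap, never a raw pointer.
#[derive(Clone, Copy)]
struct GcIndex(u32);

/// The single contiguous region of memory backing a GC heap.
struct GcHeapRegion {
    base: *mut u8,
    len: usize,
}

impl GcHeapRegion {
    /// Bounds-checked conversion of an index into a temporary raw pointer,
    /// exactly analogous to a linear-memory access: add a base, check the bound.
    fn deref(&self, gc_ref: GcIndex, access_size: usize) -> Option<*mut u8> {
        let offset = gc_ref.0 as usize;
        if offset.checked_add(access_size)? > self.len {
            return None; // out of bounds: treated as a GC heap trap
        }
        Some(unsafe { self.base.add(offset) })
    }
}
```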
The biggest benefit of indexed heaps is that, because we are (explicitly or
implicitly) bounds checking GC heap accesses, and because we are not otherwise
trusting any data from inside the GC heap, we greatly reduce how badly things
can go wrong in the face of collector bugs and GC heap corruption. We are
essentially sandboxing the GC heap region, the same way that linear memory is a
sandbox. GC bugs could lead to the guest program accessing the wrong GC object,
or getting garbage data from within the GC heap. But only garbage data from
within the GC heap, never outside it. The worst that could happen would be if we
decided not to zero out GC heaps between reuse across stores (which is a valid
trade off to make, since zeroing a GC heap is a defense-in-depth technique
similar to zeroing a Wasm stack and not semantically visible in the absence of
GC bugs) and then a GC bug would allow the current Wasm guest to read old GC
data from the old Wasm guest that previously used this GC heap. But again, it
could never access host data.
Taken altogether, this allows for collector implementations that are nearly free
from `unsafe` code, and unsafety can otherwise be targeted and limited in scope,
such as interactions with JIT code. Most importantly, we do not have to maintain
critical invariants across the whole system -- invariants which can't be nicely
encapsulated or abstracted -- to preserve memory safety. Such holistic
invariants that refuse encapsulation are otherwise generally a huge safety
problem with GC implementations.
### `VMGcRef` is *NOT* `Clone` or `Copy` Anymore
`VMGcRef` used to be `Clone` and `Copy`. It is not anymore. The motivation here
was to be sure that I was actually calling GC barriers at all the correct
places. I couldn't be sure before. Now, you can still explicitly copy a raw GC
reference without running GC barriers if you need to and understand why that's
okay (aka you are implementing the collector), but that is something you have to
opt into explicitly by calling `unchecked_copy`. The default now is that you
can't just copy the reference; you must instead call an explicit `clone` method
(not *the* `Clone` trait, because we need to pass in the GC heap context to run
the GC barriers), so it is hard to forget to do that accidentally. This resulted in
a pretty big amount of churn, but I am wayyyyyy more confident that the correct
GC barriers are called at the correct times now than I was before.
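A simplified, hypothetical sketch of the resulting API shape (the real methods take Wasmtime's actual GC heap context and run real barriers):

```rust
struct VMGcRef(u32); // deliberately neither `Clone` nor `Copy`

struct GcHeapContext; // stand-in for the real GC heap / barrier context

impl VMGcRef {
    /// The normal way to duplicate a reference: takes the heap context so the
    /// collector's clone barrier can run.
    fn clone(&self, heap: &mut GcHeapContext) -> VMGcRef {
        let _ = heap; // a real collector would run its GC barrier here
        VMGcRef(self.0)
    }

    /// Escape hatch for collector internals only: copy without running barriers.
    fn unchecked_copy(&self) -> VMGcRef {
        VMGcRef(self.0)
    }
}
```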
### `i31ref`
I started this commit by trying to add `i31ref` support. And it grew into the
whole traits interface because I found that I needed to abstract GC barriers
into helpers anyways to avoid running them for `i31ref`s, so I figured that I
might as well add the whole traits interface. In comparison, `i31ref` support is
much easier and smaller than that other part! But it was also difficult to pull
apart from this commit, sorry about that!
---------------------
Overall, I know this is a very large commit. I am super happy to have some
synchronous meetings to walk through this all, give an overview of the
architecture, answer questions directly, etc... to make review easier!
prtest:full
* Run all `*.wast` tests in fuzzing
Currently we have a `spectest` fuzzer which uses fuzz input to generate
an arbitrary configuration for Wasmtime and then executes the spec test.
This ensures that no matter the configuration Wasmtime can pass spec
tests. This commit expands this testing to include all `*.wast` tests we
have in this repository. While we don't have a ton, we still have some
significant ones, like the one in #8118, which will only reproduce when
turning knobs on CPU features.
* Fix CLI build
* Fix wast testing
* Update some CI dependencies
* Update to the latest nightly toolchain
* Update mdbook
* Update QEMU for cross-compiled testing
* Update `cargo nextest` for usage with MIRI
prtest:full
* Remove lots of unnecessary imports
* Downgrade qemu as 8.2.1 seems to segfault
* Remove more imports
* Remove unused winch trait method
* Fix warnings about unused trait methods
* More unused imports
* More unused imports
First of all, thread a "chaos mode" control-plane into Context::optimize
and from there into EgraphPass, OptimizeCtx, and Elaborator.
In this commit we use the control-plane to change the following
behaviors in ways which shouldn't cause incorrect results:
- Dominator-tree block traversal order for both the rule application and
elaboration passes
- Order of evaluating optimization alternatives from `simplify`
- Choose worst values instead of best in each eclass
Co-authored-by: L. Pereira <lpereira@fastly.com>
This commit fully enables usage of Winch in the `differential` fuzzer
against all other engines with no special cases. I attempted enabling
Winch for the other fuzzers as well, but Winch doesn't currently
implement all the methods required for generating various trampolines,
so it's limited to the `differential` fuzzer for now.
This adds Winch as an "engine" and additionally ensures that, when
Winch is configured, the wasm proposals it doesn't support are disabled
(similar to how enabling `wasmi` disables proposals that `wasmi`
doesn't support).
This does reduce fuzzing of Winch slightly in that the reference-types
proposal is completely disabled for Winch rather than half-enabled where
Winch doesn't implement `externref` operations yet but does implement
`funcref` operations. This, however, enables integrating it more cleanly
into the rest of the fuzzing infrastructure with fewer special cases.
* Update the wasm-tools family of crates
Pulling in some updates to improve how WIT is managed in this
repository. No changes just yet, however, just pulling in the updates
first.
* Fix tests
* Fix fuzzer build
* winch: Add support for WebAssembly loads/stores
Closes https://github.com/bytecodealliance/wasmtime/issues/6529
This patch adds support for all the instructions involving WebAssembly
loads and stores for 32-bit memories. Given that the `memory64` proposal
is not enabled by default, this patch doesn't include an
implementation/tests for it; in theory minimal tweaks to the
current implementation will be needed in order to support 64-bit
memories.
Implementation-wise, this change follows a similar pattern as Cranelift
to calculate addresses for dynamic/static heaps, the main difference
being that in some cases doing less work at compile time is preferred;
for example, the current implementation only checks for the general case
of out-of-bounds access for dynamic heaps.
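As a rough illustration (plain Rust with made-up names, not the actual emitted machine code), the "general case" check for a dynamic-heap access boils down to something like:

```rust
/// Hypothetical helper showing the shape of the bounds check for dynamic
/// heaps: compare the end of the access against the current heap bound and
/// trap before computing the effective address.
fn checked_heap_addr(
    heap_base: usize,
    heap_bound: u64,  // current byte length of the memory, loaded at runtime
    index: u64,       // the Wasm address operand
    offset: u64,      // the static offset immediate
    access_size: u64, // 1, 2, 4, or 8 bytes
) -> Result<usize, &'static str> {
    let end = index
        .checked_add(offset)
        .and_then(|n| n.checked_add(access_size))
        .ok_or("out of bounds memory access")?;
    if end > heap_bound {
        return Err("out of bounds memory access");
    }
    Ok(heap_base + index as usize + offset as usize)
}
```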
Another important detail regarding the implementation is the
introduction of `MacroAssembler::wasm_load` and
`MacroAssembler::wasm_store`, which internally use a common
implementation for loads and stores, the only difference being that the
`wasm_*` variants set the right flags to signal that these operations
are not trusted and might trap.
Finally, given that this change introduces support for the last set of
instructions missing for a Wasm MVP, it removes most of Winch's copy of
the spectest suite, and switches over to using the official test suite
where possible (for tests that don't use SIMD or Reference Types).
Follow-up items:
* Before doing any deep benchmarking I'm planning on landing a couple of
improvements regarding compile times that I've identified in parallel
to this change.
* The `imports.wast` tests are disabled because I've identified a bug
with `call_indirect`, which is not related to this change and exists
in main.
* Find a way to run the `tests/all/memory.rs` (or perhaps most of
integration tests) with Winch.
--
prtest:full
* Review comments
* winch: Multi-Value Part 2: Blocks
This commit adds support for the Multi-Value proposal for blocks.
In general, this change introduces multiple building blocks to enable
supporting arbitrary params and results in blocks:
* `BlockType`: Introduce a block type to categorize each block; this
makes it easier to handle blocks per type and also makes it possible to
defer the calculation of the `ABIResults` until they are actually
needed, rather than calculating everything upfront even though it might
not be needed (when in an unreachable state).
* Push/pop operations are now frame aware. Given that each
`ControlStackFrame` contains all the information needed regarding
params and results, this change moves the implementation of the
push and pop operations to the `ControlStackFrame` struct.
* `StackState`: this struct holds the entry and exit invariants of each
block; these invariants are pre-computed when entering the block and
used throughout code generation to handle params and results and to
assert the respective invariants.
In terms of the mechanics of the implementation: when entering each
block, if there are results on the stack, the expected stack pointer
offsets will be calculated via the `StackState`, and the `target_offset`
will be used to create the block's `RetArea`. Note that when entering
the block and calculating the `StackState`, no space is actually reserved
for the results; any increase in stack space is deferred until the
results are popped from the value stack via
`ControlStackFrame::pop_abi_results`.
The trickiest bit of the implementation is handling constant values that
need to be placed in the right location on the machine stack. Given that
constants are generally not spilled, in order to keep the machine and
value stacks in sync (spilled-values-wise), values must be shuffled to
ensure that constants end up in the locations expected for the results.
See the comment in `ControlStackFrame::adjust_stack_results` for more
details.
* Review fixes
* winch: Add memory instructions
This commit adds support for the following memory instructions to winch:
* `data.drop`
* `memory.init`
* `memory.fill`
* `memory.copy`
* `memory.size`
* `memory.grow`
In general the implementation is similar to how other builtin-based
instructions are handled (e.g. table instructions), which involves stack
manipulation prior to emitting a builtin function call, with the
exception of `memory.size`, which involves loading the current length
from the `VMContext`.
* Emit right shift instead of division to obtain the memory size in pages
This commit tightens the fuzzing criteria for Winch. The previous
implementation only accounted for unsupported instructions. However,
unsupported types can also cause the fuzzer to crash.
Winch currently doesn't support `v128` and most of the `Ref` types.
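A rough sketch of the kind of type predicate this tightening implies (using a simplified, hypothetical value-type enum rather than the fuzzer's real types):

```rust
/// Hypothetical, simplified value-type enum standing in for the real one.
enum Ty {
    I32,
    I64,
    F32,
    F64,
    V128,
    FuncRef,
    ExternRef,
}

/// Conservative check for whether Winch can currently compile a given type.
fn winch_supports_type(ty: &Ty) -> bool {
    match ty {
        Ty::I32 | Ty::I64 | Ty::F32 | Ty::F64 => true,
        // v128 and most reference types are not supported yet, so skip modules
        // whose signatures mention them.
        Ty::V128 | Ty::FuncRef | Ty::ExternRef => false,
    }
}
```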
* Configure Rust lints at the workspace level
This commit adds necessary configuration knobs to have lints configured
at the workspace level in Wasmtime rather than the crate level. This
uses a feature of Cargo first released with 1.74.0 (last week): the
`[workspace.lints]` table. This should help create a more consistent set
of lints applied across all crates in our workspace in addition to
possibly running select clippy lints on CI as well.
* Move `unused_extern_crates` to the workspace level
This commit configures a `deny` lint level for the
`unused_extern_crates` lint to the workspace level rather than the
previous configuration at the individual crate level.
* Move `trivial_numeric_casts` to workspace level
* Change workspace lint levels to `warn`
CI will ensure that these don't get checked into the codebase and
otherwise provide fewer speed bumps for in-process development.
* Move `unstable_features` lint to workspace level
* Move `unused_import_braces` lint to workspace level
* Start running Clippy on CI
This commit configures our CI to run `cargo clippy --workspace` for all
merged PRs. Historically this hasn't been all that feasible due to the
amount of configuration required to control the number of warnings on
CI, but with Cargo's new `[lints]` table it's possible to have a
one-liner to silence all lints from Clippy by default. This commit sets
the `all` lint in Clippy to `allow` to disable warnings from Clippy by
default. The goal of this PR is to enable selective access
to Clippy lints for Wasmtime on CI.
* Selectively enable `clippy::cast_sign_loss`
This would have fixed #7558, so try to head off future issues with that
by warning against this situation in a few crates. This lint is still
quite noisy for Cranelift, for example, so it's not worthwhile at
this time to enable it for the whole workspace.
* Fix CI error
prtest:full
* Update to arbitrary 1.3.1
And use workspace dependencies for arbitrary.
* Prune cargo vet's supply-chain files
These are the mechanical changes made by running `cargo vet prune`, which was
suggested to me when I ran `cargo vet`.
* winch(x64): Add support for table instructions
This change adds support for the following table instructions:
`elem.drop`, `table.copy`, `table.set`, `table.get`, `table.fill`,
`table.grow`, `table.size`, `table.init`.
This change also introduces partial support for the `Ref` WebAssembly
type, more concretely the `Func` heap type, which means that all the
table instructions above only work with this WebAssembly type as of this
change.
Finally, this change is also a small follow up to the primitives
introduced in https://github.com/bytecodealliance/wasmtime/pull/7100,
more concretely:
* `FnCall::with_lib`: tracks the presence of a libcall and ensures that
any result registers are freed right when the call is emitted.
* `MacroAssembler::table_elem_addr` returns an address rather than the
value of the address, making it convenient for other use cases like
`table.set`.
--
prtest:full
* chore: Make stack functions take impl IntoIterator<..>
* Update winch/codegen/src/codegen/call.rs
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* Remove a dangling `dbg!`
* Add comment on branching
---------
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* winch(x64): Call indirect
This change adds support for the `call_indirect` instruction to Winch.
Libcalls are a prerequisite for supporting `call_indirect` in order to
lazily initialize funcrefs. This change adds support for libcalls to
Winch by introducing a `BuiltinFunctions` struct similar to Cranelift's
`BuiltinFunctionSignatures` struct.
In general, libcalls are handled like any other function call, the only
difference being that, since not all the information needed to fulfill
the function call might be known up-front, control is given to the
caller for finalizing the call.
The introduction of function references also involves dealing with
pointer-sized loads and stores, so this change also adds the required
functionality to `FuncEnv` and `MacroAssembler` to be pointer aware,
making it straightforward to derive an `OperandSize` or `WasmType` from
the target's pointer size.
Finally, given the complexity of the `call_indirect` instruction, this
change bundles an improvement to the register allocator, allowing it to
track allocatable vs. non-allocatable registers; this is done to
avoid any mistakes when allocating/de-allocating registers that are not
allocatable.
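A rough sketch of the allocatable-mask idea (hypothetical field and method names, not Winch's actual `RegSet`):

```rust
/// Hypothetical register set: one bit per register, plus a mask of which
/// registers the allocator is ever allowed to hand out.
struct RegSet {
    available: u64,   // bit set => register currently free
    allocatable: u64, // bit set => register may be allocated at all
}

impl RegSet {
    fn request(&mut self, reg: u8) -> bool {
        let bit = 1u64 << reg;
        // Non-allocatable registers (e.g. the stack pointer or a pinned
        // scratch register) are never handed out, even if marked "free".
        if self.allocatable & bit == 0 || self.available & bit == 0 {
            return false;
        }
        self.available &= !bit;
        true
    }

    fn release(&mut self, reg: u8) {
        let bit = 1u64 << reg;
        debug_assert!(self.allocatable & bit != 0, "released a non-allocatable register");
        self.available |= bit;
    }
}
```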
--
prtest:full
* Address review comments
* Fix typos
* Better documentation for `new_unchecked`
* Introduce `max` for `BitSet`
* Make allocatable property `u64`
* winch(calls): Overhaul `FnCall`
This commit simplifies `FnCall`'s interface, making its usage more
uniform throughout the compiler. In summary, this change:
* Avoids side effects in the `FnCall::new` constructor, and also makes
it the only constructor.
* Exposes `FnCall::save_live_registers` and
`FnCall::calculate_call_stack_space` to calculate the stack space
consumed by the call, so that the caller can decide which one to
use at callsites depending on their use-case.
* tests: Fix regset tests
* winch: Support f32.abs and f64.abs on x64
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Add an implementation of f32.neg and f64.neg
* Enable spec tests for winch with f{32,64}.{neg,abs}
* Enable differential fuzzing for f{32,64}.{neg,abs} for winch
* Comments from code review
---------
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* winch: Add support for `br_table`
This change adds support for the `br_table` instruction, including
several modifications to the existing control flow implementation:
* Improved handling of jumps to loops: Previously, the compiler erroneously
treated the result of loop blocks as the definitive result of the jump. This
change fixes this bug.
* Streamlined result handling and stack pointer balancing: In the past, these
operations were executed in two distinct steps, complicating the process of
ensuring the correct invariants when emitting unconditional jumps. To simplify
this, `CodeGenContext::unconditional_jump` is introduced. This function
guarantees all necessary invariants are met, encapsulating the entire operation
within a single function for easier understanding and maintenance.
* Handling of unreachable state at the end of a function: when reaching the end
of a function in an unreachable state, clear the stack and ensure that the
machine stack pointer is correctly placed according to the expectations of the
outermost block.
In addition to the above refactoring, the main implementation of the
`br_table` instruction involves emitting labels for each target. Within each
label, an unconditional jump is emitted to the frame's label, ensuring correct
stack pointer balancing when the jump is emitted.
While it is possible to optimize this process by avoiding intermediate labels
when balancing isn't required, I've opted to maintain the current
implementation until such optimization becomes necessary.
* chore: Rust fmt
* fuzzing: Add `BrTable` to list of supported instructions
* docs: Improve documentation for `unconditional_jump`
* Fix some warnings on nightly Rust
* Fix some more fuzz-test cases from pooling changes
This commit addresses some more fallout from #6835 by updating some
error messages and adding clauses for new conditions. Namely:
* Module compilation is now allowed to fail when the module may have
more memories/tables than the pooling allocator allows per-module.
* The error message for the core instance limit being reached has been
updated.
* winch: Initial support for floats
This change introduces the necessary building blocks to support floats in
Winch as well as support for both `f32.const` and `f64.const` instructions.
To achieve support for floats, this change adds several key enhancements to the
compiler:
* Constant pool: A constant pool is implemented, at the Assembler level, using the machinery
exposed by Cranelift's `VCode` and `MachBuffer`. Float immediates are stored
using their bit representation in the value stack, and whenever they are
used at the MacroAssembler level they are added to the constant pool;
from that point on, they are referenced through a `Constant` addressing
mode, which gets translated to a RIP-relative addressing mode during emission.
* More precise value tagging: aside from immediates, from which the type can
be easily inferred, all the other value stack entries (`Memory`, `Reg`, and `Local`) are
modified to explicitly contain a WebAssembly type. This allows for better
instruction selection.
--
prtest:full
* fix: Account for relative sp position when pushing float regs
This was an oversight of the initial implementation. When pushing float
registers, always return an address that is relative to the current position of
the stack pointer, essentially storing to (%rsp). The previous implementation
accounted for static addresses, which is not correct.
* fix: Introduce `stack_arg_slot_size_for_type`
This correctly calculates the stack argument slot sizes instead of
overallocating `word_bytes`, since for `f32` floating-point values we only need
to worry about loading/storing 4 bytes.
* fix: Correctly type the result register.
The previous version wrongly typed the register as a general purpose register.
* refactor: Re-write `add_constants` through `add_constant`
* docs: Replace old comment
* chore: Rust fmt
* refactor: Index regset per register class
This commit implements `std::ops::{Index, IndexMut}` for `RegSet` to index each
of the bitsets by class. This reduces boilerplate and repetition throughout the
code generation context, register allocator, and register set.
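An illustrative sketch of that impl (simplified stand-in types, not the actual Winch definitions):

```rust
use std::ops::{Index, IndexMut};

/// Hypothetical register classes and per-class bitsets.
#[derive(Clone, Copy)]
enum RegClass {
    Int,
    Float,
}

struct BitSet(u64);

struct RegSet {
    int: BitSet,
    float: BitSet,
}

impl Index<RegClass> for RegSet {
    type Output = BitSet;
    fn index(&self, class: RegClass) -> &BitSet {
        match class {
            RegClass::Int => &self.int,
            RegClass::Float => &self.float,
        }
    }
}

impl IndexMut<RegClass> for RegSet {
    fn index_mut(&mut self, class: RegClass) -> &mut BitSet {
        match class {
            RegClass::Int => &mut self.int,
            RegClass::Float => &mut self.float,
        }
    }
}

// Call sites can then write `regset[RegClass::Float]` instead of matching on
// the class at every use.
```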
* refactor: Correctly size callee saved registers
To comply with the expectations of the underlying architecture: for example, in
AArch64 only the low 64 bits of VRegs are callee-saved (the D-view), while in the
`fastcall` calling convention it's expected that the callee saves the entire 128
bits of registers xmm6-xmm15.
This change also fixes the stores/loads of callee-saved float registers in the
fastcall calling convention, as in the previous implementation only the low 64
bits were saved/restored.
* docs: Add comment regarding typed-based spills
* Wasmtime: Rename `IndexAllocator` to `ModuleAffinityIndexAllocator`
We will have multiple kinds of index allocators soon, so clarify which one this
is.
* Wasmtime: Introduce a simple index allocator
This will be used in future commits refactoring the pooling allocator.
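For illustration, a "simple" index allocator in this sense is essentially a free list over a fixed range of slot indices; a minimal sketch (not the actual Wasmtime type) might look like:

```rust
/// Minimal free-list index allocator: hands out indices in `0..capacity` and
/// recycles whatever has been freed.
struct SimpleIndexAllocator {
    next_unused: u32,
    capacity: u32,
    free: Vec<u32>,
}

impl SimpleIndexAllocator {
    fn new(capacity: u32) -> Self {
        Self { next_unused: 0, capacity, free: Vec::new() }
    }

    fn alloc(&mut self) -> Option<u32> {
        if let Some(index) = self.free.pop() {
            return Some(index); // reuse a previously freed slot
        }
        if self.next_unused < self.capacity {
            let index = self.next_unused;
            self.next_unused += 1;
            return Some(index); // hand out a brand new slot
        }
        None // the pool is fully allocated
    }

    fn free(&mut self, index: u32) {
        self.free.push(index);
    }
}
```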
* Wasmtime: refactor the pooling allocator for components
We used to have one index allocator, an index per instance, and give out N
tables and M memories to every instance regardless of how many tables and
memories they need.
Now we have an index allocator for memories and another for tables. An instance
isn't associated with a single index; each of its memories and tables has its
own index. We allocate exactly as many tables and memories as the instance
actually needs.
Ultimately, this gives us better component support, where a component instance
might have varying numbers of internal tables and memories.
Additionally, you can now limit the number of tables, memories, and core
instances a single component can allocate from the pooling allocator, even if
there is the capacity for that many available. This is to give embedders tools
to limit individual component instances and prevent them from hogging too much
of the pooling allocator's resources.
* Remove unused file
Messed up from rebasing, this code is actually just inline in the index
allocator module.
* Address review feedback
* Fix benchmarks build
* Fix ignoring test under miri
The `async_functions` module is not even compiled-but-ignored with miri; it is
completely `cfg`ed off. Therefore we have to do the same with this test that
imports stuff from that module.
* Fix doc links
* Allow testing utilities to be unused
The exact `cfg`s that unlock the tests that use these are platform- and
feature-dependent and end up being like 5 things and super long. It's simpler to
just allow unused for when we are testing on other platforms or don't have the
compile-time features enabled.
* Debug assert that the pool is empty on drop, per Alex's suggestion
Also fix a couple scenarios where we could leak indices if allocating an index
for a memory/table succeeded but then creating the memory/table itself failed.
* Fix windows compile errors
This commit adds support for the `local.tee` instruction. This change
also introduces a refactoring to the original implementation of
`local.set` to be able to share most of the code for the implementation
of `local.tee`.
This change adds support for the `loop`, `br` and `br_if` instructions
as well as unreachable code handling. Whenever an instruction that
affects reachability is emitted (`br` in the case of this PR), the
compiler will enter into an unreachable code state, essentially ignoring
most of the subsequent instructions. When handling the unreachable code
state some instructions are still observed, in order to determine if
reachability should be restored.
This change, particularly the handling of unreachable code, adds all the
necessary building blocks to the compiler to emit other instructions
that affect reachability (e.g. `unreachable`, `return`).
Address review feedback
* Rename `branch_target` to `is_branch_target`
* Use the visitor pattern to handle unreachable code
Avoid string comparison and split unreachable handling functions
* Add i32.popcnt and i64.popcnt to winch
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
* Add fallback implementation for popcnt
Move popcnt fallback up into the macroassembler.
Share code between 32-bit and 64-bit popcnt
Add Popcnt to winch differential fuzzing
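For reference, the classic SWAR bit-twiddling sequence that such a fallback lowers to, sketched here as plain Rust for the 64-bit case (the 32-bit case uses the same structure with narrower masks), is:

```rust
/// The standard SWAR population count a popcnt fallback typically emits when
/// the hardware instruction is unavailable.
fn popcnt64_fallback(mut x: u64) -> u64 {
    x = x - ((x >> 1) & 0x5555_5555_5555_5555); // per-2-bit counts
    x = (x & 0x3333_3333_3333_3333) + ((x >> 2) & 0x3333_3333_3333_3333); // per-nibble
    x = (x + (x >> 4)) & 0x0f0f_0f0f_0f0f_0f0f; // per-byte
    x.wrapping_mul(0x0101_0101_0101_0101) >> 56 // sum all bytes into the top byte
}

#[test]
fn matches_hardware_popcnt() {
    for v in [0u64, 1, u64::MAX, 0xdead_beef_cafe_f00d] {
        assert_eq!(popcnt64_fallback(v), v.count_ones() as u64);
    }
}
```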
* Use _rr functions where possible
* Avoid using scratch register for popcnt
The scratch register was getting clobbered by the calls to `and`,
so this instead passes in a `CodeGenContext` to the masm's `popcnt`
and lets it handle its own registers.
* Add filetests for the fallback popcnt impls
* address PR comments
* Update filetests
---------
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
* winch(x64): Add support for if/else
This change adds the necessary building blocks to support control flow;
this change also adds support for the `If` / `Else` operators.
This change does not include multi-value support. The idea is to add
support for multi-value across the compiler (functions and blocks) as
a separate future change.
The general gist of the change is to track the presence of control flow
frames as part of the code generation context and emit the corresponding
labels and instructions as control flow blocks are found.
* PR review
* Allocate 64 slots for `ControlStackFrames`
* Explicitly track else branches through an else entry in
`ControlStackFrame`