We have slightly different bounds checks for when Spectre mitigations are
enabled or disabled, so add a knob to our fuzzing machinery to exercise all
cases.
This commit adds support for merging a load with a `{u,s}extend` instruction. On AArch64 the load instructions already do this by default, so we can just emit the regular loads.
See also #8765, which does a similar thing for RISC-V.
This test was disabled because the GitHub Actions Windows Server image
doesn't include the desktop experience. However, it looks like we can
download a standalone WinML binary from the ONNX Runtime project.
Wasi-nn WinML backend and ONNX Runtime backend now share the same test
code as they accept the same input, and they are expected to produce the
same result.
This change also makes the wasi-nn WinML backend a default feature.
prtest:full
* riscv64: Add support for `load+extend` patterns
RISC-V doesn't have sinkable loads per se, but the regular load
instructions support sign / zero extending the loaded values.
We model those here as a sinkable load on the extend instruction.
* riscv64: Clarify sinkable loads on RISC-V
This commit is a partial revert of #8609 to return `wasmtime-wasi` and
`wasmtime-wasi-http` back to using blanket impls. The main change from
before is to change the blanket impls to be in terms of a local newtype
wrapper to avoid trait coherence issues. This is done because otherwise
using the traits before required `&mut dyn WasiView` to exist but
sometimes only a `Foo<'a>` is held which is not easy to get a `&mut dyn
...` view of. By changing to a blanket impl in terms of a newtype
wrapper, `WasiImpl`, it's possible to call `bindgen!`-generated
`add_to_linker_get_host` functions with a return value of
`WasiImpl<Foo<'a>>` which enables hooking into all the generated
bindings.
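As a rough illustration of the newtype-based blanket impl described above, here is a minimal sketch. The names `View`, `Host`, and `Impl` are hypothetical stand-ins (for `WasiView`, a `bindgen!`-generated trait, and `WasiImpl` respectively) and do not reflect the real signatures in `wasmtime-wasi`.

```rust
/// Hypothetical stand-in for a view trait such as `WasiView`.
pub trait View {
    fn name(&self) -> String;
}

/// Hypothetical stand-in for a `bindgen!`-generated host trait.
pub trait Host {
    fn greet(&mut self) -> String;
}

/// Local newtype wrapper, playing the role that `WasiImpl` plays above.
pub struct Impl<T>(pub T);

// The blanket impl is written in terms of the local newtype rather than as
// `impl<T: View> Host for T`, which sidesteps trait coherence problems.
impl<T: View> Host for Impl<T> {
    fn greet(&mut self) -> String {
        format!("hello from {}", self.0.name())
    }
}

/// A short-lived view type with a lifetime, similar in spirit to `Foo<'a>`.
pub struct Foo<'a> {
    pub label: &'a str,
}

impl View for Foo<'_> {
    fn name(&self) -> String {
        self.label.to_string()
    }
}
```

With this shape, a value of type `Impl<Foo<'a>>` implements `Host` directly, so no `&mut dyn View` is needed.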
This commit is a fix to Wasmtime's DWARF processing transform to correct
the meaning of the `.debug_loc` section. This section's addresses are
relative to the `DW_AT_low_pc` entry located in the
`DW_TAG_compile_unit` container, but Wasmtime's construction of this
section didn't take this into account. Instead all addresses in
`.debug_loc` are relative to the start of the compiled object, not to
the start of the compile unit itself. This commit fixes this by
unconditionally describing `DW_TAG_compile_unit` locations with
`DW_AT_ranges` instead of `DW_AT_low_pc`. This ends up fixing debug
information for programs that use `.debug_loc` with multiple
codegen units.
Closes #8752
* Make module ids unique per-process, not per-engine
Currently all instances of `wasmtime::Module` have a unique 64-bit id
embedded into them. This ID was originally only used for affinity in the
pooling allocator as a quick check of module equality. This use case
only required engine-local ids so the initial implementation had an ID
allocator per-engine.
Later, however, this id was reused for `wasmtime::ModuleExport` which
was intended to skip the string lookup for an export at runtime. This
also stored a module id but it did not store an engine identifier. This
meant that it's possible to mix these up and pass the wrong export to
the wrong engine. This behavior can lead to a runtime panic in Wasmtime.
This commit fixes this by making the module identifier be global
per-process instead of per-engine. This mirrors how store IDs are
allocated where they're per-process instead of per-engine. The same
logic for why store IDs are unlikely to be exhausted applies here too
where this 64-bit space of identifiers is unlikely to be exhausted.
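A process-wide ID allocator along these lines can be sketched with a single static atomic counter. The `ModuleId` type and its `new` method below are hypothetical names for illustration, not Wasmtime's actual internals.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical process-wide identifier for a compiled module.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ModuleId(u64);

impl ModuleId {
    /// Allocates a fresh ID from one counter shared by the whole process,
    /// so IDs never collide across engines.
    pub fn new() -> ModuleId {
        static NEXT_ID: AtomicU64 = AtomicU64::new(0);
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        // The 64-bit space is unlikely to be exhausted in practice; this
        // sketch treats wrapping as a bug rather than handling it.
        assert!(id != u64::MAX, "module ID space exhausted");
        ModuleId(id)
    }
}
```

Because the counter is a `static` rather than a field on an engine, two modules created by different engines always compare unequal by ID.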
* Fix warnings
* Fix tests
* Recommend `-O opt-level=0` when debugging wasm
This improves inspection of local variables by avoiding the egraph pass
which doesn't have full fidelity in terms of preserving debug
information.
* Fix example compile
* Fix some aspects of running debug tests in CI
This commit fixes the filter used to run tests in CI, broadening it to
include the whole `debug` module of tests. This fixes a mistake where
recently added tests weren't running in CI.
This then additionally adds a `run-dwarf` filter in CI to run these
tests on any changes to filenames containing "debug" in the name.
* Test out the marker to run dwarf tests in CI
prtest:debug
Tracking GC references and producing stack maps is a significant amount of
complexity in `regalloc2`.
At the same time, GC reference value types are pretty annoying to deal with in
Cranelift itself. We know our `r64` is "actually" just an `i64` pointer, and we
want to do `i64`-y things with it, such as an `iadd` to compute a derived
pointer, but `iadd` only takes integer types and not `r64`s. We investigated
loosening that restriction and it was way too painful given the way that CLIF
type inference and its controlling type vars work. So to compute those derived
pointers, we have to first `bitcast` the `r64` into an `i64`. This is
unfortunate in two ways. First, because of arcane interactions between register
allocation constraints, stack maps, and ABIs this involves inserting unnecessary
register-to-register moves in our generated code which hurts binary size and
performance ever so slightly. Second, and much more seriously, this is a major
footgun. If a GC reference isn't an `r64` right now, then it will not appear in
stack maps, and failure to record a live GC reference in a stack map means that
the collector could reclaim the object while you are still using it, leading to
use-after-free bugs! Very bad. And the mid-end needs to know
*not* to GVN these bitcasts or else we get similar bugs (see
https://github.com/bytecodealliance/wasmtime/pull/8317).
Overall GC references are a painful situation for us today.
This commit is the introduction of an alternative. (Note, though, that we aren't
quite ready to remove the old stack maps infrastructure just yet.)
Instead of preserving GC references all the way through the whole pipeline and
computing live GC references and inserting spills at safepoints for stack maps
all the way at the end of that pipeline in register allocation, the
CLIF-producing frontend explicitly generates its own stack slots and spills for
safepoints. The only thing the rest of the compiler pipeline needs to know is
the metadata required to produce the stack map for the associated safepoint. We
can completely remove `r32` and `r64` from Cranelift and just use plain `i32`
and `i64` values. Or `f64` if the runtime uses NaN-boxing, which the old stack
maps system did not support at all. Or 32-bit GC references on a 64-bit target,
which was also not supported by the old system. Furthermore, we *cannot* get
miscompiles due to GVN'ing bitcasts that shouldn't be GVN'd because there aren't
any bitcasts hiding GC references from stack maps anymore. And in the case of a
moving GC, we don't need to worry about the mid-end doing illegal code motion
across calls that could have triggered a GC that invalidated the moved GC
reference because frontends will reload their GC references from the stack slots
after the call, and that loaded value simply isn't a candidate for GVN with the
previous version. We don't have to worry about those bugs by construction.
So everything gets a lot easier under this new system.
But this commit doesn't mean we are 100% done and ready to transition to the new
system, so what is actually in here?
* CLIF producers can mark values as needing to be present in a stack map if they
are live across a safepoint in `cranelift-frontend`. This is the
`FunctionBuilder::declare_needs_stack_map` method.
* When we finalize the function we are building, we do a simple, single-pass
liveness analysis to determine the set of GC references that are live at each
safepoint, and then we insert spills to explicit stack slots just before the
safepoint. We intentionally trade away the precision of a fixed-point liveness
analysis for the speed and simplicity of a single-pass implementation.
* We annotate the safepoint with the metadata necessary to construct its
associated stack map. This is the new
`cranelift_codegen::ir::DataFlowGraph::append_user_stack_map_entry` method and
all that stuff.
* These stack map entries are part of the CLIF and can be roundtripped through
printing and parsing CLIF.
* Each stack map entry describes a GC-managed value that is on the stack and how
to locate it: its type, the stack slot it is located within, and the offset
within that stack slot where it resides. Different stack map entries for the
same safepoint may have different types or a different width from the target's
pointer.
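To make the shape of a stack map entry concrete, here is a small model of the "type, stack slot, offset" triple described above. All types here are hypothetical simplifications, not the real `cranelift_codegen` definitions, and packing every live value of a safepoint into one slot is an assumption of this sketch.

```rust
/// Hypothetical value types a GC-managed value might have.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Type {
    I32,
    I64,
    F64,
}

impl Type {
    /// Size of the type in bytes.
    pub fn bytes(self) -> u32 {
        match self {
            Type::I32 => 4,
            Type::I64 | Type::F64 => 8,
        }
    }
}

/// Hypothetical symbolic stack slot identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StackSlot(pub u32);

/// One entry: what the value is and where it lives at the safepoint.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StackMapEntry {
    pub ty: Type,
    pub slot: StackSlot,
    pub offset: u32,
}

/// Lays out the values live across one safepoint inside a single slot,
/// aligning each value to its own size. Returns the entries and the
/// number of bytes the slot needs.
pub fn entries_for_safepoint(slot: StackSlot, live: &[Type]) -> (Vec<StackMapEntry>, u32) {
    let mut offset = 0;
    let mut entries = Vec::new();
    for &ty in live {
        let size = ty.bytes();
        // Round the offset up to the next multiple of the value's size.
        offset = (offset + size - 1) / size * size;
        entries.push(StackMapEntry { ty, slot, offset });
        offset += size;
    }
    (entries, offset)
}
```

Note how entries for the same safepoint can mix 32-bit and 64-bit values, which is the flexibility the last bullet above calls out.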
Here is what is *not* handled yet, and left for future follow up commits:
* Lowering the stack map entries' locations from symbolic stack slot and offset
pairs to physical stack frame offsets after register allocation.
* Coalescing and aggregating the safepoints and their raw stack map entries into
a compact PC-to-stack-map table during emission.
* Supporting moving GCs. Right now we generate spills into stack slots for live
GC references just before safepoints, but we don't reload the GC references from
the stack upon their next use after the safepoint. This involves rewriting uses
of the old, spilled values which could be a little finicky, but we think we have
a good approach.
* Port Wasmtime over to using this new stack maps system.
* Removing the old stack map system, including `r{32,64}` from Cranelift and GC
reference handling from `regalloc2`. (For the time being, the new system
generally refers to "user stack maps" to disambiguate from the old system where
it might otherwise be confusing.) If we wanted to remove the old system now,
that would require us to also port Wasmtime to the new system now, and we'd end
up with a monolithic PR. Better to do this incrementally and temporarily have
the old and in-progress new system overlap for a short period of time.
Co-authored-by: Trevor Elliott <telliott@fastly.com>
This commit resolves an assert in the DWARF generation for core wasm
modules when the module has a defined linear memory which is flagged
`shared`. This is represented slightly differently in the `VMContext`
than owned memories that aren't `shared`, and looks more like an
imported memory. With support in #8740 it's now much easier to support
this.
Closes #8652
This commit is a fix to the WASIp1 adapter for components to better
handle the case where the host does not use a utf-8 string encoding.
This is never the case for `wasmtime`-the-crate since it's a Rust-based
host but this adapter is used outside of Wasmtime in jco, for example,
where JS is not utf-8-based. When transcoding from utf-16 to utf-8 hosts
may make an overlarge allocation and then shrink to a smaller
allocation. This shrinking step has never been supported by the adapter
and it's always aborted in this case.
Aside: why is this only a problem now? This hasn't been an issue before
now because jco bindings never actually shrank. In doing so however this
violated the canonical ABI because allocations are guaranteed to be
precisely sized. New debug assertions in newer versions of Rust caught
this mistake. This means that when jco tried to add downsizing of the
allocation it quickly hit this panic in the adapter.
The fix in this commit is to handle the specific case of shrinking
memory. The specific fix is to simply ignore the shrinking of memory.
Why this seems to work out well enough for now is pretty subtle
(and it's probably still buggy). For now though this is enough to
get jco's test suite passing with a shrinking allocation.
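The "ignore the shrink" behavior can be sketched with a toy bump allocator. This is a hypothetical model, not the adapter's real allocator: the `Bump` type, its offsets-instead-of-pointers interface, and the choice to reject growth are all assumptions made to keep the example small.

```rust
/// Hypothetical, simplified bump allocator handing out offsets into an arena.
pub struct Bump {
    next: usize,
    capacity: usize,
}

impl Bump {
    pub fn new(capacity: usize) -> Bump {
        Bump { next: 0, capacity }
    }

    /// Models `realloc(old_ptr, old_size, new_size)`; `None` means failure.
    pub fn realloc(&mut self, old_ptr: usize, old_size: usize, new_size: usize) -> Option<usize> {
        if old_size != 0 && new_size <= old_size {
            // Shrinking an existing allocation: ignore the request and
            // leave the allocation where it is.
            return Some(old_ptr);
        }
        if old_size != 0 {
            // Growing an existing allocation is not supported in this sketch.
            return None;
        }
        // Fresh allocation: bump the cursor if there is room.
        let end = self.next.checked_add(new_size)?;
        if end > self.capacity {
            return None;
        }
        let ptr = self.next;
        self.next = end;
        Some(ptr)
    }
}
```

The shrunk-away tail is simply wasted here, which is the trade-off of ignoring the shrink instead of tracking the smaller size.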
Unfortunately I don't know of a way to test this in this repository.
Wasmtime does not support multiple encodings of host strings, only guest
strings. This means that there's no wasmtime-based way to pass a
non-utf-8 string into a guest.
This commit updates the native-DWARF processing (the `-D debug-info` CLI
flag) to support components. Previously component support was not
implemented and if there was more than one core wasm module within a
component then dwarf would be ignored entirely.
This commit contains a number of refactorings to plumb a more full
compilation context throughout the dwarf processing pipeline. Previously
the data structures used only were able to support a single module. A
new `Compilation` structure is used to represent the results of an
entire compilation and is plumbed through the various locations. Most of
the refactorings in this commit were then to extend loops to loop over
more things and handle the case where there is more than one core wasm
module.
I'll admit I'm not an expert on DWARF but basic examples appear to work
locally and most of the additions here seemed relatively straightforward
in terms of "add another loop to iterate over more things" but I'm not
100% sure how well this will work. In theory this now supports
concatenating DWARF sections across multiple core wasm modules, but
that's not super well tested.
This is more-or-less a prerequisite for #8652 and extends the generated
dwarf with expressions to not only dereference owned memories but
additionally imported memories which involve some extra address
calculations to be emitted in the dwarf.
Previously this extractor would additionally match `f32const` and
`f64const`, which, while theoretically not the end of the world, can be
confusing. Switch it to instead match only the `iconst` instruction, as
the name implies. If matching float constants is necessary, that is
probably best done through separate ISLE lowering rules.
Closes #8723
This reverts part of #7863 which was a misunderstanding on my part
about `SO_REUSEADDR`. I think I mixed it up with `SO_REUSEPORT`. Without
`SO_REUSEADDR` it's possible to have this happen on Unix:
* Start a `wasmtime serve` session
* Connect to the session with `nc`
* Kill the server
* Restarting the server no longer works
Due to the active TCP connection at the time the server was killed the
socket/address is in the `TIME_WAIT` state which apparently prevents
rebinding the port until the OS has a chance to clean up everything.
Setting `SO_REUSEADDR` enables rebinding the address quickly.
Now #7863 was trying to fix #7852, which said that it was
possible to have multiple `wasmtime serve` instances binding the same
port. That can lead to confusion and is generally something we don't
want. That being said #7863 only fixed the issue for Windows but ended
up making Unix worse. This PR restores more reasonable behavior for both
Unix and Windows by conditionally setting the `SO_REUSEADDR` based on
the platform.
This PR additionally adds two new tests, both for rebinding an in-use
port quickly and additionally ensuring the same port can't be listened
to twice.
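The second of those tests, that the same port can't be listened to twice, can be approximated with the standard library alone. This is a sketch using `std::net::TcpListener`, not the actual test added for `wasmtime serve`, and it does not show the platform-specific socket option handling itself.

```rust
use std::io;
use std::net::TcpListener;

/// Binds an ephemeral loopback port, then tries to bind the exact same
/// address again while the first listener is still alive. Returns whether
/// the second bind was rejected.
pub fn second_bind_is_rejected() -> io::Result<bool> {
    let first = TcpListener::bind("127.0.0.1:0")?;
    let addr = first.local_addr()?;
    // `first` is still in scope here, so the port is actively in use.
    let second = TcpListener::bind(addr);
    Ok(second.is_err())
}
```

Using port `0` lets the OS pick a free port, so the check does not depend on any particular port being available.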
* Complete implementation in wasmtime
* Get the impl of IntoFunc to point at the new HostContext from_closure method to tie it back to the new implementation
* A little bit of cleanup to comments and naming
* Update doc comment for Func wrap_async
This is another take at improving the documentation for `bindgen!` in
Wasmtime. This commit takes a leaf out of the book of
bytecodealliance/wit-bindgen#871 to organize the documentation of the
macro a bit more rather than having one giant doc block that can be
difficult to explore. The macro's documentation itself is now mostly a
reference of all the options that can be specified. There is now a new
documentation-only module which serves a few purposes:
* Individual examples are organized per-submodule to be a bit more
  digestible.
* Each example has an example of the generated code in addition to the
source code used for each example.
* All examples are tested on CI to compile (none are run).
My hope is that this makes it easier to expand the docs here further
over time with niche features as they arise or with various options that
the macro has. This is one of the lynchpins of Wasmtime's support for
the component model so it seems pretty important to have a good
onboarding experience here.
Along the way I've implemented a few more niche options for the
`bindgen!` macro that I found necessary, such as configuring the
`wasmtime` crate and where it's located.
Currently with the `bindgen!` macro when the `with` key is used then the
generated bindings are different than if the `with` key was not used.
Beyond the replacements themselves, the generated bindings are missing
two key pieces:
* In the generated `add_to_linker` functions bounds and invocations of
`with`-overridden interfaces are all missing. This means that the
generated `add_to_linker` functions don't actually represent the full
world.
* The generated module hierarchy has "holes" for all the modules that
are overridden. While it's mostly a minor inconvenience it's also easy
enough to generate everything via `pub use` to have everything hooked
up correctly.
After this PR it means that each `bindgen!` macro should, in isolation,
work for any other `bindgen!` macro invocation. It shouldn't be
necessary to weave things together and remember how each macro was
invoked along the way. This is primarily to unblock #8715 which is
running into a case where tcp/udp are generated as sync but their
dependency, `wasi:sockets/network`, is used from an upstream async
version. The generated `add_to_linker` does not compile because the
`wasi:sockets/network` interface that tcp/udp depend on isn't added to the
linker. After fixing that it then required more modules to be generated,
hence this PR.
* riscv64: Optimize storing zero to memory by using `zero_reg`
This commit is similar to #8701 in that it adds a special case to
`store` operations to use the `zero` register when applicable.
* Fix lowerings of `istore{8,16,32}`
* Add test for compressed stores
* Remove the borrow checking from `wiggle` entirely
This commit is a refactoring of the `wiggle` crate which powers the
`*.witx`-based bindings generation of Wasmtime for wasip1 support.
Originally `wiggle` had a full-blown runtime borrow checker which
verified that borrows were disjoint when appropriate. In #8277 this was
removed in favor of a more coarse "either all shared or all mutable"
guarantee. It turns out that this exactly matches what the Rust type
system guarantees at compile time as well.
This commit removes all runtime borrow checking in favor of compile-time
borrow checking instead. This means that there is no longer the
possibility of a runtime error arising from borrowing errors. Current
bindings in Wasmtime needed no restructuring to work with this new API.
The source of the refactors here are all in the `wiggle` crate. Changes
include:
* The `GuestPtr` type lost its type parameter. Additionally it only
contains a `u32` pointer now instead.
* The `GuestMemory` trait is replaced with a simple `enum` of
possibilities.
* Helper methods on `GuestPtr` are all moved to `GuestMemory`.
* A number of abstractions were simplified now that borrow checking is
no longer necessary.
* Generated trait methods now all take `&mut GuestMemory<'_>` as an
argument.
These changes were then propagated to the `wasmtime-wasi` and
`wasi-common` crates in their preview0 and preview1 implementations of
WASI. All changes are just general refactors, no functional change is
intended here.
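The move from runtime to compile-time borrow checking can be illustrated with a toy model. The types below reuse the names `GuestPtr` and `GuestMemory` for readability, but they are hypothetical simplifications: the variant names, the use of `AtomicU8` for shared memory, and the helper methods are assumptions, not the real `wiggle` API.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Toy guest pointer: just a 32-bit offset into guest memory.
#[derive(Clone, Copy, Debug)]
pub struct GuestPtr(pub u32);

/// Toy guest memory as a plain enum of possibilities.
pub enum GuestMemory<'a> {
    /// Memory owned by one thread at a time.
    Unshared(&'a mut [u8]),
    /// Memory that other threads may also access.
    Shared(&'a [AtomicU8]),
}

impl GuestMemory<'_> {
    /// Reads one byte, or `None` if the pointer is out of bounds.
    pub fn read_u8(&self, ptr: GuestPtr) -> Option<u8> {
        let i = ptr.0 as usize;
        match self {
            GuestMemory::Unshared(bytes) => bytes.get(i).copied(),
            GuestMemory::Shared(bytes) => bytes.get(i).map(|b| b.load(Ordering::Relaxed)),
        }
    }

    /// Writes one byte, or returns `None` if the pointer is out of bounds.
    pub fn write_u8(&mut self, ptr: GuestPtr, val: u8) -> Option<()> {
        let i = ptr.0 as usize;
        match self {
            GuestMemory::Unshared(bytes) => {
                *bytes.get_mut(i)? = val;
            }
            GuestMemory::Shared(bytes) => bytes.get(i)?.store(val, Ordering::Relaxed),
        }
        Some(())
    }
}

/// Shaped like a generated method: it takes `&mut GuestMemory<'_>`, so the
/// Rust borrow checker rules out overlapping borrows at compile time.
pub fn fill(memory: &mut GuestMemory<'_>, ptr: GuestPtr, len: u32, val: u8) -> Option<()> {
    for i in 0..len {
        memory.write_u8(GuestPtr(ptr.0.checked_add(i)?), val)?;
    }
    Some(())
}
```

Since every access goes through `&self` or `&mut self` on the memory, there is no separate runtime bookkeeping of outstanding borrows and no borrow error to report at runtime.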
* Review comments
* Fix publishing of wiggle-macro crate
* Fix wiggle docs
This commit updates the wit-bindgen family of crates and additionally
adds a `features` key to the `bindgen!` macro. This brings it in line
with `wit-bindgen`'s support which enables usage of the new `@since` and
`@unstable` features of WIT.
The original idea of the filtering here was to avoid passing characters
such as the NUL byte, whitespaces, or other things that could cause
issues to tools that will consume this data. Since it's impractical
to update the list of allowed characters every time a language that
compiles to Wasm generates symbols slightly differently, allow all
graphic ASCII characters instead.
(Tests don't need updating.)
Signed-off-by: L. Pereira <l.pereira@fastly.com>
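A filter of the kind described above can be sketched with `char::is_ascii_graphic`, which accepts U+0021 through U+007E. The function name `sanitize_symbol` and the choice of `?` as the replacement character are assumptions for illustration, not necessarily what the real code does.

```rust
/// Hypothetical sanitizer: keeps every graphic ASCII character and replaces
/// anything else (NUL, whitespace, control, non-ASCII) with `?`.
pub fn sanitize_symbol(name: &str) -> String {
    name.chars()
        .map(|c| if c.is_ascii_graphic() { c } else { '?' })
        .collect()
}
```

Punctuation such as `:`, `<`, and `>` passes through untouched, so symbols from different source languages need no per-language allow-list.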
* riscv64: Special-case `f32const 0` and `f64const 0`
This commit is inspired by discussion on #8695 which made me remember
the discussion around #7162 historically. In lieu of a deeper fix for
the issue of "why can't `iconst 0` use `(zero_reg)`" it's still possible
to add special-cases to rules throughout the backend so this commit does
that for generating zero-value floats.
* Fix tests
* Run all tests on CI
prtest:full
* ci: Add support for RISC-V Zicond extension
* riscv64: Add Zicond Extension and Instructions
* riscv64: Add base rules for czero instruction
* riscv64: Support swapped arguments for `czero` instruction
* riscv64: Add generic select lowering using zicond
* riscv64: Update ISA Tests
* riscv64: Fix incorrect zero swap rule
Additionally add some checks to prevent a stack overflow in the zero
swap rules
* riscv64: Gate select_xreg zero rotation rules to zicond
* riscv64: Add ZiCond select base case for all integer compares
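For readers unfamiliar with Zicond, the semantics of its two instructions and the generic branchless select built from them can be modeled in plain Rust. This is a model of the instruction semantics only, not the ISLE lowering rules themselves.

```rust
/// `czero.eqz rd, rs1, rs2`: zero if `rs2` is zero, otherwise `rs1`.
pub fn czero_eqz(rs1: u64, rs2: u64) -> u64 {
    if rs2 == 0 {
        0
    } else {
        rs1
    }
}

/// `czero.nez rd, rs1, rs2`: zero if `rs2` is nonzero, otherwise `rs1`.
pub fn czero_nez(rs1: u64, rs2: u64) -> u64 {
    if rs2 != 0 {
        0
    } else {
        rs1
    }
}

/// Generic select: `x` if `cond` is nonzero, otherwise `y`.
/// Exactly one of the two operands of the OR is zeroed out.
pub fn select(cond: u64, x: u64, y: u64) -> u64 {
    czero_eqz(x, cond) | czero_nez(y, cond)
}
```

When one arm of the select is already zero, a single `czero` suffices, which is the kind of case the base and swapped-argument rules above are concerned with.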
This commit switches to add `all-features = true` for both the
`wasmtime` and `wasmtime-environ` crates on docs.rs. I occasionally look
at `wasmtime-environ` online as it can be helpful for exploration and
otherwise the docs are empty by default. Using `all-features` for
Wasmtime additionally includes APIs specific to Winch, for example.
* Enable rustc's `unused-lifetimes` lint
This lint is allow-by-default but doesn't seem to have any false positives in
Wasmtime's codebase so enable it by default to help clean up vestiges of
old refactorings.
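As a small illustration of what the lint catches, here is a sketch of a function after cleanup. The module and function names are made up for this example, and the lint is enabled on a single module here rather than crate-wide.

```rust
#[deny(unused_lifetimes)]
pub mod cleaned_up {
    // Before cleanup this might have been declared as
    // `pub fn first_word<'a>(s: &str) -> &str`, where `'a` is declared but
    // never used; with the lint enabled that declaration is rejected.
    pub fn first_word(s: &str) -> &str {
        s.split_whitespace().next().unwrap_or("")
    }
}
```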
* Remove another unused lifetime
* Remove another unused lifetime
* wasmtime: Introduce the `test-macros` crate
This commit introduces the `test-macros` crate. The whole idea behind this crate is to export one or more macros to make it easier to configure Wasmtime for integration tests. The main use-case at this time is running a subset of the integration tests with Cranelift and Winch. This crate could be extended to serve other use-cases, like testing pooling allocator and/or GC configurations.
This commit introduces a single example of how this macro could be used. If there's agreement on merging this change in some shape or form, I'll follow up with migrating the current tests to use `#[wasmtime_test]` where applicable.
Part of what's implemented in this PR was discussed in Cranelift's meeting on [April 24th, 2024](https://github.com/bytecodealliance/meetings/blob/main/cranelift/2024/cranelift-04-24.md), however there are several discussion points that are still "in the air", like for example, what's the best way to avoid the combinatorial explosion problem for the potential test matrix.
* Add crate license
* Clippy fixes
* Remove test-macros from members
* Clean up test-macros Cargo.toml
- Fix version
- Add `[lints]`
- Add publish key
* Add `TestConfig` and simplify parsing
This commit adds a `TestConfig` struct that holds the supported Wasmtime
test configuration. Additionally this commit introduces a partial
function parser, in which only the attributes, visibility, and signature
are fully parsed, leaving the function body as an opaque `TokenStream`.
A bunch of the hardcoded sizes were only multiples of 4k, so they'd
crash on hosts with a different page size. Fix this by making them all
based on the runtime page size instead.
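Deriving sizes from the page size instead of hardcoding 4k multiples might look like the sketch below. The helper names are hypothetical, and the page size is passed in as a parameter because the Rust standard library has no portable way to query it; in practice it would come from the OS.

```rust
/// Rounds `size` up to a multiple of `page_size`, which must be a power of
/// two. Returns `None` on overflow.
pub fn round_up_to_page(size: usize, page_size: usize) -> Option<usize> {
    assert!(page_size.is_power_of_two());
    let mask = page_size - 1;
    Some(size.checked_add(mask)? & !mask)
}

/// Size in bytes of `pages` host pages, or `None` on overflow.
pub fn pages_to_bytes(pages: usize, page_size: usize) -> Option<usize> {
    pages.checked_mul(page_size)
}
```

A size that is fine on a 4 KiB host, such as 4097 bytes rounding to 8192, rounds to 16384 on a 16 KiB host, so tests expressed in pages behave the same on both.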