* Move wast tests to their own test suite
This commit moves testing of `*.wast` files out of the `all` test suite
binary and into its own separate binary. The motivation for this is
well-described in #4861 with one of the chief reasons being that if the
test suite is run and then a new file is added re-running the test suite
won't see the file.
The `libtest-mimic` crate provides an easy way of regaining most of the
features of the `libtest` harness such as parallel test execution and
filters, meaning that it's pretty easy to switch everything over. The
only slightly-tricky bit was redoing the filter for whether a test is
ignored or not, but most of the pieces were copied over from the
previous `build.rs` logic.
Closes#4861
* Fix the `all` suite
* Review comments
These are all passing now that we support typed function references in the
embedder API and our `wasmparser` and `wast` deps have been updated to versions
that fix the issues referenced in the old comments.
\### The `GcRuntime` and `GcCompiler` Traits
This commit factors out the details of the garbage collector away from the rest
of the runtime and the compiler. It does this by introducing two new traits,
very similar to a subset of [those proposed in the Wasm GC RFC], although not
all equivalent functionality has been added yet because Wasmtime doesn't
support, for example, GC structs yet:
[those proposed in the Wasm GC RFC]: https://github.com/bytecodealliance/rfcs/blob/main/accepted/wasm-gc.md#defining-the-pluggable-gc-interface
1. The `GcRuntime` trait: This trait defines how to create new GC heaps, run
collections within them, and execute the various GC barriers the collector
requires.
Rather than monomorphize all of Wasmtime on this trait, we use it
as a dynamic trait object. This does imply some virtual call overhead and
missing some inlining (and resulting post-inlining) optimization
opportunities. However, it is *much* less disruptive to the existing embedder
API, results in a cleaner embedder API anyways, and we don't believe that VM
runtime/embedder code is on the hot path for working with the GC at this time
anyways (that would be the actual Wasm code, which has inlined GC barriers
and direct calls and all of that). In the future, once we have optimized
enough of the GC that such code is ever hot, we have options we can
investigate at that time to avoid these dynamic virtual calls, like only
enabling one single collector at build time and then creating a static type
alias like `type TheOneGcImpl = ...;` based on the compile time
configuration, and using this type alias in the runtime rather than a dynamic
trait object.
The `GcRuntime` trait additionally defines a method to reset a GC heap, for
use by the pooling allocator. This allows reuse of GC heaps across different
stores. This integration is very rudimentary at the moment, and is missing
all kinds of configuration knobs that we should have before deploying Wasm GC
in production. This commit is large enough as it is already! Ideally, in the
future, I'd like to make it so that GC heaps receive their memory region,
rather than allocate/reserve it themselves, and let each slot in the pooling
allocator's memory pool be *either* a linear memory or a GC heap. This would
unask various capacity planning questions such as "what percent of memory
capacity should we dedicate to linear memories vs GC heaps?". It also seems
like basically all the same configuration knobs we have for linear memories
apply equally to GC heaps (see also the "Indexed Heaps" section below).
2. The `GcCompiler` trait: This trait defines how to emit CLIF that implements
GC barriers for various operations on GC-managed references. The Rust code
calls into this trait dynamically via a trait object, but since it is
customizing the CLIF that is generated for Wasm code, the Wasm code itself is
not making dynamic, indirect calls for GC barriers. The `GcCompiler`
implementation can inline the parts of GC barrier that it believes should be
inline, and leave out-of-line calls to rare slow paths.
All that said, there is still only a single implementation of each of these
traits: the existing deferred reference-counting (DRC) collector. So there is a
bunch of code motion in this commit as the DRC collector was further isolated
from the rest of the runtime and moved to its own submodule. That said, this was
not *purely* code motion (see "Indexed Heaps" below) so it is worth not simply
skipping over the DRC collector's code in review.
\### Indexed Heaps
This commit does bake in a couple assumptions that must be shared across all
collector implementations, such as a shared `VMGcHeader` that all objects
allocated within a GC heap must begin with, but the most notable and
far-reaching of these assumptions is that all collectors will use "indexed
heaps".
What we are calling indexed heaps are basically the three following invariants:
1. All GC heaps will be a single contiguous region of memory, and all GC objects
will be allocated within this region of memory. The collector may ask the
system allocator for additional memory, e.g. to maintain its free lists, but
GC objects themselves will never be allocated via `malloc`.
2. A pointer to a GC-managed object (i.e. a `VMGcRef`) is a 32-bit offset into
the GC heap's contiguous region of memory. We never hold raw pointers to GC
objects (although, of course, we have to compute them and use them
temporarily when actually accessing objects). This means that deref'ing GC
pointers is equivalent to deref'ing linear memory pointers: we need to add a
base and we also check that the GC pointer/index is within the bounds of the
GC heap. Furthermore, compressing 64-bit pointers into 32 bits is a fairly
common technique among high-performance GC
implementations[^compressed-oops][^v8-ptr-compression] so we are in good
company.
3. Anything stored inside the GC heap is untrusted. Even each GC reference that
is an element of an `(array (ref any))` is untrusted, and bounds checked on
access. This means that, for example, we do not store the raw pointer to an
`externref`'s host object inside the GC heap. Instead an `externref` now
stores an ID that can be used to index into a side table in the store that
holds the actual `Box<dyn Any>` host object, and accessing that side table is
always checked.
[^compressed-oops]: See ["Compressed OOPs" in
OpenJDK.](https://wiki.openjdk.org/display/HotSpot/CompressedOops)
[^v8-ptr-compression]: See [V8's pointer
compression](https://v8.dev/blog/pointer-compression).
The good news with regards to all the bounds checking that this scheme implies
is that we can use all the same virtual memory tricks that linear memories use
to omit explicit bounds checks. Additionally, (2) means that the sizes of GC
objects is that much smaller (and therefore that much more cache friendly)
because they are only holding onto 32-bit, rather than 64-bit, references to
other GC objects. (We can, in the future, support GC heaps up to 16GiB in size
without losing 32-bit GC pointers by taking advantage of `VMGcHeader` alignment
and storing aligned indices rather than byte indices, while still leaving the
bottom bit available for tagging as an `i31ref` discriminant. Should we ever
need to support even larger GC heap capacities, we could go to full 64-bit
references, but we would need explicit bounds checks.)
The biggest benefit of indexed heaps is that, because we are (explicitly or
implicitly) bounds checking GC heap accesses, and because we are not otherwise
trusting any data from inside the GC heap, we greatly reduce how badly things
can go wrong in the face of collector bugs and GC heap corruption. We are
essentially sandboxing the GC heap region, the same way that linear memory is a
sandbox. GC bugs could lead to the guest program accessing the wrong GC object,
or getting garbage data from within the GC heap. But only garbage data from
within the GC heap, never outside it. The worse that could happen would be if we
decided not to zero out GC heaps between reuse across stores (which is a valid
trade off to make, since zeroing a GC heap is a defense-in-depth technique
similar to zeroing a Wasm stack and not semantically visible in the absence of
GC bugs) and then a GC bug would allow the current Wasm guest to read old GC
data from the old Wasm guest that previously used this GC heap. But again, it
could never access host data.
Taken altogether, this allows for collector implementations that are nearly free
from `unsafe` code, and unsafety can otherwise be targeted and limited in scope,
such as interactions with JIT code. Most importantly, we do not have to maintain
critical invariants across the whole system -- invariants which can't be nicely
encapsulated or abstracted -- to preserve memory safety. Such holistic
invariants that refuse encapsulation are otherwise generally a huge safety
problem with GC implementations.
\### `VMGcRef` is *NOT* `Clone` or `Copy` Anymore
`VMGcRef` used to be `Clone` and `Copy`. It is not anymore. The motivation here
was to be sure that I was actually calling GC barriers at all the correct
places. I couldn't be sure before. Now, you can still explicitly copy a raw GC
reference without running GC barriers if you need to and understand why that's
okay (aka you are implementing the collector), but that is something you have to
opt into explicitly by calling `unchecked_copy`. The default now is that you
can't just copy the reference, and instead call an explicit `clone` method (not
*the* `Clone` trait, because we need to pass in the GC heap context to run the
GC barriers) and it is hard to forget to do that accidentally. This resulted in
a pretty big amount of churn, but I am wayyyyyy more confident that the correct
GC barriers are called at the correct times now than I was before.
\### `i31ref`
I started this commit by trying to add `i31ref` support. And it grew into the
whole traits interface because I found that I needed to abstract GC barriers
into helpers anyways to avoid running them for `i31ref`s, so I figured that I
might as well add the whole traits interface. In comparison, `i31ref` support is
much easier and smaller than that other part! But it was also difficult to pull
apart from this commit, sorry about that!
---------------------
Overall, I know this is a very large commit. I am super happy to have some
synchronous meetings to walk through this all, give an overview of the
architecture, answer questions directly, etc... to make review easier!
prtest:full
* winch: Overhaul the internal ABI
This change overhauls Winch's ABI. This means that as part of this change, the default ABI now closely resembles Cranelift's ABI, particularly on the treatment of the VMContext. This change also fixes many wrong assumptions about trampolines, which are tied to how the previous ABI operated.
The main motivation behind this change is:
* To make it easier to integrate Winch-generated functions with Wasmtime
* Fix fuzz bugs related to imports
* Solidify the implementation regarding the usage of a pinned register to hold the VMContext value throughout the lifetime of a function.
The previous implementation had the following characteristics, and wrong assumptions):
* Assumed that nternal functions don't receive a caller or callee VMContexts as parameters.
* Worked correctly in the following scenarios:
* `Wasm -> Native`: since we can explicitly load the caller and callee `VMContext`, because we're
calling a native import.
* `(Native, Array) -> Wasm`: because the native signatures define a tuple of `VMContext` as arguments.
* It didn't work in the following scenario:
* `Wasm->Wasm`: When calling imports from another WebAssembly instance (via
direct call or `call_indirect`. The previous implementation wrongly assumes
that there should be a trampoline in this case, but there isn't. The code
was generated by the same compiler, so the same ABI should be used in
both functions, but it doesn't.
This change introduces the following changes, which fix the previous assumptions and bugs:
* All internal functions declare a two extra pointer-sized parameters, which will hold the callee and caller `VMContext`s
* Use a pinned register that will be considered live through the lifetime of the function instead of pinning it at the trampoline level. The pinning explicitlly happens when entering the function body and no other assumptions are made from there on.
* Introduce the concept of special `ContextArgs` for function calls. This enum holds metadata about which context arguments are needed depending on the callee. The previous implementation of introducing register values at arbitrary locations in the value stack conflicts with the stack ordering principle which states that older values must *always* precede newer values. So we can't insert a register, because if a spill happens the order of the values will be wrong.
Finally, given that this change also enables the `imports.wast` test suite, it also includes a fix to `global.{get, set}` instructions which didn't account entirely for imported globals.
Resolved conflicts
Update Winch filetests
* Fix typos
* Use `get_wasm_local` and `get_frame_local` instead of `get_local` and `get_local_unchecked`
* Introduce `MAX_CONTEXT_ARGS` and use it in the trampoline to skip context arguments.
* winch: Add support for WebAssembly loads/stores
Closes https://github.com/bytecodealliance/wasmtime/issues/6529
This patch adds support for all the instructions involving WebAssembly
loads and stores for 32-bit memories. Given that the `memory64` proposal
is not enabled by default, this patch doesn't include an
implementation/tests for it; in theory minimal tweaks to the
currrent implementation will be needed in order to support 64-bit
memories.
Implemenation-wise, this change, follows a similar pattern as Cranelift
in order to calculate addresses for dynamic/static heaps, the main
difference being that in some cases, doing less work at compile time is
preferred; the current implemenation only checks for the general case of
out-of-bounds access for dynamic heaps for example.
Another important detail regarding the implementation, is the
introduction of `MacroAssembler::wasm_load` and
`MacroAssembler::wasm_store`, which internally use a common
implemenation for loads and stores, with the only difference that the
`wasm_*` variants set the right flags in order to signal that these
operations are not trusted and might trap.
Finally, given that this change introduces support for the last set of
instructions missing for a Wasm MVP, it removes most of Winch's copy of
the spectest suite, and switches over to using the official test suite
where possible (for tests that don't use SIMD or Reference Types).
Follow-up items:
* Before doing any deep benchmarking I'm planning on landing a couple of
improvements regarding compile times that I've identified in parallel
to this change.
* The `imports.wast` tests are disabled because I've identified a bug
with `call_indirect`, which is not related to this change and exists
in main.
* Find a way to run the `tests/all/memory.rs` (or perhaps most of
integration tests) with Winch.
--
prtest:full
* Review comments
* Enable all winch tests on windows
prtest:mingw-x64
* Plumb through x64 unwind info creation
* Add the frame regs unwind info
* Emit UnwindInfo::SaveReg instructions
* Review feedback
* Comment the offset_downward_to_clobbers value
* Add stack overflow tests
* Add stack overflow tests for indirect calls
* Check for stack overflow on function entry
* Ignore the call tests on windows, as stack overflows trap
* Bless the winch filetests
* winch: Add memory instructions
This commit adds support for the following memory instructions to winch:
* `data.drop`
* `memory.init`
* `memory.fill`
* `memory.copy`
* `memory.size`
* `memory.grow`
In general the implementation is similar to what other instructions via
builtins are hanlded (e.g. table instructions), which involve stack
manipulation prior to emitting a builtin function call, with the
exception of `memory.size`, which involves loading the current length
from the `VMContext`
* Emit right shift instead of division to obtain the memory size in pages
This commit adds some more information to `wasmtime --version` which
includes the git commit plus the git commit's date. This matches `rustc
-V` for example which was additionally copied to `wasm-tools` and
mirrored as `wasm-tools -V`.
Personally I've found this useful since it can help point to exact
commits and additionally quickly get a sense of how old a version is
based on its commit date presented.
* winch(x64): Add support for table instructions
This change adds support for the following table insructions:
`elem.drop`, `table.copy`, `table.set`, `table.get`, `table.fill`,
`table.grow`, `table.size`, `table.init`.
This change also introduces partial support for the `Ref` WebAssembly
type, more conretely the `Func` heap type, which means that all the
table instructions above, only work this WebAssembly type as of this
change.
Finally, this change is also a small follow up to the primitives
introduced in https://github.com/bytecodealliance/wasmtime/pull/7100,
more concretely:
* `FnCall::with_lib`: tracks the presence of a libcall and ensures that
any result registers are freed right when the call is emitted.
* `MacroAssembler::table_elem_addr` returns an address rather than the
value of the address, making it convenient for other use cases like
`table.set`.
--
prtest:full
* chore: Make stack functions take impl IntoIterator<..>
* Update winch/codegen/src/codegen/call.rs
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* Remove a dangling `dbg!`
* Add comment on branching
---------
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* winch(x64): Call indirect
This change adds support for the `call_indirect` instruction to Winch.
Libcalls are a pre-requisite for supporting `call_indirect` in order to
lazily initialy funcrefs. This change adds support for libcalls to
Winch by introducing a `BuiltinFunctions` struct similar to Cranelift's
`BuiltinFunctionSignatures` struct.
In general, libcalls are handled like any other function call, with the
only difference that given that not all the information to fulfill the
function call might be known up-front, control is given to the caller
for finalizing the call.
The introduction of function references also involves dealing with
pointer-sized loads and stores, so this change also adds the required
functionality to `FuncEnv` and `MacroAssembler` to be pointer aware,
making it straight forward to derive an `OperandSize` or `WasmType` from
the target's pointer size.
Finally, given the complexity of the call_indirect instrunction, this
change bundles an improvement to the register allocator, allowing it to
track the allocatable vs non-allocatable registers, this is done to
avoid any mistakes when allocating/de-allocating registers that are not
alloctable.
--
prtest:full
* Address review comments
* Fix typos
* Better documentation for `new_unchecked`
* Introduce `max` for `BitSet`
* Make allocatable property `u64`
* winch(calls): Overhaul `FnCall`
This commit simplifies `FnCall`'s interface making its usage more
uniform throughout the compiler. In summary, this change:
* Avoids side effects in the `FnCall::new` constructor, and also makes
it the only constructor.
* Exposes `FnCall::save_live_registers` and
`FnCall::calculate_call_stack_space` to calculate the stack space
consumed by the call and so that the caller can decide which one to
use at callsites depending on their use-case.
* tests: Fix regset tests
Due to a bug in the qemu emulation of the corresponding instruction,
tests for pseudo min/max support had been disabled. After moving
to qemu 8.0.4, where the bug is fixed, these can be re-enabled.
This commit optimizes the use of CI resources by avoiding unnecessary
duplication of WebAssembly spec tests when using the Cranelift compiler
strategy. Previously, Cranelift was tested against both the official spec test
suite and Winch's test suite, the latter being a subset of the former. This
commit eliminates this redundancy.
* Wasmtime: Add support for Wasm tail calls
This adds the `Config::wasm_tail_call` method and `--wasm-features tail-call`
CLI flag to enable the Wasm tail calls proposal in Wasmtime.
This PR is mostly just plumbing and enabling tests, since all the prerequisite
work (Wasmtime trampoline overhauls and Cranelift tail calls) was completed in
earlier pull requests.
When Wasm tail calls are enabled, Wasm code uses the `tail` calling
convention. The `tail` calling convention is known to cause a 1-7% slow down for
regular code that isn't using tail calls, which is why it isn't used
unconditionally. This involved shepherding `Tunables` through to Wasm signature
construction methods. The eventual plan is for the `tail` calling convention to
be used unconditionally, but not until the performance regression is
addressed. This work is tracked in
https://github.com/bytecodealliance/wasmtime/issues/6759
Additionally while our x86-64, aarch64, and riscv64 backends support tail calls,
the s390x backend does not support them yet. Attempts to use tail calls on s390x
will return errors. Support for s390x is tracked in
https://github.com/bytecodealliance/wasmtime/issues/6530
* Store `Tunables` inside the `Compiler`
Instead of passing as an argument to every `Compiler` method.
* Cranelift: Support "direct" return calls on riscv64
They still use `jalr` instead of `jal` but this allows us to use the `RiscvCall`
reloc, which Wasmtime handles. Before we were using `LoadExternalName` which
produces an `Abs8` reloc, which Wasmtime intentionally does not handle since
that involves patching code at runtime, which makes loading code slower.
* Fix tests that assume tail call support on s390x
These are implemented as a combination of two steps, mask generation and
mask expansion. Our comparision rules only return their results as a mask
register, so we need to expand the mask into lane sized elements.
We have 20 (!) comparision instructions, nearly the full table of all IntCC codes
in VV, VX and VI formats. However there are some holes in this table.
They are:
* `vmsltu.vi`
* `vmslt.vi`
* `vmsgtu.vv`
* `vmsgt.vv`
* `vmsgeu.*`
* `vmsge.*`
Most of these can be replaces with the inverted IntCC instruction, however
this commit only implements the existing instructions without any inversion
and the inverted VV versions of `sgtu`/`sgt`/`sgeu`/`sge` since we need them
to get the full icmp functionality.
I've split the actual mask expansion into it's own separate rule since we are
going to need it for the `fcmp` rules as well.
The instruction selection for `icmp` is on a separate rule simply because the
rulse end up less verbose than if they were inlined directly into the `icmp` rule.
* riscv64: Add SIMD Load+Extends
* riscv64: Add SIMD `{u,s}widen_{low,high}`
* riscv64: Add `gen_slidedown_half`
This isn't really necessary yet, but we are going to make a lot of use for it
in the widening arithmetic instructions, so might as well add it now.
* riscv64: Add multi widen SIMD instructions
* riscv64: Typo Fix
* Make wasmtime-types type check
* Make wasmtime-environ type check.
* Make wasmtime-runtime type check
* Make cranelift-wasm type check
* Make wasmtime-cranelift type check
* Make wasmtime type check
* Make wasmtime-wast type check
* Make testsuite compile
* Address Luna's comments
* Restore compatibility with effect-handlers/wasm-tools#func-ref-2
* Add function refs feature flag; support testing
* Provide function references support in helpers
- Always support Index in blocktypes
- Support Index as table type by pretending to be Func
- Etc
* Implement ref.as_non_null
* Add br_on_null
* Update Cargo.lock to use wasm-tools with peek
This will ultimately be reverted when we refer to
wasm-tools#function-references, which doesn't have peek, but does have type
annotations on CallRef
* Add call_ref
* Support typed function references in ref.null
* Implement br_on_non_null
* Remove extraneous flag; default func refs false
* Use IndirectCallToNull trap code for call_ref
* Factor common call_indirect / call_ref into a fn
* Remove copypasta clippy attribute / format
* Add a some more tests for typed table instructions
There certainly need to be many more, but this at least catches the bugs fixed
in the next commit
* Fix missing typed cases for table_grow, table_fill
* Document trap code; remove answered question
* Mark wasm-tools to wasmtime reftype infallible
* Fix reversed conditional
* Scope externref/funcref shorthands within WasmRefType
* Merge with upstream
* Make wasmtime compile again
* Fix warnings
* Remove Bot from the type algebra
* Fix table tests.
`wast::Cranelift::spec::function_references::table`
`wast::Cranelift::spec::function_references::table_pooling`
* Fix table{get,set} tests.
```
wast::Cranelift::misc::function_references::table_get
wast::Cranelift::misc::function_references::table_get_pooling
wast::Cranelift::misc::function_references::table_set
wast::Cranelift::misc::function_references::table_set_pooling
```
* Insert subtype check to fix local_get tests.
```
wast::Cranelift::spec::function_references::local_get
wast::Cranelift::spec::function_references::local_get_pooling
```
* Fix compilation of `br_on_non_null`.
The branch destinations were the other way round... :-)
Fixes the following test failures:
```
wast::Cranelift::spec::function_references::br_on_non_null
wast::Cranelift::spec::function_references::br_on_non_null_pooling
```
* Fix ref_as_non_null tests.
The test was failing due to the wrong error message being printed. As
per upstream folks' suggest we were using the trap code
`IndirectCallToNull`, but it produces an unexpected error message.
This commit reinstates the `NullReference` trap code. It produces the
expected error message. We will have to chat with the maintainers
upstream about how to handle these "test failures".
Fixes the following test failures:
```
wast::Cranelift::spec::function_references::ref_as_non_null
wast::Cranelift::spec::function_references::ref_as_non_null_pooling
```
* Fix a call_ref regression.
* Fix global tests.
Extend `is_matching_assert_invalid_error_message` to circumvent the textual error message failure.
Fixes the following test failures:
```
wast::Cranelift::spec::function_references::global
wast::Cranelift::spec::function_references::global_pooling
```
* Cargo update
* Update
* Spell out some cases in match_val
* Disgusting hack to subvert limitations of type reconstruction.
In the function `wasmtime::values::Val::ty()` attempts to reconstruct
the type of its underlying value purely based on the shape of the
value. With function references proposal this sort of reconstruction
is no longer complete as a source reference type may have been
nullable. Nullability is not inferrable by looking at the shape of the
runtime object alone.
Consequently, the runtime cannot reconstruct the type for
`Val::FuncRef` and `Val::ExternRef` by looking at their respective
shapes.
* Address workflows comments.
* null reference => null_reference for CLIF parsing compliance.
* Delete duplicate-loads-dynamic-memory-egraph (again)
* Idiomatic code change.
* Nullability subtyping + fix non-null storage check.
This commit removes the `hacky_eq` check in `func.rs`. Instead it is
replaced by a subtype check. This subtype check occurs in
`externals.rs` too.
This commit also fixes a bug. Previously, it was possible to store a
null reference into a non-null table cell. I have added to new test
cases for this bug: one for funcrefs and another for externrefs.
* Trigger unimplemented for typed function references. Format values.rs
* run cargo fmt
* Explicitly match on HeapType::Extern.
* Address cranelift-related feedback
* Remove PartialEq,Eq from ValType, RefType, HeapType.
* Pin wasmparser to a fairly recent commit.
* Run cargo fmt
* Ignore tail call tests.
* Remove garbage
* Revert changes to wasmtime public API.
* Run cargo fmt
* Get more CI passing (#19)
* Undo Cargo.lock changes
* Fix build of cranelift tests
* Implement link-time matches relation. Disable tests failing due to lack of public API support.
* Run cargo fmt
* Run cargo fmt
* Initial implementation of eager table initialization
* Tidy up eager table initialisation
* Cargo fmt
* Ignore type-equivalence test
* Replace TODOs with descriptive comments.
* Various changes found during review (#21)
* Clarify a comment
This isn't only used for null references
* Resolve a TODO in local init
Don't initialize non-nullable locals to null, instead skip
initialization entirely and wasm validation will ensure it's always
initialized in the scope where it's used.
* Clarify a comment and skipping the null check.
* Remove a stray comment
* Change representation of `WasmHeapType`
Use a `SignatureIndex` instead of a `u32` which while not 100% correct
should be more correct. This additionally renames the `Index` variant to
`TypedFunc` to leave space for future types which aren't functions to
not all go into an `Index` variant.
This required updates to Winch because `wasmtime_environ` types can no
longer be converted back to their `wasmparser` equivalents. Additionally
this means that all type translation needs to go through some form of
context to resolve indices which is now encapsulated in a `TypeConvert`
trait implemented in various locations.
* Refactor table initialization
Reduce some duplication and simplify some data structures to have a more
direct form of table initialization and a bit more graceful handling of
element-initialized tables. Additionally element-initialize tables are
now treated the same as if there's a large element segment initializing
them.
* Clean up some unrelated chagnes
* Simplify Table bindings slightly
* Remove a no-longer-needed TODO
* Add a FIXME for `SignatureIndex` in `WasmHeapType`
* Add a FIXME for panicking on exposing function-references types
* Fix a warning on nightly
* Fix tests for winch and cranelift
* Cargo fmt
* Fix arity mismatch in aarch64/abi
---------
Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk>
Co-authored-by: Daniel Hillerström <daniel.hillerstrom@huawei.com>
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
* riscv64: Support vector instruction masking
* riscv64: Add `vmerge` instructions
* riscv64: Implement `insertlane`
* riscv64: Fix encoding of `vmv` instructions
Some of these carry their source in vs2
* riscv64: Fix formatting of mask register
Remove the space between , and the register. This
is inline with the rest of our formatting.
* riscv64: Restrict `insertlane` to vector types that fit in a single register
* wasmtime: Enable more RISC-V SIMD tests
* riscv64: Use inline format syntax for printing vector instructions
* riscv64: Add vector mask note
* delete adapter src/main.o: this was accidentally left out of #165
* move adapter, byte-array, and verify to a new workspace
* rename byte-array crate to a name available on crates.io
* add a readme for verify, also give it a slightly better name
* CI: wit dep check in its own step, verify before publish, trim down publication
* reactor-tests: delete deps symlinks
* reactor-tests: manage wit with wit-deps
* test: dont set default toolchain to nightly
* wit-deps lock adapter
* wit-deps lock reactor-tests
wit-deps doesnt manage these for some reason
Changing LLVM and/or Rust to avoid special handling of `main` is a fair
amount of work, and there could be other toolchains with similar special
rules for functions named `main`, so rename the command entrypoint back
to `run`.
We could potentially re-evaluate this in the future, such as in a
preview3 timeframe, but for now, let's go with the simplest thing that
works.
* Run some tests in MIRI on CI
This commit is an implementation of getting at least chunks of Wasmtime
to run in MIRI on CI. The full test suite is not possible to run in MIRI
because MIRI cannot run Cranelift-produced code at runtime (aka it
doesn't support JITs). Running MIRI, however, is still quite valuable if
we can manage it because it would have trivially detected
GHSA-ch89-5g45-qwc7, our most recent security advisory. The goal of this
PR is to select a subset of the test suite to execute in CI under MIRI
and boost our confidence in the copious amount of `unsafe` code in
Wasmtime's runtime.
Under MIRI's default settings, which is to use the [Stacked
Borrows][stacked] model, much of the code in `Instance` and `VMContext`
is considered invalid. Under the optional [Tree Borrows][tree] model,
however, this same code is accepted. After some [extremely helpful
discussion][discuss] on the Rust Zulip my current conclusion is that
what we're doing is not fundamentally un-sound but we need to model it
in a different way. This PR, however, uses the Tree Borrows model for
MIRI to get something onto CI sooner rather than later, and I hope to
follow this up with something that passed Stacked Borrows. Additionally
that'll hopefully make this diff smaller and easier to digest.
Given all that, the end result of this PR is to get 131 separate unit
tests executing on CI. These unit tests largely exercise the embedding
API where wasm function compilation is not involved. Some tests compile
wasm functions but don't run them, but compiling wasm through Cranelift
in MIRI is so slow that it doesn't seem worth it at this time. This does
mean that there's a pretty big hole in MIRI's test coverage, but that's
to be expected as we're a JIT compiler after all.
To get tests working in MIRI this PR uses a number of strategies:
* When platform-specific code is involved there's now `#[cfg(miri)]` for
MIRI's version. For example there's a custom-built "mmap" just for
MIRI now. Many of these are simple noops, some are `unimplemented!()`
as they shouldn't be reached, and some are slightly nontrivial
implementations such as mmaps and trap handling (for native-to-native
function calls).
* Many test modules are simply excluded via `#![cfg(not(miri))]` at the
top of the file. This excludes the entire module's worth of tests from
MIRI. Other modules have `#[cfg_attr(miri, ignore)]` annotations to
ignore tests by default on MIRI. The latter form is used in modules
where some tests work and some don't. This means that all future test
additions will need to be effectively annotated whether they work in
MIRI or not. My hope though is that there's enough precedent in the
test suite of what to do to not cause too much burden.
* A number of locations are fixed with respect to MIRI's analysis. For
example `ComponentInstance`, the component equivalent of
`wasmtime_runtime::Instance`, was actually left out from the fix for
the CVE by accident. MIRI dutifully highlighted the issues here and
I've fixed them locally. Some locations fixed for MIRI are changed to
something that looks similar but is subtly different. For example
retaining items in a `Store<T>` is now done with a Wasmtime-specific
`StoreBox<T>` type. This is because, with MIRI's analyses, moving a
`Box<T>` invalidates all pointers derived from this `Box<T>`. We don't
want these semantics, so we effectively have a custom `Box<T>` to suit
our needs in this regard.
* Some default configuration is different under MIRI. For example most
linear memories are dynamic with no guards and no space reserved for
growth. Settings such as parallel compilation are disabled. These are
applied to make MIRI "work by default" in more places ideally. Some
tests which perform N iterations of something perform fewer iterations
on MIRI to not take quite so long.
This PR is not intended to be a one-and-done-we-never-look-at-it-again
kind of thing. Instead this is intended to lay the groundwork to
continuously run MIRI in CI to catch any soundness issues. This feels,
to me, overdue given the amount of `unsafe` code inside of Wasmtime. My
hope is that over time we can figure out how to run Wasm in MIRI but
that may take quite some time. Regardless this will be adding nontrivial
maintenance work to contributors to Wasmtime. MIRI will be run on CI for
merges, MIRI will have test failures when everything else passes,
MIRI's errors will need to be deciphered by those who have probably
never run MIRI before, things like that. Despite all this to me it seems
worth the cost at this time. Just getting this running caught two
possible soundness bugs in the component implementation that could have
had a real-world impact in the future!
[stacked]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md
[tree]: https://perso.crans.org/vanille/treebor/
[discuss]: https://rust-lang.zulipchat.com/#narrow/stream/269128-miri/topic/Tree.20vs.20Stacked.20Borrows.20.26.20a.20debugging.20question
* Update alignment comment
LLVM has special handling for functions named `main`, which we need to
avoid because we're creating the component-level `main` rather than the
C-level `main`. To do this, write the `main` function in assembly, which
is fortunately very simple now.
* Update the spec test suite submodule
Delete the local copies of the relaxed-simd test suite as well as
they're now incorporated.
Closes#5914
* Remove page guards in QEMU emulation
Otherwise `(memory 0 0)` was being compiled as a static memory with huge
guards which we're trying to avoid in QEMU.
* Initial support for the Relaxed SIMD proposal
This commit adds initial scaffolding and support for the Relaxed SIMD
proposal for WebAssembly. Codegen support is supported on the x64 and
AArch64 backends on this time.
The purpose of this commit is to get all the boilerplate out of the way
in terms of plumbing through a new feature, adding tests, etc. The tests
are copied from the upstream repository at this time while the
WebAssembly/testsuite repository hasn't been updated.
A summary of changes made in this commit are:
* Lowerings for all relaxed simd opcodes have been added, currently all
exhibiting deterministic behavior. This means that few lowerings are
optimal on the x86 backend, but on the AArch64 backend, for example,
all lowerings should be optimal.
* Support is added to codegen to, eventually, conditionally generate
different code based on input codegen flags. This is intended to
enable codegen to more efficient instructions on x86 by default, for
example, while still allowing embedders to force
architecture-independent semantics and behavior. One good example of
this is the `f32x4.relaxed_fmadd` instruction which when deterministic
forces the `fma` instruction, but otherwise if the backend doesn't
have support for `fma` then intermediate operations are performed
instead.
* Lowerings of `iadd_pairwise` for `i16x8` and `i32x4` were added to the
x86 backend as they're now exercised by the deterministic lowerings of
relaxed simd instructions.
* Sample codegen tests for added for x86 and aarch64 for some relaxed
simd instructions.
* Wasmtime embedder support for the relaxed-simd proposal and forcing
determinism have been added to `Config` and the CLI.
* Support has been added to the `*.wast` runtime execution for the
`(either ...)` matcher used in the relaxed-simd proposal.
* Tests for relaxed-simd are run both with a default `Engine` as well as
a "force deterministic" `Engine` to test both configurations.
* All tests from the upstream repository were copied into Wasmtime.
These tests should be deleted when WebAssembly/testsuite is updated.
* x64: Add x86-specific lowerings for relaxed simd
This commit builds on the prior commit and adds an array of `x86_*`
instructions to Cranelift which have semantics that match their
corresponding x86 equivalents. Translation for relaxed simd is then
additionally updated to conditionally generate different CLIF for
relaxed simd instructions depending on whether the target is x86 or not.
This means that for AArch64 no changes are made but for x86 most relaxed
instructions now lower to some x86-equivalent with slightly different
semantics than the "deterministic" lowering.
* Add libcall support for fma to Wasmtime
This will be required to implement the `f32x4.relaxed_madd` instruction
(and others) when an x86 host doesn't specify the `has_fma` feature.
* Ignore relaxed-simd tests on s390x and riscv64
* Enable relaxed-simd tests on s390x
* Update cranelift/codegen/meta/src/shared/instructions.rs
Co-authored-by: Andrew Brown <andrew.brown@intel.com>
* Add a FIXME from review
* Add notes about deterministic semantics
* Don't default `has_native_fma` to `true`
* Review comments and rebase fixes
---------
Co-authored-by: Andrew Brown <andrew.brown@intel.com>