The toml file specifies version `0.4.1` instead of `0.4.2`.
Using version `0.4.1` produces a compile error:
```
error[E0432]: unresolved import `mach2::ndr`
  --> external/crate_index__wasmtime-runtime-20.0.2/src/sys/unix/machports.rs:44:12
   |
44 | use mach2::ndr::*;
   |            ^^^ could not find `ndr` in `mach2`
```
That's because `ndr` was added in version `0.4.2`.
Note that the lock file specifies version `0.4.2` which explains
why this error doesn't happen normally.
This introduces a `DecommitQueue` for batching decommits together in the pooling
allocator:
* Deallocating a memory/table/stack enqueues its associated regions of memory
for decommit; it no longer immediately returns the associated slot to the
pool's free list. If the queue's length has reached the configured batch size,
then we flush the queue by running all the decommits, and finally return
the memory/table/stack slots to their respective pools and free lists.
* Additionally, if allocating a new memory/table/stack fails because the free
list is empty (aka we've reached the max concurrently-allocated limit for this
entity) then we fall back to a slow path before propagating the error. This
slow path flushes the decommit queue and then retries allocation, hoping that
the queue flush reclaimed slots and made them available for this fallback
allocation attempt. This involved defining a new `PoolConcurrencyLimitError`
to match on, which is also exposed in the public embedder API.
It is also worth noting that we *always* use this new decommit queue now. To
keep the existing behavior, where e.g. a memory's decommits happen immediately
on deallocation, you can use a batch size of one. This effectively disables
queueing, forcing all decommits to be flushed immediately.
The default decommit batch size is one.
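The queueing and slow-path behavior described above can be sketched roughly as follows. This is a minimal illustration, not Wasmtime's implementation: the `Pool` type, its fields, and the `flushes` counter are hypothetical, and `PoolConcurrencyLimitError` here is only a stand-in that borrows the name of the real error.

```rust
/// Hypothetical stand-in for the batching queue; it only tracks slot indices.
#[derive(Debug, Default)]
pub struct DecommitQueue {
    /// Slots whose memory still has to be decommitted.
    pending: Vec<usize>,
}

/// Stand-in for the error returned when the concurrency limit is reached.
#[derive(Debug, PartialEq, Eq)]
pub struct PoolConcurrencyLimitError;

/// A toy pool (assumption): a free list of slot indices plus a decommit queue.
#[derive(Debug)]
pub struct Pool {
    free_list: Vec<usize>,
    queue: DecommitQueue,
    batch_size: usize,
    /// Number of non-empty flushes, standing in for batched decommit work.
    pub flushes: usize,
}

impl Pool {
    pub fn new(slots: usize, batch_size: usize) -> Self {
        Pool {
            free_list: (0..slots).rev().collect(),
            queue: DecommitQueue::default(),
            // A batch size of one flushes on every deallocation.
            batch_size: batch_size.max(1),
            flushes: 0,
        }
    }

    /// Run all queued decommits, then return their slots to the free list.
    fn flush(&mut self) {
        if self.queue.pending.is_empty() {
            return;
        }
        self.flushes += 1;
        // A real implementation would decommit each queued region here.
        self.free_list.append(&mut self.queue.pending);
    }

    /// Enqueue the slot for decommit; flush once the batch is full.
    pub fn deallocate(&mut self, slot: usize) {
        self.queue.pending.push(slot);
        if self.queue.pending.len() >= self.batch_size {
            self.flush();
        }
    }

    /// Fast path: pop from the free list. Slow path: flush, then retry once.
    pub fn allocate(&mut self) -> Result<usize, PoolConcurrencyLimitError> {
        if let Some(slot) = self.free_list.pop() {
            return Ok(slot);
        }
        self.flush();
        self.free_list.pop().ok_or(PoolConcurrencyLimitError)
    }
}
```

With a batch size of one, every `deallocate` call flushes immediately, which matches the "queueing effectively disabled" behavior described above.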
This commit, with batch size of one, consistently gives me an increase on
`wasmtime serve`'s requests-per-second versus its parent commit, as measured by
`benches/wasmtime-serve-rps.sh`. I get ~39K RPS on this commit compared to ~35K
RPS on the parent commit. This is quite puzzling to me. I was expecting no
change, and hoping there wouldn't be a regression. I was not expecting a speed
up. I cannot explain this result at this time.
prtest:full
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
* Change `MemoryStyle::Static` to store bytes, not pages
This commit is inspired by me looking at some configuration in the
pooling allocator and noticing that configuration of wasm pages vs bytes
of linear memory is somewhat inconsistent in `Config`. In the end I'd
like to remove or update the `memory_pages` configuration in the pooling
allocator to being bytes of linear memory instead to be more consistent
with `Config` (and additionally anticipate the custom-page-sizes
wasm proposal where terms-of-pages will become ambiguous). The first
step in this change is to update one of the lowest layered usages of
pages, the `MemoryStyle::Static` configuration.
Note that this is not a trivial conversion because the purpose of
carrying around pages instead of bytes is that bytes may overflow where
overflow-with-pages typically happens during validation. This means that
extra care is taken to handle errors related to overflow to ensure that
everything is still reported at the same time.
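To illustrate the overflow concern, here is a small sketch of a checked pages-to-bytes conversion. The enum below is a simplified stand-in for `MemoryStyle`, and the field name `byte_reservation`, the helper names, and the policy of treating overflow as "too large for a static memory" are all assumptions made for the example.

```rust
/// Size of a default WebAssembly page in bytes (64 KiB).
const WASM_PAGE_SIZE: u64 = 0x10000;

/// Simplified stand-in for `MemoryStyle`; the field name is an assumption.
#[derive(Debug, PartialEq, Eq)]
pub enum MemoryStyle {
    /// Reservation expressed in bytes rather than pages.
    Static { byte_reservation: u64 },
    Dynamic,
}

/// Convert a page count to bytes, returning `None` if the result overflows.
pub fn pages_to_bytes(pages: u64) -> Option<u64> {
    pages.checked_mul(WASM_PAGE_SIZE)
}

/// Hypothetical policy: use a static memory only when the maximum size in
/// bytes is representable and fits within the static limit.
pub fn choose_style(max_pages: u64, static_limit_bytes: u64) -> MemoryStyle {
    match pages_to_bytes(max_pages) {
        Some(bytes) if bytes <= static_limit_bytes => MemoryStyle::Static {
            byte_reservation: static_limit_bytes,
        },
        // Overflow is treated the same as "does not fit" in this sketch.
        _ => MemoryStyle::Dynamic,
    }
}
```

The point of the sketch is only that the multiplication must be checked wherever pages become bytes; a page count that was valid during validation can still overflow `u64` once multiplied.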
* Update crates/wasmtime/src/runtime/vm/instance/allocator/pooling/memory_pool.rs
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Fix tests
* Really fix tests
---------
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
This is the final type system change for Wasm GC: the ability to explicitly
declare supertypes and finality. A final type may not be a supertype of another
type. A concrete heap type matches another concrete heap type if its concrete
type is a subtype (potentially transitively) of the other heap type's concrete
type.
Next, I'll begin support for allocating GC structs and arrays at runtime.
I've also implemented `O(1)` subtype checking in the types registry:
In a type system with single inheritance, the subtyping relationships between
all types form a set of trees. The root of each tree is a type that has no
supertype; each node's immediate children are the types that directly subtype
that node.
For example, consider these types:
    class Base {}
    class A subtypes Base {}
    class B subtypes Base {}
    class C subtypes A {}
    class D subtypes A {}
    class E subtypes C {}
These types produce the following tree:
             Base
            /    \
           A      B
          / \
         C   D
        /
       E
Note the following properties:
1. If `sub` is a subtype of `sup` (either directly or transitively) then
`sup` *must* be on the path from `sub` up to the root of `sub`'s tree.
2. Additionally, `sup` *must* be the `i`th node down from the root in that path,
where `i` is the length of the path from `sup` to its tree's root.
Therefore, if we maintain a vector containing the path to the root for each
type, then we can simply check if `sup` is at index `supertypes(sup).len()`
within `supertypes(sub)`.
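The check can be sketched as below. This is a minimal illustration under assumptions: the `TypeRegistry` type here, its `register` method, and the use of plain `usize` indices are hypothetical and much simpler than the real types registry.

```rust
/// Hypothetical, simplified registry: types are identified by their index.
#[derive(Default)]
pub struct TypeRegistry {
    /// For each type, its chain of supertypes ordered root-first,
    /// excluding the type itself.
    supertypes: Vec<Vec<usize>>,
}

impl TypeRegistry {
    /// Register a new type with an optional direct supertype and return its
    /// index. The parent must have been registered already.
    pub fn register(&mut self, parent: Option<usize>) -> usize {
        let chain = match parent {
            Some(p) => {
                // Path to the root = parent's path followed by the parent.
                let mut chain = self.supertypes[p].clone();
                chain.push(p);
                chain
            }
            None => Vec::new(),
        };
        self.supertypes.push(chain);
        self.supertypes.len() - 1
    }

    /// Constant-time subtype check: `sup` must sit at index
    /// `supertypes(sup).len()` within `supertypes(sub)`.
    pub fn is_subtype(&self, sub: usize, sup: usize) -> bool {
        if sub == sup {
            return true;
        }
        let depth = self.supertypes[sup].len();
        self.supertypes[sub].get(depth) == Some(&sup)
    }
}
```

Registering `Base`, `A`, `B`, `C`, `D`, `E` in that order with the parents from the example gives `E` the chain `[Base, A, C]`, so checking `E` against `A` is a single lookup at index 1. The cost is one vector per type, with length equal to the type's depth in its tree.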
* Remove unused generated `add_root_to_linker` method
* WIP: bindgen GetHost
* Compile with Rust 1.78+
Use <https://users.rust-lang.org/t/generic-closure-returns-that-can-capture-arguments/76513/3>
as a guide of how to implement this by making the `GetHost` trait a bit
uglier.
* Add an option to skip `&mut T -> T` impls
Also enable this for WASI crates since they do their own thing with
`WasiView` for now. A future refactoring should be able to remove this
option entirely and switch wasi crates to a new design of `WasiView`.
* Update test expectations
* Review comments
* Undo temporary change
* Handle some TODOs
* Remove no-longer-relevant note
* Fix wasmtime-wasi-http doc link
---------
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
The egraph pass and the dead-code elimination pass both remove
instructions whose results are unused. If the optimization level is
"none", neither pass runs, and if it's anything else both passes run. I
don't think we should do this work twice.
Note that the DCE pass is different from the "eliminate unreachable
code" pass, which removes entire blocks that are unreachable from the
entry block. That pass might still be necessary.
* Move wast tests to their own test suite
This commit moves testing of `*.wast` files out of the `all` test suite
binary and into its own separate binary. The motivation for this is
well-described in #4861, with one of the chief reasons being that if the
test suite is run and then a new file is added, re-running the test suite
won't see the file.
The `libtest-mimic` crate provides an easy way of regaining most of the
features of the `libtest` harness such as parallel test execution and
filters, meaning that it's pretty easy to switch everything over. The
only slightly-tricky bit was redoing the filter for whether a test is
ignored or not, but most of the pieces were copied over from the
previous `build.rs` logic.
Closes #4861
* Fix the `all` suite
* Review comments
* Add Android release binaries to CI
This commit is inspired by #6480 and historical asks for Android
binaries. This does the bare minimum necessary to configure C compilers
such that we can produce binaries but I'll admit that I'm no Android
developer myself so I have no idea if these are actually suitable for
use anywhere. Otherwise though this build subsumes the preexisting check
in CI that the build works for Android, so that part is removed too.
This additionally changes how the NDK is managed from before. Previously
a GitHub Action was used to download Java and the NDK and additionally
used the `cargo ndk` subcommand. That's all removed now in favor of
configuring C compilers directly with a pre-installed version of the NDK
which should help reduce the CI dependencies a bit.
* Review comments
* List Android as tier 3 target
* riscv64: Improve pattern matching rules for FMA.
This commit reworks our FMA pattern matching to be slightly less verbose. It additionally adds the `(fma x (splat y) z)` pattern for vectors, which can be proven to be equivalent to the `(splat x)` form, `(fma (splat x) y z)`, because the two multiplicands commute.
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
* riscv64: Remove unused rule priority
---------
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
Since there are no calls left to AllocationConsumer's methods, this
commit just deletes all the arguments of this type, as well as the
definition of the type itself.
* Don't differentiate between decommitting table vs stack pages
All implementations are the same either way, and by treating them differently,
it makes it difficult to prototype integration with batching syscalls that need
to treat them homogeneously.
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
* Remove the `madvise_dontneed` function from the system virtual memory interface
This was already implemented by the existing `decommit_pages` function, we just
need to determine what the behavior is for the left over mapping. Specifically,
whether it is zeroed or restored to the original mapping (e.g. a CoW image).
This allows us to handle `decommit_pages` and what previously were calls to
`madvise_dontneed` homogeneously, which enables us to prototype new system calls
that batch decommits together and therefore must treat them all the same.
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
---------
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
In cargo-vet, the cross-organization trust model differs between
`[[wildcard-audits]]` and `[[trusted]]` entries. Previously Wasmtime/wasm-tools
crates were flagged as `[[wildcard-audits]]`, but when all crates switched to
being published through `wasmtime-publish`, `[[trusted]]` entries
were added at the recommendation of `cargo vet`. This means that other
organizations could no longer import our own audits, since `[[trusted]]`
entries aren't imported, only suggested.
This commit changes all these entries to `[[wildcard-audits]]` with an
explanation as to why.
* wasmtime(gc): Fix wasm-to-native trampoline lookup for subtyping
Previously, we would look up a wasm-to-native trampoline in the Wasm module
based on the host function's type. With Wasm GC and subtyping, this becomes
problematic because a Wasm module can import a function of type `T` but the host
can define a function of type `U` where `U <: T`. And if the Wasm has never
defined type `U` then it wouldn't have a trampoline for it. But our trampolines
don't actually care, they treat all reference values within the same type
hierarchy identically. So the trampoline for `T` would have worked in
practice. But once we find a trampoline for a function, we cache it and reuse it
every time that function is used in the same store again. Even if the function
is imported with its precise type somewhere else. So then we would have a
trampoline of the wrong type. But this happened to be okay in practice because
the trampolines happen not to inspect their arguments or do anything with them
other than forward them between calling convention locations. But relying on
that accidental invariant seems fragile and like a gun aimed at the future's
feet.
This commit makes that invariant non-accidental, centering it and hopefully
making it less fragile by doing so, by making every function type have an
associated "trampoline type". A trampoline type is the original function type
but where all the reference types in its params and results are replaced with
the nullable top versions, e.g. `(ref $my_struct)` is replaced with
`(ref null any)`. Often a function type is its own associated trampoline type, as is the
case for all functions that don't take or return any references, for
example. Then, all trampoline lookup begins by first getting the trampoline type
of the actual function type, or actual import type, and only afterwards
looking up the pre-compiled trampoline in the Wasm module.
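As a rough sketch of the "trampoline type" idea, the following erases every reference type in a signature to the nullable top type of its hierarchy. All of the types here (`Hierarchy`, `HeapType`, `ValType`, `FuncType`) are simplified, hypothetical stand-ins for illustration and do not mirror Wasmtime's internal definitions.

```rust
/// The three Wasm GC type hierarchies, named after their top types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hierarchy {
    Any,
    Func,
    Extern,
}

/// Simplified heap type: either a hierarchy's top type or a concrete type
/// that remembers which hierarchy it belongs to (an assumption of this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HeapType {
    Top(Hierarchy),
    Concrete { type_index: u32, hierarchy: Hierarchy },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValType {
    I32,
    I64,
    F32,
    F64,
    Ref { nullable: bool, heap: HeapType },
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FuncType {
    pub params: Vec<ValType>,
    pub results: Vec<ValType>,
}

/// Replace a reference type with the nullable top type of its hierarchy;
/// leave non-reference types untouched.
fn erase(ty: ValType) -> ValType {
    match ty {
        ValType::Ref { heap, .. } => {
            let hierarchy = match heap {
                HeapType::Top(h) => h,
                HeapType::Concrete { hierarchy, .. } => hierarchy,
            };
            ValType::Ref { nullable: true, heap: HeapType::Top(hierarchy) }
        }
        other => other,
    }
}

/// Compute the trampoline type used as the lookup key for a function type.
pub fn trampoline_type(ty: &FuncType) -> FuncType {
    FuncType {
        params: ty.params.iter().copied().map(erase).collect(),
        results: ty.results.iter().copied().map(erase).collect(),
    }
}
```

In this sketch a signature with no references maps to itself, and two signatures that differ only in how precise their reference types are map to the same key, so a cached trampoline is valid for both.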
Fixes https://github.com/bytecodealliance/wasmtime/issues/8432
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
* Fix no-std build
---------
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
The `next` and `next_writable` methods on `AllocationConsumer` are
identity functions now, so replace each call with its argument and then
clean up the resulting mess.
Most of this commit was generated with these commands:
- `git grep -lF allocs.next cranelift/codegen/src/isa/ |
xargs sed -i 's/allocs\.next\(_writable\)\?//'`
- `cargo fix -p cranelift-codegen --features all-arch --allow-dirty --allow-staged`
- `git diff --name-only HEAD | xargs sed -i '/let \([^ ]*\) = \1;/d'`
- `cargo fmt`
I used sed to delete `allocs.next` but left the following parentheses
alone (since matching balanced parentheses is not a regular language).
Then I used `cargo fix` to remove those parentheses and also add
underscores to newly-unused `AllocationConsumer` arguments. Next I used
sed again to delete trivial assignments like `let x = x`, leaving more
complicated cases as future work to clean up, and finally `cargo fmt` to
delete blank lines and braces that are no longer necessary.
Last, I deleted the newly-unused definitions of `next` and
`next_writable` themselves.
* Use WASI builder directly in C API
This commit updates the C API to use the `WasiCtxBuilder` directly
within `wasi_config_t` instead of buffering up the options separately.
This keeps the behavior of the Rust-based API more similar to the C API
and should also help resolve #8552 due to errors being returned more
eagerly in the builder-based API.
This additionally makes some minor modifications to the C APIs here as
appropriate.
Closes #8552
* Review comments
The `next_fixed_nonallocatable` method doesn't do anything any more and
doesn't return anything so calls to it can just be deleted.
The `with_allocs`, `allocate`, and `to_string_with_alloc` methods are
all trivial at this point, so inline them. The bulk of this change was
performed this way:
git grep -lF '.with_allocs(' | xargs sed -i 's/\.with_allocs([^)]*)/.clone()/g'
In a couple cases, this makes the `AllocationConsumer` unused at these
methods' call sites. Rather than changing function signatures in this
PR, just mark those arguments as deliberately unused.
The number of structures being cloned here is unfortunate, and
unnecessary now that we don't need to mutate any of them. But switching
to borrowing them is a bigger change than I want to include here.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=68735.
That fuzzbug bisected to the call-indirect caching changes, but this
turned out to be a red herring (the options added in that PR mean that
the fuzzbug config deserializes differently prior to the commit). In
any case, it's an easy fix -- it appears that V8 added a new error
message, so we need to add it to the allowlist of messages that we
expect for a table out-of-bounds condition.
This is extremely useful for cases where the default optimizations just
are not enough.
Background: [neovim](https://github.com/neovim/neovim) is interested in
adding wasmtime support in https://github.com/neovim/neovim/pull/28415, but
we noticed that including wasmtime, even when not using wasmtime
directly, heavily affects runtime performance. This is not only
reflected in the increased startup time but affects runtime
performance overall.
Here are the startup-time benchmarks for different configurations.
Note that all of runtime is affected, but startup time is a decent
proxy for measuring runtime performance:
```
No wasm
Time (mean ± σ): 50.5 ms ± 1.5 ms [User: 32.8 ms, System: 12.3 ms]
Range (min … max): 48.3 ms … 54.4 ms 56 runs
Wasm, lto=thin
Time (mean ± σ): 104.9 ms ± 3.5 ms [User: 86.5 ms, System: 12.7 ms]
Range (min … max): 99.5 ms … 111.1 ms 26 runs
Wasm, lto=true
Time (mean ± σ): 83.8 ms ± 2.5 ms [User: 65.8 ms, System: 12.1 ms]
Range (min … max): 80.5 ms … 93.3 ms 31 runs
Wasm, lto=true, strip="none", incremental=false, codegen-units=1, panic="abort"
Time (mean ± σ): 53.1 ms ± 1.0 ms [User: 35.5 ms, System: 12.5 ms]
Range (min … max): 50.6 ms … 55.5 ms 54 runs
```
* Bump Wasmtime's MSRV to 1.76.0
* Update Rust in CI to 1.78.0, the current stable
* Update nightly tests to the latest nightly
prtest:full
* Fix check-cfg with nightly
* More check-cfg fixes
* Remove an async cfg
This is no longer specified for the root crate.
* Move definition of Wasmtime's nightly into one place
Don't change a bunch of places when this is updated, try to update just
one single location instead.
* Update Wasmtime's tier stability documentation
Move some items between tiers and add a few misc items here and there.
* Update platform support documentation
Re-word lots of this since it was originally written, link to the tiers
of support page, and rewrite the section on `no_std`.
* Update the `min-platform` example with no_std
This commit updates the preexisting `min-platform` example to no longer
require Nightly Rust and instead use the `no_std` support now added to
Wasmtime. This involved:
* Change the build process to produce a staticlib which is then manually
converted via `cc` into a shared library for the native Linux platform.
* Compile the modules outside of the embedding and only `deserialize`
within the embedding.
* Update the `indexmap` dependency to pick up a bug fix required in
`no_std` mode (apparently, it fails on indexmap@2.0.0 and passes at
2.2.6, I didn't dig much further).
This commit additionally makes the `wasmtime-platform.h` header file
generated by the example a release artifact for Wasmtime itself. The
header itself is touched up a bit by configuring some more `cbindgen`
options as well.
* Fix clippy build
prtest:full
* Review comments
* Pass gc-sections to linking the library
* Always fall back to custom platform for Wasmtime
This commit updates Wasmtime's platform support to no longer require an
opt-in `RUSTFLAGS` `--cfg` flag to be specified. With `no_std` becoming
officially supported this should provide a better onboarding experience
where the fallback custom platform is used. This will cause linker
errors if the symbols aren't implemented and searching/googling should
lead back to our docs/repo (eventually, hopefully).
* Change Wasmtime's TLS state to a single pointer
This commit updates the management of TLS to rely on just a single
pointer rather than a pair of a pointer and a `bool`. Additionally
management of the TLS state is pushed into platform-specific modules to
enable different means of managing it, namely the "custom" platform now
has a C function required to implement TLS state for Wasmtime.
* Delay conversion to `Instant` in atomic intrinsics
The `Duration` type is available in `no_std` but the `Instant` type is
not. The intention is to only support the `threads` proposal if `std` is
active, but to assist with this split, push the `Duration` further into
Wasmtime to avoid using a type that can't be mentioned in `no_std`.
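A minimal sketch of this split, assuming hypothetical helper names: the lower layer only mentions `core::time::Duration`, and the conversion to a `std::time::Instant` deadline happens as late as possible in `std`-only code. The convention that a negative timeout means "wait forever" follows the `threads` proposal's atomic wait instructions.

```rust
use core::time::Duration;
use std::time::Instant;

/// `no_std`-friendly layer (hypothetical): turn the raw timeout in
/// nanoseconds into an optional `Duration`. A negative value means
/// "no timeout".
pub fn timeout_from_nanos(nanos: i64) -> Option<Duration> {
    u64::try_from(nanos).ok().map(Duration::from_nanos)
}

/// `std`-only layer (hypothetical): convert to an absolute deadline right
/// before blocking. Returns `None` when there is no timeout or when the
/// deadline is not representable.
pub fn deadline(timeout: Option<Duration>) -> Option<Instant> {
    timeout.and_then(|d| Instant::now().checked_add(d))
}
```

Only the second function needs `std`, so everything that passes the timeout along before that point can compile in `no_std` mode.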
* Gate more parts of Wasmtime on the `profiling` feature
Move `serde_json` to an optional dependency and gate the guest profiler
entirely on the `profiling` feature.
* Refactor conversion to `anyhow::Error` in `wasmtime-environ`
Have a dedicated trait for consuming `self` in addition to a
`Result`-friendly trait.
* Gate `gimli` in Wasmtime on `addr2line`
Cut down the dependency list if `addr2line` isn't enabled since then
the dependency is not used. While here additionally lift the version
requirement for `addr2line` up to the workspace level.
* Update `bindgen!` to have `no_std`-compatible output
Pull most types from Wasmtime's `__internal` module as the source of
truth.
* Use an `Option` for `gc_store` instead of `OnceCell`
No need for synchronization here when mutability is already available in
the necessary contexts.
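The idea can be sketched as follows, with hypothetical, heavily simplified `StoreOpaque` and `GcStore` types. Because the accessor takes `&mut self`, a plain `Option` is enough for lazy initialization and no synchronized cell is required.

```rust
/// Hypothetical stand-in for the GC heap state.
#[derive(Debug, Default)]
pub struct GcStore {
    pub allocations: usize,
}

/// Hypothetical stand-in for the store that owns the GC state.
#[derive(Debug, Default)]
pub struct StoreOpaque {
    /// Lazily created; `None` until the first use.
    gc_store: Option<GcStore>,
}

impl StoreOpaque {
    /// Create the GC store on first use and hand out a mutable reference.
    pub fn gc_store_mut(&mut self) -> &mut GcStore {
        self.gc_store.get_or_insert_with(GcStore::default)
    }

    /// Report whether the GC store has been created yet.
    pub fn has_gc_store(&self) -> bool {
        self.gc_store.is_some()
    }
}
```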
* Enable embedder-defined host feature detection
* Add `#![no_std]` support to the `wasmtime` crate
This commit enables compiling the `runtime`, `gc`, and `component-model`
features of the `wasmtime` crate on targets that do not have `std`. This
tags the crate as `#![no_std]` and then updates everything internally to
import from `core` or `alloc` and adapt for the various idioms. This
ended up requiring some relatively extensive changes, but nothing too
bad in the grand scheme of things.
* Require `std` for the perfmap profiling agent
prtest:full
* Fix build on wasm
* Fix windows build
* Remove unused import
* Fix Windows/Unix build without `std` feature
* Fix some doc links
* Remove unused import
* Fix build of wasi-common in isolation
* Fix no_std build on macos
* Re-fix build
* Fix standalone build of wasmtime-cli-flags
* Resolve a merge conflict
* Review comments
* Remove unused import
* riscv64: Split float rounding tests in separate files
* riscv64: Move Float Round Instructions to ISLE
This commit moves the lowerings for the various rounding instructions (ceil, floor, trunc, nearest) into ISLE.
The algorithm is mostly unchanged, but with some slight tweaks to generate fewer instructions.
* riscv64: Transform canonical NaNs into arithmetic NaNs for inexact values in round functions
* cranelift: Rewrite operands with allocations in-place
This turns all other code that uses AllocationConsumer into no-ops
because the list of allocations in all other places is now always
empty.
There is a bunch of dead code still left over after this. This part is
just the stuff I can delete without changing function signatures
everywhere. The various functions which are now no-ops are called in a
lot of places and cleaning those up will take more careful review.
* Make OperandVisitor implementable with a closure
Instead of extending AllocationConsumer to update operands in place,
introduce a new type and delete the rest of the internals of
AllocationConsumer.
This involves deleting the lifetime annotations from all uses of
AllocationConsumer. I could have kept a `PhantomData` or something
around but didn't feel like bothering.
* Debug-assert that AllocationConsumer has no allocations to consume
This is what justifies deleting all this code, so let's go ahead and
assert it until the code is actually gone.
* Call-indirect caching: protect against out-of-bounds table index during prescan.
Call-indirect caching requires a "prescan" of a module's code section
during compilation in order to know which tables are possibly written,
and to count call-indirect callsites so that each separate function
compilation can enable caching if possible and use the right slot
range.
This prescan is not integrated with the usual validation logic (nor
should it be, probably, for performance), so it's possible that an
out-of-bounds table index or other illegal instruction could be
present. We previously indexed into an internal data
structure (`table_plans`) with this index, allowing for a compilation
panic on certain invalid modules before validation would have caught
it. This PR fixes that with a one-off check at the single point that
we interpret a parameter (the table index) from an instruction during
the prescan.
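The shape of such a check can be sketched as below. The `Prescan` type and its `possibly_written` field are hypothetical simplifications, not the real data structures; the point is only that an unvalidated index is looked up with a fallible `get_mut` rather than with direct indexing, which would panic.

```rust
/// Hypothetical prescan state: one flag per table recording whether any
/// instruction might write to it.
pub struct Prescan {
    pub possibly_written: Vec<bool>,
}

impl Prescan {
    pub fn new(table_count: usize) -> Self {
        Prescan {
            possibly_written: vec![false; table_count],
        }
    }

    /// Record a possible write to `table_index`, which comes straight from
    /// an instruction that has not been validated yet. Returns `false` for
    /// an out-of-bounds index instead of panicking; in this sketch the
    /// invalid module is left for the later validation pass to reject.
    pub fn note_table_write(&mut self, table_index: u32) -> bool {
        match self.possibly_written.get_mut(table_index as usize) {
            Some(flag) => {
                *flag = true;
                true
            }
            None => false,
        }
    }
}
```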
* Add test case.
* Respect pooling allocation options in `wasmtime serve`
This commit updates the processing of pooling allocator options in
`wasmtime serve`. Previously the pooling allocator was enabled by
default but the options to configure it weren't processed due to how
this default-enable was implemented. The option to enable it by default
for `wasmtime serve`, but only `wasmtime serve`, is now processed
differently in a way that handles various other
pooling-allocator-related options.
Closes #8504
* Fix compile of bench api
* Fix test build
* Ignore newly added test as it's flaky
This commit fixes a fuzz bug that popped up where the
`cache_call_indirects` feature wasn't reading memory64-based offsets
correctly. This is due to (not great) API design in `wasmparser`
(introduced by me) where wasmparser by default won't read 64-bit offsets
unless explicitly allowed to. This is to handle how spec tests assert
that overlong 32-bit encodings are invalid as the memory64 proposal
isn't merged into the spec yet.
The fix here is to call `allow_memarg64` with whether memory64 is
enabled or not and then that'll enable reading these overlong and/or
larger offsets correctly.
As per [this comment], the call-indirect caching options should really
be in the `-O` option namespace (for optimizations), not `-W` (for
semantically-visible Wasm standards options); the original PR simply put
them in the wrong place, and this PR moves them.
[this comment]: https://github.com/bytecodealliance/wasmtime/pull/8509#discussion_r1589594152