* Update the wasm-tools family of crates
This notably brings in a limitation where component model flags types
must have 32 or fewer flags in accordance with the transition plan of
https://github.com/WebAssembly/component-model/issues/370. A feature
flag is added to go back to the previous behavior to avoid breaking
anyone too much.
This additionally brings in a fix for a panic when validating invalid
modules with tail calls.
* Add vet entries
* Use Ubuntu-16.04 for x86_64-linux binary-compatible-builds
* Revert "Use Ubuntu-16.04 for x86_64-linux binary-compatible-builds"
This reverts commit 5625941dee.
* Use AlmaLinux 8
prtest:full
Fixes: https://github.com/bytecodealliance/wasmtime/issues/8848
Similar to all the control instructions, any state must be explicitly
saved before emitting the code for `br_if`.
This commit ensures that live locals and registers are explicitly saved
before emitting the code for `br_if`. Prior to this commit, live
locals and registers were not always saved, causing incorrect
behavior in cases where the calculation of the conditional argument
didn't trigger a spill.
This change introduces the explicit spill after calculating the branch
condition argument to minimize memory traffic in case the conditional is
already in a register.
This commit fixes writes to stdout/stderr which don't end in a newline
so that they no longer get split across lines with a prefix on each line. Instead
internally a flag is used to track whether a prefix is required at the
beginning of each chunk.
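The flag-based approach can be sketched as follows. This is a minimal illustration, not the code from this commit: `PrefixWriter`, `write_chunk`, and the `needs_prefix` field are hypothetical names chosen for the example.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper that prepends `prefix` to every line written through it.
struct PrefixWriter<W: Write> {
    inner: W,
    prefix: String,
    /// True when the next byte written starts a new line and so needs a prefix.
    needs_prefix: bool,
}

impl<W: Write> PrefixWriter<W> {
    fn new(inner: W, prefix: &str) -> Self {
        PrefixWriter {
            inner,
            prefix: prefix.to_string(),
            needs_prefix: true,
        }
    }

    /// Writes one chunk; a chunk may contain zero or more newlines.
    fn write_chunk(&mut self, chunk: &[u8]) -> io::Result<()> {
        // Each piece is either a full line (ending in '\n') or a trailing
        // partial line.
        for piece in chunk.split_inclusive(|b| *b == b'\n') {
            if self.needs_prefix {
                self.inner.write_all(self.prefix.as_bytes())?;
            }
            self.inner.write_all(piece)?;
            // Only a piece that ends in a newline starts a fresh line.
            self.needs_prefix = piece.ends_with(b"\n");
        }
        Ok(())
    }
}
```

Because the flag persists across calls, a write of `"hel"` followed by `"lo\n"` produces a single prefixed line rather than two.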
* wasi-nn: use resources
Recent discussion in the wasi-nn proposal (see [wasi-nn#59], e.g.) has
concluded that the right approach for representing wasi-nn "things"
(tensors, graph, etc.) is with a component model _resource_. This
sweeping change brings Wasmtime's implementation in line with that
decision.
Initially I had structured this PR to remove all of the WITX-based
implementation (#8530). But, after consulting in a Zulip [thread] on
what other WASI proposals aim to do, this PR pivoted to support _both_
the WITX-based and WIT-based ABIs (e.g., preview1 era versus preview2,
component model era). What is clear is that the WITX-based specification
will remain "frozen in time" while the WIT-based implementation moves
forward.
What that means for this PR is a "split world" paradigm. In many places,
we have to distinguish between the `wit` and `witx` versions of the same
thing. This change isn't the end state yet: it's a big step forward
towards bringing Wasmtime back in line with the WIT spec but, despite my
best efforts, doesn't fully fix all the TODOs left behind over several
years of development. I have, however, taken the liberty to refactor and
fix various parts as I came across them (e.g., the ONNX backend). I plan
to continue working on this in future PRs to figure out a good error
paradigm (the current one is too wordy) and device residence.
[wasi-nn#59]: https://github.com/WebAssembly/wasi-nn/pull/59
[thread]: https://bytecodealliance.zulipchat.com/#narrow/stream/219900-wasi/topic/wasi-nn's.20preview1.20vs.20preview2.20timeline
prtest:full
* vet: audit `ort`-related crate updates
* Simplify `WasiNnView`
With @alexcrichton's help, this change removes the `trait WasiNnView`
and `struct WasiNnImpl` wrapping that the WIT-based implementation used
for accessing the host context. Instead, `WasiNnView` is now a `struct`
containing the mutable references it needs to make things work. This
unwraps one complex layer of abstraction, though it does have the
downside that it complicates CLI code to split borrows of `Host`.
* Temporarily disable WIT check
* Refactor errors to use `trappable_error_type`
This change simplifies the return types of the host implementations of
the WIT-based wasi-nn. There is more work to be done with errors, e.g.,
to catch up with the upstream decision to return errors as resources.
But this is better than the previous mess.
* Cranelift: Take user stack maps through lowering and emission
Previously, user stack maps were inserted by the frontend and preserved in the
mid-end. This commit takes them from the mid-end CLIF into the backend vcode,
and then from that vcode into the finalized mach buffer during emission.
During lowering, we compile the `UserStackMapEntry`s into packed
`UserStackMap`s. This is the appropriate moment in time to do that coalescing,
packing, and compiling because the stack map entries are immutable from this
point on.
Additionally, we include user stack maps in the `Debug` and disassembly
implementations for vcode, just after their associated safepoint
instructions. This allows us to see the stack maps we are generating when
debugging, as well as write filetests that check we are generating the expected
stack maps for the correct instructions.
Co-Authored-By: Trevor Elliott <telliott@fastly.com>
* uncomment debug assert that was commented out for debugging
* Address review feedback
* remove new method that was actually never needed
---------
Co-authored-by: Trevor Elliott <telliott@fastly.com>
In the original development of this feature, guided by JS AOT
compilation to Wasm of a microbenchmark heavily focused on IC sites, I
was seeing a ~20% speedup. However, in more recent measurements, on full
programs (e.g., the Octane benchmark suite), the benefit is more like
5%.
Moreover, in #8870, I attempted to switch over to a direct-mapped cache,
to address a current shortcoming of the design, namely that it has a
hard-capped number of callsites it can apply to (50k) to limit impact on
VMContext struct size. With all of the needed checks for correctness,
though, that change results in a 2.5% slowdown relative to no caching at
all, so it was dropped.
In the process of thinking through that, I discovered the current design
on `main` incorrectly handles null funcrefs: it invokes a null code pointer,
rather than loading a field from a null struct pointer. The latter was
specifically designed to cause the necessary Wasm trap in #8159, but I
had missed that the call to a null code pointer would not have the same
effect. As a result, we actually can crash the VM (safely at least, but
still no good vs. a proper Wasm trap!) with the feature enabled. (It's
off by default still.) That could be fixed too, but at this point with
the small benefit on real programs, together with the limitation on
module size for full benefit, I think I'd rather opt for simplicity and
remove the cache entirely.
Thus, this PR removes call-indirect caching. It's not a direct revert
because the original PR refactored the call-indirect generation into
smaller helpers and IMHO it's a bit nicer to keep that. But otherwise
all traces of the setting, code pre-scan during compilation and special
conditions tracked on tables, and codegen changes are gone.
* riscv64: Increase max inst size
* riscv64: Emit islands in return call sequence
* riscv64: Update worst case size tests
Having duplicate registers was preventing
some moves from being generated
* Improve some documentation of the `wasmtime-wasi` crate
Show a few examples of using `with` to point to upstream `wasmtime-wasi`
for bindings.
* Refactor and document the `wasmtime-wasi-http` more
This commit primarily adds a complete example of using
`wasmtime-wasi-http` to the documentation. Along the way I've done a
number of other refactorings too:
* `bindgen!`-generated `*Pre` structures now implement `Clone`.
* `bindgen!`-generated `*Pre` structures now have an `engine` method.
* `bindgen!`-generated `*Pre` structures now have an `instance_pre` method.
* The structure of `wasmtime-wasi-http` now matches `wasmtime-wasi`,
notably:
* The `proxy` module is removed
* `wasmtime_wasi_http::add_to_linker_{a,}sync` is the top level
add-to-linker function.
* The `bindings` module now contains `Proxy` and `ProxyPre` along with
a `sync` submodule.
* The `bindings` module contains all bindings for `wasi:http` things.
* The `add_only_*` methods are un-hidden and documented.
* Code processing `req` has been simplified by avoiding
decomposing-and-reconstructing a request.
* The `new_incoming_request` method is now generic to avoid callers
having to do boxing/mapping themselves.
* Update expanded macro expectations
* Remove unused import
This reduces the size of wasi_snapshot_preview1.command.wasm from 75029
bytes to 52212 bytes for a total win of 22817 bytes. This is done by
deduplicating most of the trap messages and the code for printing those
trap messages. Also got some small wins by making the assertion message
shorter.
This reduces the size of wasi_snapshot_preview1.command.wasm from 79625
bytes to 75029 bytes for a total win of 4596 bytes. Of this reduction
enabling LTO is responsible for 3103 bytes, while enabling bulk-memory
is responsible for 1493 bytes.
* upgrade to wasm-tools 0.211.1
* code review
* cargo vet: auto imports
* fuzzing: fix wasm-smith changes
* fuzzing: changes for HeapType
* Configure features on `Parser` when parsing
---------
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
The identifier for the `cold` calling convention overlaps with the
`cold` keyword for basic blocks so handle another kind of token when
parsing signatures.
The epoch interruption implementation caches the current deadline in a
register, and avoids reloading that cache until the cached deadline has
passed.
However, the first epoch check happens immediately after the cache has
been populated on function entry, so there's never a reason to reload
the cache at that point. It only needs to be reloaded in loops. So this
commit eliminates the double-check on function entry.
When Cranelift optimizations are enabled, the alias analysis pass
correctly detected that this load was redundant, and the egraph pass
optimized away the `icmp` as well. However, since we don't do
conditional constant propagation, the branch couldn't be optimized away.
On x86 this lowered to a redundant `cmp`/`jae` pair of instructions in
every function entry, which this commit eliminates.
To keep track of what code we're generating for epoch interruptions,
I've also added disas tests with a trivial infinite loop.
This was accidentally broken in #8692. It turns out bitcasts from i128 to i128 are legal, and that PR accidentally broke that use case.
This is now added to a runtest to ensure it works on all platforms.
This commit raises the default setting of `max_memory_size` in the
pooling allocator from 10M to 4G. This won't actually impact the virtual
memory reserved in the pooling allocator because we already reserved 6G
of virtual memory for each linear memory; this instead allows access to
all of it by default. This matches the default behavior of Wasmtime for
the non-pooling allocator which is to not artificially limit memory by
default.
The main impact of this setting is that the memory-protection-keys
feature, which is disabled by default, will have no effect by default
unless `max_memory_size` is also configured to something smaller than
4G. The documentation has been updated to this effect.
Closes #8846
I noticed that the wasm_memory64 flag was left out of Config's debug impl,
so rather than add it, I decided to use the `bitflags::Flags::FLAGS`
const to iterate the complete set of flags.
The downside of this change is that it will print flags which do not
have a setter in Config, e.g. `wasm_component_model_nested_names`.
An alternative to this change is, rather than expanding out the single
`features: WasmFeatures` member into many different debug_struct fields,
the debug impl of WasmFeatures is used.
Here is a sample debug of Config with this change:
Config { debug_info: None, wasm_mutable_global: true, wasm_saturating_float_to_int: true, wasm_sign_extension: true, wasm_reference_types: true, wasm_multi_value: true, wasm_bulk_memory: true, wasm_simd: true, wasm_relaxed_simd: false, wasm_threads: false, wasm_shared_everything_threads: false, wasm_tail_call: false, wasm_floats: true, wasm_multi_memory: false, wasm_exceptions: false, wasm_memory64: false, wasm_extended_const: false, wasm_component_model: false, wasm_function_references: false, wasm_memory_control: false, wasm_gc: false, wasm_custom_page_sizes: false, wasm_component_model_values: false, wasm_component_model_nested_names: false, parallel_compilation: true, compiler_config: CompilerConfig { strategy: Some(Cranelift), target: None, settings: {"opt_level": "speed", "enable_verifier": "true"}, flags: {}, cache_store: None, clif_dir: None, wmemcheck: false }, parse_wasm_debuginfo: false }
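The idea of driving the `Debug` impl from a complete flag table can be sketched without the `bitflags` crate. In this std-only sketch the `Features` type and its `FLAGS` table are hand-rolled stand-ins for what `bitflags` generates, and `Config` is reduced to two fields; none of these definitions are Wasmtime's own.

```rust
use std::fmt;

/// Stand-in for a `bitflags`-generated type (hand-rolled to stay std-only).
#[derive(Clone, Copy)]
struct Features(u32);

impl Features {
    const MUTABLE_GLOBAL: u32 = 1 << 0;
    const SIMD: u32 = 1 << 1;
    const MEMORY64: u32 = 1 << 2;

    /// Table of every flag, playing the role of `bitflags::Flags::FLAGS`.
    const FLAGS: &'static [(&'static str, u32)] = &[
        ("mutable_global", Features::MUTABLE_GLOBAL),
        ("simd", Features::SIMD),
        ("memory64", Features::MEMORY64),
    ];
}

/// Reduced, hypothetical version of a configuration struct.
struct Config {
    features: Features,
    parallel_compilation: bool,
}

impl fmt::Debug for Config {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let mut s = f.debug_struct("Config");
        // Iterating the table means a newly added flag can't be forgotten here.
        for (name, bit) in Features::FLAGS {
            s.field(&format!("wasm_{}", name), &(self.features.0 & bit != 0));
        }
        s.field("parallel_compilation", &self.parallel_compilation);
        s.finish()
    }
}
```

The trade-off is the one noted above: every entry in the table is printed, whether or not the configuration exposes a setter for it.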
This commit removes the `simm32` extractor from lowerings as it's not as
useful as it was when it was first introduced. Nowadays an `Imm64` needs
to be interpreted with the type known as well to understand whether the bits
being masked off are significant or not. The old `simm32` extractor only
took `Imm64` meaning that it was unable to do this and wouldn't match
negative numbers. This is because the high 32 bits of `Imm64` were
always zero and `simm32` would take the `i64` value from `Imm64` and try
to convert it to an `i32`.
This commit replaces `simm32`, and uses of it, with a new extractor
`i32_from_iconst`. This matches the preexisting `i64_from_iconst` and is
able to take the type of the value into account and produce a correctly
sign-extended value.
cc #8706
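The difference between the two extractors can be illustrated in plain Rust. This is only a sketch of the arithmetic: `i32_from_const` and `naive_simm32` are hypothetical helper names, and the real extractors are ISLE rules operating on CLIF values rather than Rust functions with this shape.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: interpret the low `ty_bits` bits of `bits` as a signed
/// integer and return it if it fits in an `i32`.
fn i32_from_const(bits: u64, ty_bits: u32) -> Option<i32> {
    assert!((1..=64).contains(&ty_bits));
    // Move the value's sign bit up to bit 63, then arithmetic-shift back down
    // so the sign bit is replicated through the upper bits.
    let shift = 64 - ty_bits;
    let value = ((bits << shift) as i64) >> shift;
    i32::try_from(value).ok()
}

/// The failure mode described above: with no type information the raw 64-bit
/// value is converted directly, so a 32-bit `-1` stored with zeroed high bits
/// looks like a large positive number.
fn naive_simm32(bits: u64) -> Option<i32> {
    i32::try_from(bits as i64).ok()
}
```

For a 32-bit constant stored as `0xffff_ffff`, the type-aware helper yields `-1` while the naive conversion fails to match at all.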
* Add tests for patterns I'm about to optimize
* x64: Optimize vector compare-and-branch
This commit implements lowering optimizations for the `vall_true` and
`vany_true` CLIF instructions when combined with `brif`. This is in the
same manner as `icmp` and `fcmp` combined with `brif` where the result
of the comparison is never materialized into a general purpose register
which helps lower register pressure and remove some instructions.
* x64: Optimize `vconst` with an all-ones pattern
This has a single-instruction lowering which doesn't load from memory so
it's probably cheaper than loading all-ones from memory.
* cranelift-entity: Implement `EntitySet` in terms of `cranelift_bitset::CompoundBitSet`
* Shrink the size of `CompoundBitSet` so we don't perturb vmctx size test expectations
* Update vmctx size test expectations anyways because we shrunk "too much"
* Move `cranelift-bitset` to the front of `CRATES_TO_PUBLISH`
* wasmtime: Add profile markers around host-calls
The output of the guest profiler can be misleading around hostcalls.
Whatever happened to be the last sample before the hostcall appears to
run for the entire time of the hostcall. This change ensures that we can
see the actual call stack at the time of the hostcall, and get a visual
indication of which periods are not spent executing guest code.
* wasmtime-cli needs wasmtime/call-hook, but wasmtime itself doesn't
In general, embedders that wish to use the new functionality likely will
need to enable the wasmtime/call-hook feature in order to get Wasmtime
to notify them of when to call into the profiler. However embedders
could consider other alternatives, such as calling the profiler from
selected hostcall implementations.
* Implement semver compatibility for exports
This commit is an implementation of component model semver compatibility
for export lookups. Previously in #7994 component imports were made
semver-aware to ensure that bumping version numbers would not be a
breaking change. This commit implements the same feature for component
exports. This required some refactoring to move the definition of semver
compat around and the previous refactoring in #8786 enables frontloading
this work to happen before instantiation.
Closes #8395
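As a rough illustration of what a semver-aware lookup can mean, the sketch below maps a versioned name to a "compatibility track" following the usual semver convention (same major version, or same minor version when the major is 0). The helper names `compat_key` and `compatible` are hypothetical, and whether Wasmtime applies exactly these rules is an assumption of this example.

```rust
/// Hypothetical helper: the compatibility track of a versioned name, e.g.
/// `a:b/c@1.2.3` -> `a:b/c@1` and `a:b/c@0.2.3` -> `a:b/c@0.2`.
/// Returns `None` for unversioned names, for `0.0.x` versions, and for
/// versions with non-numeric parts (such as pre-release suffixes).
fn compat_key(name: &str) -> Option<String> {
    let (base, version) = name.split_once('@')?;
    let mut parts = version.split('.').map(|p| p.parse::<u64>());
    let major = parts.next()?.ok()?;
    let minor = parts.next()?.ok()?;
    let _patch = parts.next()?.ok()?;
    if parts.next().is_some() {
        return None;
    }
    if major != 0 {
        Some(format!("{}@{}", base, major))
    } else if minor != 0 {
        Some(format!("{}@0.{}", base, minor))
    } else {
        None
    }
}

/// Two names are treated as compatible if they are identical or share a track.
fn compatible(requested: &str, available: &str) -> bool {
    requested == available
        || match (compat_key(requested), compat_key(available)) {
            (Some(a), Some(b)) => a == b,
            _ => false,
        }
}
```

Under these assumed rules a lookup for `a:b/c@0.2.0` would accept an export named `a:b/c@0.2.1` but not `a:b/c@0.3.0`.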
* Review comments
* Fix tests
This commit improves the experience around using the
`trappable_error_type` configuration by fixing two issues:
* When an error can't be resolved it no longer results in an
`unwrap()` panic; instead a first-class error is returned to get reported.
* The name lookup procedure is now consistent with the name lookup that
the `with` key does, notably allowing the version to be optional but
still supporting the version.
This fixes an issue that came up recently where a path with a version
was specified but the old lookup logic ended up requiring that the
version wasn't specified because there was only one package with that
version. This behavior resulted in a panic with a very long
backtrace-based error message which was confusing to parse. By returning
an error the error is much more succinct and by supporting more names
the original intuition will work.
* Introduce the `cranelift-bitset` crate
The eventual goal is to deduplicate bitset types between Cranelift and Wasmtime,
especially their use in stack maps.
* Use the `cranelift-bitset` crate inside both Cranelift and Wasmtime
Mostly for stack maps, also for a variety of other random things where
`cranelift_codegen::bitset::BitSet` was previously used.
* Fix stack maps unit test in cranelift-codegen
* Uncomment `no_std` declaration
* Fix `CompoundBitSet::reserve` method
* Fix `CompoundBitSet::insert` method
* Keep track of the max in a `CompoundBitSet`
Makes a bunch of other stuff easier, and will be needed for replacing
`cranelift_entity::EntitySet`'s bitset with this thing anyways.
* Add missing parens
* Fix a bug around insert and reserve
* Implement `with_capacity` in terms of `new` and `reserve`
* Rename `reserve` to `ensure_capacity`
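The shape of the bitset described in these commits (growable storage, a tracked maximum, and `with_capacity` built from `new` plus `ensure_capacity`) can be sketched as below. `GrowableBitSet` is a toy written for this note; it is not the `cranelift_bitset::CompoundBitSet` implementation and its method signatures are assumptions.

```rust
/// Toy growable bitset; illustrative only.
#[derive(Default)]
struct GrowableBitSet {
    words: Vec<u64>,
    /// Largest element inserted so far, if any (this toy has no removal).
    max: Option<usize>,
}

impl GrowableBitSet {
    fn new() -> Self {
        Self::default()
    }

    /// `with_capacity` expressed as `new` followed by `ensure_capacity`.
    fn with_capacity(capacity: usize) -> Self {
        let mut set = Self::new();
        set.ensure_capacity(capacity);
        set
    }

    /// Grow the backing storage so that elements `0..capacity` can be stored.
    fn ensure_capacity(&mut self, capacity: usize) {
        let words_needed = (capacity + 63) / 64;
        if words_needed > self.words.len() {
            self.words.resize(words_needed, 0);
        }
    }

    /// Insert `i`, returning true if it was not already present.
    fn insert(&mut self, i: usize) -> bool {
        // Element `i` needs room for `i + 1` elements, not `i`.
        self.ensure_capacity(i + 1);
        let mask = 1u64 << (i % 64);
        let is_new = (self.words[i / 64] & mask) == 0;
        self.words[i / 64] |= mask;
        self.max = Some(self.max.map_or(i, |m| m.max(i)));
        is_new
    }

    fn contains(&self, i: usize) -> bool {
        self.words
            .get(i / 64)
            .map_or(false, |w| (w & (1u64 << (i % 64))) != 0)
    }

    fn max(&self) -> Option<usize> {
        self.max
    }
}
```

Tracking the maximum at insertion time keeps `max` an O(1) query, which is the kind of convenience an entity-set wrapper would rely on.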
* wasi-adapter: Implement provider crate that embeds the adapter binaries
* Upgrade wasi adapters to the latest version
* Update adapter docs
* Recompile wasi adapters with 1.78
* Recompile wasi adapters with 1.79
* Add some debugging to adapter build script
* Fix script debugging
* Compute wasi adapter version based on latest adapter commit hash
* Try to bless wasi adapters again
* Try to work around CI auto-merges
* Revert to just using workspace version
* Add the wasi adapter provider to the crate publication list
* Use wasi adapter provider in artifacts test + explicit MSRV in CI
* Explicit adapter crate version
* Small fix
* Remove version info from adapter metadata
* Check but don't install rust toolchain in build script
* Bless after rebase
---------
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
* Update the wasi_testsuite submodule
This commit updates the wasi_testsuite submodule which we haven't
updated in a little over a year and applies a few small fixes but mostly
ignores new tests.
* Add another ignore
* Un-nest exports in a component
This commit flattens the representation of exports in a component to
make them more easily indexable without forcing traversal through the
hierarchy of instance imports/exports to get there.
* Guarantee type information on component exports
Don't have it optional in some cases and present in others, instead
ensure there's type information for all component exports immediately
available.
* Refactor how component instance exports are loaded
This commit is a change to Wasmtime's public API for
`wasmtime::component::Instance` that reorganizes how component exports
are loaded. Previously there was a system where `Instance::exports()`
was called and the result was sort of "iterated over" in a builder-style
pattern to acquire the actual export desired. This required lifetime
trickery for nested instances and some unfortunate API bloat. The major
downside of this approach is that it requires unconditional string
lookups at runtime for exports and additionally does not serve as a
great place to implement the semver-compatible logic of #8395. The goal
of this refactoring is to pave the way to improving this.
The new APIs for loading exports now look a bit more similar to what's
available for core modules. Notably there's a new
`Component::export_index` method which enables performing a string
lookup and returning an index. This index can in turn be passed to
`Instance::get_*` to skip the string lookup when exports are loaded. The
`Instance::exports` API is then entirely removed and dismantled.
The only piece remaining is the ability to load nested exports which is
done through an `Option` parameter to `Component::export_index`. The
way to load a nested instance is now to first look up the instance with
`None` as this parameter and then pass the instance itself as `Some` to look
up an export of that instance. This removes the need for a
recursive-style lifetime-juggling API from wasmtime and in theory helps
simplify the usage of loading exports.
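The two-phase pattern (resolve names to indices once, then load by index) can be modelled with a toy in plain Rust. Everything here is hypothetical: `ToyComponent`, `ToyInstance`, and `ExportIndex` only mimic the shape described above and are not Wasmtime's types or signatures.

```rust
use std::collections::HashMap;

/// Hypothetical opaque index produced by a name lookup.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ExportIndex(usize);

/// Toy component: a flat table keyed by (parent instance, export name).
struct ToyComponent {
    names: HashMap<(Option<usize>, String), usize>,
}

impl ToyComponent {
    /// String lookup; `parent` is `None` for top-level exports and `Some` for
    /// exports nested inside a previously looked-up instance.
    fn export_index(&self, parent: Option<ExportIndex>, name: &str) -> Option<ExportIndex> {
        self.names
            .get(&(parent.map(|p| p.0), name.to_string()))
            .copied()
            .map(ExportIndex)
    }
}

/// Toy instance: exports are stored flat and fetched by index.
struct ToyInstance {
    items: Vec<&'static str>,
}

impl ToyInstance {
    /// No string hashing or comparison happens here.
    fn get(&self, index: ExportIndex) -> Option<&'static str> {
        self.items.get(index.0).copied()
    }
}
```

In this model the string lookups happen once per component, and each instantiation afterwards only pays for an index into a flat table.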
* Update `bindgen!` generated structures for exports
This commit updates the output of `bindgen!` to have a different setup
for exports of worlds to handle the changes from the previous commit.
This introduces new `*Pre` structures which are generated alongside the
existing `Guest` structures for example. The `*Pre` versions contain
`ComponentExportIndex` from the previous commit and serve as a path to
accelerating instantiation because all name lookups are skipped.
* Update test expectations for `bindgen!`-generated output
* Review comments
* Fix doc link
* wasi-nn: remove some unnecessary panics from test programs
* Make `libtest-mimic` a workspace dependency
* wasi-nn: use `libtest-mimic` for testing
wasi-nn's testing story is complicated by different levels of support on
different platforms (some backends work on certain architectures, others
only work on certain OSes, etc.). This change migrates the `testing`
module, which was included in `src`, to exist solely under `tests`. It
also dynamically checks whether each test is runnable and then chooses
whether to ignore it with a `libtest-mimic` flag. This ensures we can
see all the tests all the time and whether they are running or not,
which is helpful during development.
* Refactor for more subtle `ignore` behavior
On any development machine, with no prior setup, we should be able to
compile and move past the ignored tests without issue:
```console
$ cargo test -- --quiet
running 4 tests
iiii
```
With the proper setup and enabling the right features, tests that are
able to run should do so (eliding a bunch of test output):
```console
$ cargo test --all-features -- --quiet
running 4 tests
iii.
```
On CI, tests that _should_ pass will fail if they can't run:
```console
$ CI=1 cargo test --all-features -- --quiet
iFF.
```
prtest:full
* Add missing `use`
* fix: share download lock between checks
* fix: typo, winml usedx preloaded model
* fix: revert to previous winml behavior
This test was reusing the ONNX test for some reason.
* fix: fully qualify bail!
* Inherit Linux semantics for `fd_pwrite` with `O_APPEND`
This commit updates the implementation of `fd_pwrite` in WASI to match
Linux semantics for an under-specified corner of WASI. Specifically if
`fd_pwrite` is used the offset specified is ignored if the file is
opened in append mode and the bytes are instead appended.
This commit additionally refactors `fd_write` and `fd_pwrite` to have
basically the same code with only a minor branch internally when the
final write is being performed to help deduplicate more logic.
Closes #8817
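The Linux behavior being matched is that a positional write on a file opened with `O_APPEND` ignores the offset and appends. A minimal in-memory model of that rule is sketched below; `MemFile` and its `pwrite` method are hypothetical and stand in for the real file-backed implementation.

```rust
/// Toy in-memory file; illustrative only.
struct MemFile {
    data: Vec<u8>,
    append: bool,
}

impl MemFile {
    /// Positional write. In append mode the requested offset is ignored and
    /// the bytes land at the end of the file.
    fn pwrite(&mut self, buf: &[u8], offset: usize) -> usize {
        let start = if self.append { self.data.len() } else { offset };
        let end = start + buf.len();
        // Writing past the current end extends the file with zero bytes.
        if end > self.data.len() {
            self.data.resize(end, 0);
        }
        self.data[start..end].copy_from_slice(buf);
        buf.len()
    }
}
```

The single `if self.append` branch is the only place where the append and non-append paths differ, which mirrors the "minor branch" structure described above.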
* Ignore new tests on macos
prtest:full
* Update ignore to all non-linux
Looks like wasi-libc is testing for the READDIR right in addition to
the READ right in the reported flags. Update write-only files to remove
both the READ and READDIR rights accordingly.
Closes #8816
* Force some more permission checks with 0-length writes
When a 0-length write is performed try to send the write all the way to
the underlying file descriptor to at least check that it's valid to
write.
Closes #8818
* Update crates/test-programs/src/bin/preview1_file_write.rs
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* Allow a second error for Windows as well
---------
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* Disable memory protection keys by default at compile time
This commit gates memory protection keys behind a new Cargo feature
which is disabled by default. Memory protection keys are already
disabled by default on all platforms and are only configured to possibly
work with Linux x64. When enabled, however, it unconditionally adds a
small amount of overhead to WebAssembly entries/exits even if the
feature is disabled at runtime for the same reason that the `call-hook`
feature adds overhead. With `call-hook` being disabled by default
in #8808 it seemed reasonable to additionally gate memory protection
keys to avoid needing to disable features in Wasmtime to get the best
performance for wasm<->host calls.
* Enable Wasmtime feature for fuzzing
* Disable `call-hook` crate feature by default
This commit disables the `call-hook` feature for the Wasmtime crate
added in #8795 by default. The rationale is that this has a slight cost
to all embeddings even if the feature isn't used and it's not expected
to be that widely used of a feature, so off-by-default seems like a more
appropriate default.
* Enable all features in doc build
* More doc fixes
After https://github.com/bytecodealliance/wasmtime/pull/8809, the mutator cannot
resume from a trap so we don't need to consider them safepoints, as no
GC-managed references are live after the trap. The one exception is the
`debugtrap` CLIF instruction, which is technically still a resumable trap, but
which exists only for emitting the equivalent of an `int3` breakpoint
instruction for pausing in a debugger to inspect state, and should never be used
for mutator-collector interactions.