* Add an initial wasi-nn implementation for Wasmtime
This change adds a crate, `wasmtime-wasi-nn`, that uses `wiggle` to expose the current state of the wasi-nn API and `openvino` to implement the exposed functions. It includes an end-to-end test demonstrating how to do classification using wasi-nn:
- `crates/wasi-nn/tests/classification-example` contains Rust code that is compiled to the `wasm32-wasi` target and run with a Wasmtime embedding that exposes the wasi-nn calls
- the example uses Rust bindings for wasi-nn contained in `crates/wasi-nn/tests/wasi-nn-rust-bindings`; this crate contains code generated by `witx-bindgen` and eventually should be its own standalone crate
* Test wasi-nn as a CI step
This change adds:
- a GitHub action for installing OpenVINO
- a script, `ci/run-wasi-nn-example.sh`, to run the classification example
* this requires upgrading to wasmparser 0.67.0.
* There are no CLIF side changes because the CLIF `select` instruction is
polymorphic enough.
* on aarch64, there is unfortunately no conditional-move (csel) instruction on
vectors. This patch adds a synthetic instruction `VecCSel` which *does*
behave like that. At emit time, this is emitted as an if-then-else diamond
(4 insns).
* aarch64 implementation is otherwise straightforwards.
I don't think this has happened in awhile but I've run a `cargo update`
as well as trimming some of the duplicate/older dependencies in
`Cargo.lock` by updating some of our immediate dependencies as well.
This patch implements, for aarch64, the following wasm SIMD extensions
i32x4.dot_i16x8_s instruction
https://github.com/WebAssembly/simd/pull/127
It also updates dependencies as follows, in order that the new instruction can
be parsed, decoded, etc:
wat to 1.0.27
wast to 26.0.1
wasmparser to 0.65.0
wasmprinter to 0.2.12
The changes are straightforward:
* new CLIF instruction `widening_pairwise_dot_product_s`
* translation from wasm into `widening_pairwise_dot_product_s`
* new AArch64 instructions `smull`, `smull2` (part of the `VecRRR` group)
* translation from `widening_pairwise_dot_product_s` to `smull ; smull2 ; addv`
There is no testcase in this commit, because that is a separate repo. The
implementation has been tested, nevertheless.
* Validate modules while translating
This commit is a change to cranelift-wasm to validate each function body
as it is translated. Additionally top-level module translation functions
will perform module validation. This commit builds on changes in
wasmparser to perform module validation interwtwined with parsing and
translation. This will be necessary for future wasm features such as
module linking where the type behind a function index, for example, can
be far away in another module. Additionally this also brings a nice
benefit where parsing the binary only happens once (instead of having an
up-front serial validation step) and validation can happen in parallel
for each function.
Most of the changes in this commit are plumbing to make sure everything
lines up right. The major functional change here is that module
compilation should be faster by validating in parallel (or skipping
function validation entirely in the case of a cache hit). Otherwise from
a user-facing perspective nothing should be that different.
This commit does mean that cranelift's translation now inherently
validates the input wasm module. This means that the Spidermonkey
integration of cranelift-wasm will also be validating the function as
it's being translated with cranelift. The associated PR for wasmparser
(bytecodealliance/wasmparser#62) provides the necessary tools to create
a `FuncValidator` for Gecko, but this is something I'll want careful
review for before landing!
* Read function operators until EOF
This way we can let the validator take care of any issues with
mismatched `end` instructions and/or trailing operators/bytes.
This commit extracts the two implementations of `Compiler` into two
separate crates, `wasmtime-cranelfit` and `wasmtime-lightbeam`. The
`wasmtime-jit` crate then depends on these two and instantiates them
appropriately. The goal here is to start reducing the weight of the
`wasmtime-environ` crate, which currently serves as a common set of
types between all `wasmtime-*` crates. Long-term I'd like to remove the
dependency on Cranelift from `wasmtime-environ`, but that's going to
take a lot more work.
In the meantime I figure it's a good way to get started by separating
out the lightbeam/cranelift function compilers from the
`wasmtime-environ` crate. We can continue to iterate on moving things
out in the future, too.
This commit moves all of the caching support that currently lives in
`wasmtime-environ` into a `wasmtime-cache` crate and makes it optional. The
goal here is to slim down the `wasmtime-environ` crate and clearly separate
boundaries where caching is a standalone and optional feature, not intertwined
with other crates.
When running in embedded environments, threads creation is sometimes
undesirable. This adds a feature to toggle wasmtime's internal thread
creation for parallel compilation.
This somewhat cuts down on duplicate dependencies. `wast` is used in a much older version (`11.0.0`) by `witx`, and can be updated without issues there as well, but this at least gets us from 3 copies to 2.
This introduces two changes:
- first, a Cargo feature is added to make it possible to use the
Cranelift x64 backend directly from wasmtime's CLI.
- second, when passing a `cranelift-flags` parameter, and the given
parameter's name doesn't exist at the target-independent flag level, try
to set it as a target-dependent setting.
These two changes make it possible to try out the new x64 backend with:
cargo run --features experimental_x64 -- run --cranelift-flags use_new_backend=true -- /path/to/a.wasm
Right now, this will fail because most opcodes required by the
trampolines are actually not implemented yet.
For host VM code, we use plain reference counting, where cloning increments
the reference count, and dropping decrements it. We can avoid many of the
on-stack increment/decrement operations that typically plague the
performance of reference counting via Rust's ownership and borrowing system.
Moving a `VMExternRef` avoids mutating its reference count, and borrowing it
either avoids the reference count increment or delays it until if/when the
`VMExternRef` is cloned.
When passing a `VMExternRef` into compiled Wasm code, we don't want to do
reference count mutations for every compiled `local.{get,set}`, nor for
every function call. Therefore, we use a variation of **deferred reference
counting**, where we only mutate reference counts when storing
`VMExternRef`s somewhere that outlives the activation: into a global or
table. Simultaneously, we over-approximate the set of `VMExternRef`s that
are inside Wasm function activations. Periodically, we walk the stack at GC
safe points, and use stack map information to precisely identify the set of
`VMExternRef`s inside Wasm activations. Then we take the difference between
this precise set and our over-approximation, and decrement the reference
count for each of the `VMExternRef`s that are in our over-approximation but
not in the precise set. Finally, the over-approximation is replaced with the
precise set.
The `VMExternRefActivationsTable` implements the over-approximized set of
`VMExternRef`s referenced by Wasm activations. Calling a Wasm function and
passing it a `VMExternRef` moves the `VMExternRef` into the table, and the
compiled Wasm function logically "borrows" the `VMExternRef` from the
table. Similarly, `global.get` and `table.get` operations clone the gotten
`VMExternRef` into the `VMExternRefActivationsTable` and then "borrow" the
reference out of the table.
When a `VMExternRef` is returned to host code from a Wasm function, the host
increments the reference count (because the reference is logically
"borrowed" from the `VMExternRefActivationsTable` and the reference count
from the table will be dropped at the next GC).
For more general information on deferred reference counting, see *An
Examination of Deferred Reference Counting and Cycle Detection* by Quinane:
https://openresearch-repository.anu.edu.au/bitstream/1885/42030/2/hon-thesis.pdf
cc #929Fixes#1804
The `wasmtime` crate currently lives in `crates/api` for historical
reasons, because we once called it `wasmtime-api` crate. This creates a
stumbling block for new contributors.
As discussed on Zulip, rename the directory to `crates/wasmtime`.
* Implement interrupting wasm code, reimplement stack overflow
This commit is a relatively large change for wasmtime with two main
goals:
* Primarily this enables interrupting executing wasm code with a trap,
preventing infinite loops in wasm code. Note that resumption of the
wasm code is not a goal of this commit.
* Additionally this commit reimplements how we handle stack overflow to
ensure that host functions always have a reasonable amount of stack to
run on. This fixes an issue where we might longjmp out of a host
function, skipping destructors.
Lots of various odds and ends end up falling out in this commit once the
two goals above were implemented. The strategy for implementing this was
also lifted from Spidermonkey and existing functionality inside of
Cranelift. I've tried to write up thorough documentation of how this all
works in `crates/environ/src/cranelift.rs` where gnarly-ish bits are.
A brief summary of how this works is that each function and each loop
header now checks to see if they're interrupted. Interrupts and the
stack overflow check are actually folded into one now, where function
headers check to see if they've run out of stack and the sentinel value
used to indicate an interrupt, checked in loop headers, tricks functions
into thinking they're out of stack. An interrupt is basically just
writing a value to a location which is read by JIT code.
When interrupts are delivered and what triggers them has been left up to
embedders of the `wasmtime` crate. The `wasmtime::Store` type has a
method to acquire an `InterruptHandle`, where `InterruptHandle` is a
`Send` and `Sync` type which can travel to other threads (or perhaps
even a signal handler) to get notified from. It's intended that this
provides a good degree of flexibility when interrupting wasm code. Note
though that this does have a large caveat where interrupts don't work
when you're interrupting host code, so if you've got a host import
blocking for a long time an interrupt won't actually be received until
the wasm starts running again.
Some fallout included from this change is:
* Unix signal handlers are no longer registered with `SA_ONSTACK`.
Instead they run on the native stack the thread was already using.
This is possible since stack overflow isn't handled by hitting the
guard page, but rather it's explicitly checked for in wasm now. Native
stack overflow will continue to abort the process as usual.
* Unix sigaltstack management is now no longer necessary since we don't
use it any more.
* Windows no longer has any need to reset guard pages since we no longer
try to recover from faults on guard pages.
* On all targets probestack intrinsics are disabled since we use a
different mechanism for catching stack overflow.
* The C API has been updated with interrupts handles. An example has
also been added which shows off how to interrupt a module.
Closes#139Closes#860Closes#900
* Update comment about magical interrupt value
* Store stack limit as a global value, not a closure
* Run rustfmt
* Handle review comments
* Add a comment about SA_ONSTACK
* Use `usize` for type of `INTERRUPTED`
* Parse human-readable durations
* Bring back sigaltstack handling
Allows libstd to print out stack overflow on failure still.
* Add parsing and emission of stack limit-via-preamble
* Fix new example for new apis
* Fix host segfault test in release mode
* Fix new doc example
* Move most wasmtime tests into one test suite
This commit moves most wasmtime tests into a single test suite which
gets compiled into one executable instead of having lots of test
executables. The goal here is to reduce disk space on CI, and this
should be achieved by having fewer executables which means fewer copies
of `libwasmtime.rlib` linked across binaries on the system. More
importantly though this means that DWARF debug information should only
be in one executable rather than duplicated across many.
* Share more build caches
Globally set `RUSTFLAGS` to `-Dwarnings` instead of individually so all
build steps share the same value.
* Allow some dead code in cranelift-codegen
Prevents having to fix all warnings for all possible feature
combinations, only the main ones which come up.
* Update some debug file paths
* Wasmtime 0.15.0 and Cranelift 0.62.0. (#1398)
* Bump more ad-hoc versions.
* Add build.rs to wasi-common's Cargo.toml.
* Update the env var name in more places.
* Remove a redundant echo.
This patch adds initial support for ittapi which is an open
source profiling api for instrumentation and tracing and profiling
of jitted code. Result files can be read by VTune for analysis
Build:
cargo build --features=vtune
Profile: // Using amplxe-cl from VTune
amplxe-cl -v -collect hostpost target/debug/wasmtime --vtune test.wasm
* Remove the wasmtime Python extension from this repo
This commit removes the `crates/misc/py` folder and all associated
doo-dads like CI. This module has been rewritten to use the C API
natively and now lives at
https://github.com/bytecodealliance/wasmtime-py as discussed on #1390
* Bump Wasmtime to 0.14.0.
* Update the publish script for the wiggle crate wiggle.
* More fixes.
* Fix lightbeam depenency version.
* cargo update
* Cargo update wasi-tests too.
And add cargo update to the version-bump scripts.
* Publishing fixes.
* Make WASI a symlink.
* More fixes.
* Cargo doesn't allow dev-dependencies to have optional features.
* Remove the symlink.
* Add WASI as another git submodule.
* Add examples of linking and WASI
This commit adds two example programs, one for linking two modules
together and one for instantiating WASI. The linkage example
additionally uses WASI to get some meaningful output at this time.
cc #1272
* Add examples to the book as well
* More links!
* Ignore examples from rustdoc testsing
* More example updates
* More ignored
* Enable jitdump profiling support by default
This the result of some of the investigation I was doing for #1017. I've
done a number of refactorings here which culminated in a number of
changes that all amount to what I think should result in jitdump support being
enabled by default:
* Pass in a list of finished functions instead of just a range to
ensure that we're emitting jit dump data for a specific module rather
than a whole `CodeMemory` which may have other modules.
* Define `ProfilingStrategy` in the `wasmtime` crate to have everything
locally-defined
* Add support to the C API to enable profiling
* Documentation added for profiling with jitdump to the book
* Split out supported/unsupported files in `jitdump.rs` to avoid having
lots of `#[cfg]`.
* Make dependencies optional that are only used for `jitdump`.
* Move initialization up-front to `JitDumpAgent::new()` instead of
deferring it to the first module.
* Pass around `Arc<dyn ProfilingAgent>` instead of
`Option<Arc<Mutex<Box<dyn ProfilingAgent>>>>`
The `jitdump` Cargo feature is now enabled by default which means that
our published binaries, C API artifacts, and crates will support
profiling at runtime by default. The support I don't think is fully
fleshed out and working but I think it's probably in a good enough spot
we can get users playing around with it!
Removes an edge towards the `synstructure` dependency in our dependency
graph which, if removed entirely, is hoped to reduce build times locally
and on CI as you switch between `cargo build` and `cargo test` because
deep dependencies like `syn` won't have their features alternating back
and forth.
* Temporarily remove support for interface types
This commit temporarily removes support for interface types from the
`wasmtime` CLI and removes the `wasmtime-interface-types` crate. An
error is now printed for any input wasm modules that have wasm interface
types sections to indicate that support has been removed and references
to two issues are printed as well:
* #677 - tracking work for re-adding interface types support
* #1271 - rationale for removal and links to other discussions
Closes#1271
* Update the python extension
* Move all examples to a top-level directory
This commit moves all API examples (Rust and C) to a top-level
`examples` directory. This is intended to make it more discoverable and
conventional as to where examples are located. Additionally all examples
are now available in both Rust and C to see how to execute the example
in the language you're familiar with. The intention is that as more
languages are supported we'd add more languages as examples here too.
Each example is also accompanied by either a `*.wat` file which is
parsed as input, or a Rust project in a `wasm` folder which is compiled
as input.
A simple driver crate was also added to `crates/misc` which executes all
the examples on CI, ensuring the C and Rust examples all execute
successfully.