* Rename FdEntry to Entry
* Add custom FdSet container for managing fd allocs/deallocs
This commit adds a custom `FdSet` container which is intended
for use in `wasi-common` to track WASI fd allocs/deallocs. The
main aim for this container is to abstract away the current
approach of spawning new handles
```rust
fd = fd.checked_add(1).ok_or(...)?;
```
and to make it possible to reuse unused/reclaimed handles
which currently is not done.
The struct offers 3 methods to manage its functionality:
* `FdSet::new` initialises the internal data structures,
and most notably, it preallocates an `FdSet::BATCH_SIZE`
worth of handles in such a way that we always start popping
from the "smallest" handle (think of it as of reversed stack,
I guess; it's not a binary heap since we don't really care
whether internally the handles are sorted in some way, just that
the "largets" handle is at the bottom. Why will become clear
when describing `allocate` method.)
* `FdSet::allocate` pops the next available handle if one is available.
The tricky bit here is that, if we run out of handles, we preallocate
the next `FdSet::BATCH_SIZE` worth of handles starting from the
latest popped handle (i.e., the "largest" handle). This
works only because we make sure to only ever pop and push already
existing handles from the back, and push _new_ handles (from the
preallocation step) from the front. When we ultimately run out
of _all_ available handles, we then return `None` for the client
to handle in some way (e.g., throwing an error such as `WasiError::EMFILE`
or whatnot).
* `FdSet::deallocate` returns the already allocated handle back to
the pool for further reuse.
When figuring out the internals, I've tried to optimise for both
alloc and dealloc performance, and I believe we've got an amortised
`O(1)~*` performance for both (if my maths is right, and it may very
well not be, so please verify!).
In order to keep `FdSet` fairly generic, I've made sure not to hard-code
it for the current type system generated by `wig` (i.e., `wasi::__wasi_fd_t`
representing WASI handle), but rather, any type which wants to be managed
by `FdSet` needs to conform to `Fd` trait. This trait is quite simple as
it only requires a couple of rudimentary traits (although `std:#️⃣:Hash`
is quite a powerful assumption here!), and a custom method
```rust
Fd::next(&self) -> Option<Self>;
```
which is there to encapsulate creating another handle from the given one.
In the current state of the code, that'd be simply `u32::checked_add(1)`.
When `wiggle` makes it way into the `wasi-common`, I'd imagine it being
similar to
```rust
fn next(&self) -> Option<Self> {
self.0.checked_add(1).map(Self::from)
}
```
Anyhow, I'd be happy to learn your thoughts about this design!
* Fix compilation on other targets
* Rename FdSet to FdPool
* Fix FdPool unit tests
* Skip preallocation step in FdPool
* Replace 'replace' calls with direct assignment
* Reuse FdPool from snapshot1 in snapshot0
* Refactor FdPool::allocate
* Remove entry before deallocating the fd
* Refactor the design to accommodate `u32` as underlying type
This commit refactors the design by ensuring that the underlying
type in `FdPool` which we use to track and represent raw file
descriptors is `u32`. As a result, the structure of `FdPool` is
simplified massively as we no longer need to track the claimed
descriptors; in a way, we trust the caller to return the handle
after it's done with it. In case the caller decides to be clever
and return a handle which was not yet legally allocated, we panic.
This should never be a problem in `wasi-common` unless we hit a
bug.
To make all of this work, `Fd` trait is modified to require two
methods: `as_raw(&self) -> u32` and `from_raw(raw_fd: u32) -> Self`
both of which are used to convert to and from the `FdPool`'s underlying
type `u32`.
* Handle select relocations while generating trampolines
Trampoline generation for all function signatures exposed a preexisting
bug in wasmtime where trampoline generation occasionally does have
relocations, but it's asserted that trampolines don't generate
relocations, causing a panic. The relocation is currently primarily the
probestack function which happens when functions might have a huge
number of parameters, but not so huge as to blow the wasmparser limit of
how many parameters are allowed.
This commit fixes the issue by handling relocations for trampolines in
the same manner as the rest of the code. Note that dynamically-generated
trampolines via the `Func` API still panic if they have too many
arguments and generate a relocation, but it seems like we can try to fix
that later if the need truly arises.
Closes#1322
* Log trampoline relocations
Removes an edge towards the `synstructure` dependency in our dependency
graph which, if removed entirely, is hoped to reduce build times locally
and on CI as you switch between `cargo build` and `cargo test` because
deep dependencies like `syn` won't have their features alternating back
and forth.
* Build wasmtime-c-api differenty in run-examples
This tweaks how the wasmtime-c-api crate is built slightly, changing how
we invoke Cargo. Due to historical Cargo bugs this should help minimize
the number of rebuilds due to features since the feature selection will
be different.
* rustfmt
Until #1306 is resolved (some spilling/regalloc issue with larger FPR register banks), this removes FPR32 support. Only Wasm's `i64x2.mul` was using this register class and that instruction is predicated on AVX512 support; for the time being, that instruction will have to make do with the 16 FPR registers.
The operands of these bitwise instructions could have different types and still be valid Wasm (i.e. `v128`). Because of this, we must tell Cranelift to cast both operands to the same type--the default type, in this case. This undoes the work merged in https://github.com/bytecodealliance/cranelift/pull/1233.
Because the smallest Wasm scalar type is i32, users of the `i8x16.replace_lane` and `i16x8.replace_lane` instructions will only be able to pass `i32` values as operands. These values must be reduced by dropping the upper bits (see https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#replace-lane-value) using Cranelift's `ireduce` instruction.
Because Wasm SIMD vectors store their type as `v128`, there is a mismatch between the more specific types Cranelift uses and Wasm SIMD. Because of this mismatch, Wasm SIMD translates to the default Cranelift type `I8X16`, causing issues when more specific type information is available (e.g. `I32x4`). To fix this, all incoming values to SIMD instructions are checked during translation (not runtime) and if necessary cast from `I8X16` to the appropriate type by functions like `optionally_bitcast_vector`, `pop1_with_bitcast` and `pop2_with_bitcast`. However, there are times when we must also cast to `I8X16` for outgoing values, as with `local.set` and `local.tee`.
There are other ways of resolving this (e.g., see adding a new vector type, https://github.com/bytecodealliance/cranelift/pull/1251) but we discussed staying with this casting approach in https://github.com/bytecodealliance/wasmtime/issues/1147.
... but turn it back on in CI by default. The `binaryen-sys` crate
builds binaryen from source, which is a drag on CI for a few reasons:
* This is quite large and takes a good deal of time to build
* The debug build directory for binaryen is 4GB large
In an effort to both save time and disk space on the builders this
commit adds a `binaryen` feature to the `wasmtime-fuzz` crate. This
feature is enabled specifically when running the fuzzers on CI, but it
is disabled during the typical `cargo test --all` command. This means
that the test builders should save an extra 4G of space and be a bit
speedier now that they don't build a giant wad of C++.
We'll need to update the OSS-fuzz integration to enable the `binaryen`
feature when executing `cargo fuzz build`, and I'll do that once this
gets closer to landing.
the `FaerieProduct` exposes faerie-specific types, so we can give
the `faerie::ArtifactError` on those methods.
`ModuleError::Backend` now expects an `anyhow::Error`, so we change
a .to_string into .into() and retain better error information.
This allows us to retain richer information from backend errors.
We already have `anyhow` as a dep in several places in the wasmtime
tree, and in cranelift-faerie. faerie is the only user of this
variant.
Existing code that puts a String into the Backend error can trivially
adapt their code to emit an anyhow::Error.
This should save us about 3GB of target directory disk space and it may
also be a tiny speed boost. There's no real benefit to using incremental
builds on CI because we're not changing code anyway!
* Store module name on `wasmtime_environ::Module`
This keeps all name information in one place so we dont' have to keep
extra structures around in `wasmtime::Module`.
* rustfmt
It is too easy to run afoul of yaml syntax with all these colons an asterisks,
e.g. we recently broke the bot like this:
``
YAMLException: unidentified alias ".md" at line 58, column 9:
- *.md
^
at generateError (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:8357:10)
at throwError (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:8363:9)
at readAlias (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:9466:5)
at composeNode (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:9558:20)
at readBlockMapping (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:9226:16)
at composeNode (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:9549:12)
at readBlockSequence (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:9145:5)
at composeNode (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:9548:12)
at readBlockMapping (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:9279:11)
at composeNode (/home/runner/work/_actions/bytecodealliance/labeler/schedule-fork/dist/index.js:9549:12)
```
* Impl different formatters for flags
Rather than forcing only binary formatting of flags types, how about
we implement all relevant traits (`Binary`, `Octal`, `LowerHex`, and
`UpperHex`) and allow the user to pick the most relevant one for their
use case?
Also, we use at least `Octal` and `LowerHex` in a couple of places
in `wasi-common`.
* fmt::Display for flags now inspired by bitflags
Flags is now by default formatted similarly to how
`bitflags` crate does it, namely, `dsync|append (0x11)`. In case
we're dealing with an empty set, we get `empty (0x0)`. Because of
this, any `Octal`, `LowerHex`, etc., formatters are redundant now.
Furthermore, while here, I've rewritten `EMPTY_FLAGS` and `ALL_FLAGS`
(where the former means `0x0` and the latter is the union of all possible
values) to be `const fn empty()` and `const fn all()` where the latter is
an expanded union of primitive representation values out of a macro.
This is again largely inspired by the `bitflags` crate.
* Test fmt::Display for flags
* Fill out API docs on `wasmtime::Module`
Part of #1272
* Apply suggestions from code review
Co-Authored-By: Nick Fitzgerald <fitzgen@gmail.com>
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Refactor wasmtime_runtime::Export
Instead of an enumeration with variants that have data fields have an
enumeration where each variant has a struct, and each struct has the
data fields. This allows us to store the structs in the `wasmtime` API
and avoid lots of `panic!` calls and various extraneous matches.
* Pre-generate trampoline functions
The `wasmtime` crate supports calling arbitrary function signatures in
wasm code, and to do this it generates "trampoline functions" which have
a known ABI that then internally convert to a particular signature's ABI
and call it. These trampoline functions are currently generated
on-the-fly and are cached in the global `Store` structure. This,
however, is suboptimal for a few reasons:
* Due to how code memory is managed each trampoline resides in its own
64kb allocation of memory. This means if you have N trampolines you're
using N * 64kb of memory, which is quite a lot of overhead!
* Trampolines are never free'd, even if the referencing module goes
away. This is similar to #925.
* Trampolines are a source of shared state which prevents `Store` from
being easily thread safe.
This commit refactors how trampolines are managed inside of the
`wasmtime` crate and jit/runtime internals. All trampolines are now
allocated in the same pass of `CodeMemory` that the main module is
allocated into. A trampoline is generated per-signature in a module as
well, instead of per-function. This cache of trampolines is stored
directly inside of an `Instance`. Trampolines are stored based on
`VMSharedSignatureIndex` so they can be looked up from the internals of
the `ExportFunction` value.
The `Func` API has been updated with various bits and pieces to ensure
the right trampolines are registered in the right places. Overall this
should ensure that all trampolines necessary are generated up-front
rather than lazily. This allows us to remove the trampoline cache from
the `Compiler` type, and move one step closer to making `Compiler`
threadsafe for usage across multiple threads.
Note that as one small caveat the `Func::wrap*` family of functions
don't need to generate a trampoline at runtime, they actually generate
the trampoline at compile time which gets passed in.
Also in addition to shuffling a lot of code around this fixes one minor
bug found in `code_memory.rs`, where `self.position` was loaded before
allocation, but the allocation may push a new chunk which would cause
`self.position` to be zero instead.
* Pass the `SignatureRegistry` as an argument to where it's needed.
This avoids the need for storing it in an `Arc`.
* Ignore tramoplines for functions with lots of arguments
Co-authored-by: Dan Gohman <sunfish@mozilla.com>