* Support disabling backtraces at compile time
This commit adds support to Wasmtime to disable, at compile time, the
gathering of backtraces on traps. The `wasmtime` crate now sports a
`wasm-backtrace` feature which, when disabled, will mean that backtraces
are never collected at compile time nor are unwinding tables inserted
into compiled objects.
The motivation for this commit stems from the fact that generating a
backtrace is quite a slow operation. Currently backtrace generation is
done with libunwind and `_Unwind_Backtrace` typically found in glibc or
other system libraries. When thousands of modules are loaded into the
same process though this means that the initial backtrace can take
nearly half a second and all subsequent backtraces can take upwards of
hundreds of milliseconds. Relative to all other operations in Wasmtime
this is extremely expensive at this time. In the future we'd like to
implement a more performant backtrace scheme but such an implementation
would require coordination with Cranelift and is a big chunk of work
that may take some time, so in the meantime if embedders don't need a
backtrace they can still use this option to disable backtraces at
compile time and avoid the performance pitfalls of collecting
backtraces.
In general I tried to originally make this a runtime configuration
option but ended up opting for a compile-time option because `Trap::new`
otherwise has no arguments and always captures a backtrace. By making
this a compile-time option it was possible to configure, statically, the
behavior of `Trap::new`. Additionally I also tried to minimize the
amount of `#[cfg]` necessary by largely only having it at the producer
and consumer sites.
Also a noteworthy restriction of this implementation is that if
backtrace support is disabled at compile time then reference types
support will be unconditionally disabled at runtime. With backtrace
support disabled there's no way to trace the stack of wasm frames which
means that GC can't happen given our current implementation.
* Always enable backtraces for the C API
A recently discovered fuzz bug reported a difference in execution result
between Wasmtime and v8. The module in question had an exported function
which had multiple results, including floats. About a year ago I
reported a bug on v8 about functions with multiple results leading to
incorrect results, and it was fixed many months back but I don't think
that we ever actually pulled in that update. After updating v8 I saw the
test failure go away, so this update is being done to fix the fuzz bug
test failure I saw.
* Delete historical interruptable support in Wasmtime
This commit removes the `Config::interruptable` configuration along with
the `InterruptHandle` type from the `wasmtime` crate. The original
support for adding interruption to WebAssembly was added pretty early on
in the history of Wasmtime when there was no other method to prevent an
infinite loop from the host. Nowadays, however, there are alternative
methods for interruption such as fuel or epoch-based interruption.
One of the major downsides of `Config::interruptable` is that even when
it's not enabled it forces an atomic swap to happen when entering
WebAssembly code. This technically could be a non-atomic swap if the
configuration option isn't enabled but that produces even more branch-y
code on entry into WebAssembly which is already something we try to
optimize. Calling into WebAssembly is on the order of a dozens of
nanoseconds at this time and an atomic swap, even uncontended, can add
up to 5ns on some platforms.
The main goal of this PR is to remove this atomic swap on entry into
WebAssembly. This is done by removing the `Config::interruptable` field
entirely, moving all existing consumers to epochs instead which are
suitable for the same purposes. This means that the stack overflow check
is no longer entangled with the interruption check and perhaps one day
we could continue to optimize that further as well.
Some consequences of this change are:
* Epochs are now the only method of remote-thread interruption.
* There are no more Wasmtime traps that produces the `Interrupted` trap
code, although we may wish to move future traps to this so I left it
in place.
* The C API support for interrupt handles was also removed and bindings
for epoch methods were added.
* Function-entry checks for interruption are a tiny bit less efficient
since one check is performed for the stack limit and a second is
performed for the epoch as opposed to the `Config::interruptable`
style of bundling the stack limit and the interrupt check in one. It's
expected though that this is likely to not really be measurable.
* The old `VMInterrupts` structure is renamed to `VMRuntimeLimits`.
When using the pooling instance allocator for running spec tests we have
to make sure that the configured limits of the allocator aren't so low
as to cause the spec tests to fail due to resource exhaustion issues or
similar. This commit adds in minimum thresholds for instance size as
well as instance count. While here this goes ahead and refactors
everything here to look similar.
This commit removes the currently existing laziness-via-`OnceCell` when
a `Module` is created for creating a `ModuleMemoryImages` data
structure. Processing of data is now already shifted to compile time for
the wasm module which means that creating a `ModuleMemoryImages` is
either cheap because the module is backed by a file on disk, it's a
single `write` into the kernel to a memfd, or it's cheap as it's not
supported. This should help make module instantiation time more
deterministic, even for the first instantiation of a module.
Currently, the use of the downcast method means that you have to use one
of the hard-coded types. But Enarx needs to define its own `WasiFile`
implementations. This works fine, except the resulting files cannot be
used in poll because they aren't part of the hard-coded list.
Replace this with an accessor method for the pollable type in
`WasiFile`. Because we provide a default implementation of the method
and manually implement it on all the hard-coded types, this is backwards
compatible.
Signed-off-by: Nathaniel McCallum <nathaniel@profian.com>
1. This makes it easier for implementors to deal with internal APIs.
2. This matches the signatures of the WASI Snapshot traits.
Although it is likely true that these methods would have to become
immutable in order to implement threading efficiently, threading will
impact a large number of existing traits. So this change is practical
for now with an already-unavoidable change required for threading.
Signed-off-by: Nathaniel McCallum <nathaniel@profian.com>
Previously (as in an hour ago) #3905 landed a new ability for fuzzing to
arbitrarily insert padding between functions. Running some fuzzers
locally though this instantly hit a lot of problems on AArch64 because
the arbitrary padding isn't aligned to 4 bytes like all other functions
are. To fix this issue appending functions now correctly aligns the
output as appropriate for the platform. The alignment argument for
appending was switched to `None` where `None` means "use the platform
default" and otherwise and explicit alignment can be specified for
inserting other data (like arbitrary padding or Windows unwind tables).
* fuzz: Fuzz padding between compiled functions
This commit hooks up the custom
`wasmtime_linkopt_padding_between_functions` configuration option to the
cranelift compiler into the fuzz configuration, enabling us to ensure
that randomly inserting a moderate amount of padding between functions
shouldn't tamper with any results.
* fuzz: Fuzz the `Config::generate_address_map` option
This commit adds fuzz configuration where `generate_address_map` is
either enabled or disabled, unlike how it's always enabled for fuzzing
today.
* Remove unnecessary handling of relocations
This commit removes a number of bits and pieces all related to handling
relocations in JIT code generated by Wasmtime. None of this is necessary
nowadays that the "old backend" has been removed (quite some time ago)
and relocations are no longer expected to be in the JIT code at all.
Additionally with the minimum x86_64 features required to run wasm code
it should be expected that no libcalls are required either for
Wasmtime-based JIT code.
A recent fuzz failure showed that at least `call.wast` requires a memory
larger than 10 pages, so increase the minimum number of pages that can
be used for executing spec tests.
* Implement runtime checks for compilation settings
This commit fills out a few FIXME annotations by implementing run-time
checks that when a `Module` is created it has compatible codegen
settings for the current host (as `Module` is proof of "this code can
run"). This is done by implementing new `Engine`-level methods which
validate compiler settings. These settings are validated on
`Module::new` as well as when loading serialized modules.
Settings are split into two categories, one for "shared" top-level
settings and one for ISA-specific settings. Both categories now have
allow-lists hardcoded into `Engine` which indicate the acceptable values
for each setting (if applicable). ISA-specific settings are checked with
the Rust standard library's `std::is_x86_feature_detected!` macro. Other
macros for other platforms are not stable at this time but can be added
here if necessary.
Closes#3897
* Fix fall-through logic to actually be correct
* Use a `OnceCell`, not an `AtomicBool`
* Fix some broken tests
In some use cases it is desirable to provide a custom snapshot WASI
context. Facilitate this by depending on a combination of traits
required rather than concrete type in the signature.
Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>
If either end stack overflows we can't validate the other side since the
other side, depending on codegen settings, may have been successful, hit
a different trap, or also stack overflowed.
This commit fixes calling the call hooks configured in a store for host
functions defined with `Func::new_unchecked` or similar. I believe that
this was just an accidental oversight and there's no fundamental reason
to not support this.
Another instance similar to #3879 where when doing differential tests
the pooling allocator configuration needs to be updated to allow for a
possible table.
* Move spec interpreter fuzzing behind a Cargo feature
Building the spec interpreter requires a local installation of Ocaml and
now libgmp which isn't always available, so this enables the ability to
disable building the spec interpreter by using `cargo +nightly fuzz
build --no-default-features`. The spec interpreter is still built by
default but if fuzzers are being built locally and the spec interpreter
isn't needed then this should enable it to be relatively easily
opted-out of.
* Tweak manifest directives
This commit removes some `.unwrap()` annotations around casts between
integers to either be infallible or handle errors. This fixes a panic in
a fuzz test case that popped out for memory64-using modules. The actual
issue here is pretty benign, we were just too eager about assuming
things fit into 32-bit.
The recent removal of `ModuleLimits` meant that the update to the
fuzzers could quickly fail where the instance size limit was set to
something small (like 0) and then nothing would succeed in compilation.
This allows the modules to fail to compile and then continues to the
next fuzz input in these situations.
This seems to have intended to allow overrides but the specific Makefile
syntax used didn't actually allow overrides, so update that to allow env
vars from the outside world to override the variable (needed locally on
AArch64 I'm building on which has a different path to libgmp)
Spec tests require multi-value to be enabled and wasm-smith recently
made this a fuzz-input option, so override the fuzz input as we do for
other features and force-enable multi-value.
This commit updates the build script which clones the spec interpreter
for fuzzing to specifically pin at a hardcoded revision. This is
intended at improving reproducibility if we hit any issues while fuzzing
to ensure that the same wasmtime revision is always using the same spec
interpreter revision.
This commit aims to achieve the goal of being able to run the test suite
on Windows with `--test-threads 1`, or more notably allowing Wasmtime's
defaults to work better with the main thread on Windows which appears to
have a smaller stack by default than Linux by comparison. In decreasing
the default wasm stack size a test is also update to probe for less
stack to work on Windows' main thread by default, ideally allowing the
full test suite to work with `--test-threads 1` (although this isn't
added to CI as it's not really critical).
Closes#3857
* Shrink the size of the anyfunc table in `VMContext`
This commit shrinks the size of the `VMCallerCheckedAnyfunc` table
allocated into a `VMContext` to be the size of the number of "escaped"
functions in a module rather than the number of functions in a module.
Escaped functions include exports, table elements, etc, and are
typically an order of magnitude smaller than the number of functions in
general. This should greatly shrink the `VMContext` for some modules
which while we aren't necessarily having any problems with that today
shouldn't cause any problems in the future.
The original motivation for this was that this came up during the recent
lazy-table-initialization work and while it no longer has a direct
performance benefit since tables aren't initialized at all on
instantiation it should still improve long-running instances
theoretically with smaller `VMContext` allocations as well as better
locality between anyfuncs.
* Fix some tests
* Remove redundant hash set
* Use a helper for pushing function type information
* Use a more descriptive `is_escaping` method
* Clarify a comment
* Fix condition
* Remove the `ModuleLimits` pooling configuration structure
This commit is an attempt to improve the usability of the pooling
allocator by removing the need to configure a `ModuleLimits` structure.
Internally this structure has limits on all forms of wasm constructs but
this largely bottoms out in the size of an allocation for an instance in
the instance pooling allocator. Maintaining this list of limits can be
cumbersome as modules may get tweaked over time and there's otherwise no
real reason to limit the number of globals in a module since the main
goal is to limit the memory consumption of a `VMContext` which can be
done with a memory allocation limit rather than fine-tuned control over
each maximum and minimum.
The new approach taken in this commit is to remove `ModuleLimits`. Some
fields, such as `tables`, `table_elements` , `memories`, and
`memory_pages` are moved to `InstanceLimits` since they're still
enforced at runtime. A new field `size` is added to `InstanceLimits`
which indicates, in bytes, the maximum size of the `VMContext`
allocation. If the size of a `VMContext` for a module exceeds this value
then instantiation will fail.
This involved adding a few more checks to `{Table, Memory}::new_static`
to ensure that the minimum size is able to fit in the allocation, since
previously modules were validated at compile time of the module that
everything fit and that validation no longer happens (it happens at
runtime).
A consequence of this commit is that Wasmtime will have no built-in way
to reject modules at compile time if they'll fail to be instantiated
within a particular pooling allocator configuration. Instead a module
must attempt instantiation see if a failure happens.
* Fix benchmark compiles
* Fix some doc links
* Fix a panic by ensuring modules have limited tables/memories
* Review comments
* Add back validation at `Module` time instantiation is possible
This allows for getting an early signal at compile time that a module
will never be instantiable in an engine with matching settings.
* Provide a better error message when sizes are exceeded
Improve the error message when an instance size exceeds the maximum by
providing a breakdown of where the bytes are all going and why the large
size is being requested.
* Try to fix test in qemu
* Flag new test as 64-bit only
Sizes are all specific to 64-bit right now
This commit fixes a potential issue where the fast-path instantiate in
`MemoryImageSlot` where when the previous image is compared against the
new image it only performed file descriptor equality, but nowadays with
loading images from `*.cwasm` files there might be multiple images in
the same file so the offsets also need to be considered. I think this
isn't really easy to hit today, it would require combining both module
linking and multi-memory which gets into the realm of being pretty
esoteric so I haven't added a test case here for this.
Without async fuzzing, we won't be able to test the most interesting
aspects of epoch interruption, namely the
interrupt/update-deadline/resume flow. However, the "trap on epoch
change" behavior works even for synchronous stores, so we can fuzz with
this the same way we fuzz with the interrupt flag.
* fuzzing: Add a custom mutator based on `wasm-mutate`
* fuzz: Add a version of the `compile` fuzz target that uses `wasm-mutate`
* Update `wasmparser` dependencies
* Panic on resetting image slots back to anonymous memory
This commit updates `Drop for MemoryImageSlot` to panic instead of
ignoring errors when resetting memory back to a clean slate. On reading
some of this code again for a different change I realized that if an
error happens in `reset_with_anon_memory` it would be possible,
depending on where another error happened, to leak memory from one image
to another.
For example if `clear_and_remain_ready` failed its `madvise` (for
whatever reason) and didn't actually reset any memory, then if `Drop for
MemoryImageSlot` also hit an error trying to remap memory (for whatever
reason), then nothing about memory has changed and when the
`MemoryImageSlot` is recreated it'll think that it's 0-length when
actually it's a bit larger and may leak data.
I don't think this is a serious problem since we don't know any
situation under which the `madvise` would fail and/or the resetting with
anonymous memory, but given that these aren't expected to fail I figure
it's best to be a bit more defensive here and/or loud about failures.
* Update a comment
* Enable copy-on-write heap initialization by default
This commit enables the `Config::memfd` feature by default now that it's
been fuzzed for a few weeks on oss-fuzz, and will continue to be fuzzed
leading up to the next release of Wasmtime in early March. The
documentation of the `Config` option has been updated as well as adding
a CLI flag to disable the feature.
* Remove ubiquitous "memfd" terminology
Switch instead to forms of "memory image" or "cow" or some combination
thereof.
* Update new option names
Ended up being a routine update but seemed good to go ahead and hook up
updates. While I was at it I went ahead and hooked up multi-value
swarm fuzzing as well now that wasm-smith implements it.
* Enable SSE 4.2 unconditionally
Fuzzing over the weekend found that `i64x2` comparison operators
require `pcmpgtq` which is an SSE 4.2 instruction. Along the lines of #3816
this commit unconditionally enables and requires SSE 4.2 for compilation
and fuzzing. It will no longer be possible to create a compiler for
x86_64 with simd enabled if SSE 4.2 is disabled.
* Update comment
In #3820 we see an issue with the new heuristics that control use of
memfd: it's entirely possible for a reasonable Wasm module produced by a
snapshotting system to have a relatively sparse heap (less than 50%
filled). A system that avoids memfd because of this would have an
undesirable performance reduction on such modules.
Ultimately we should try to implement a hybrid scheme where we support
outlier/leftover initializers, but for now this PR makes the "always
allow dense" limit configurable. This way, embedders that want to ensure
that memfd is used can do so, if they have other knowledge about the
maximum heap size allowed in their system.
(Partially addresses #3820 but let's leave it open to track the hybrid
idea)
* x64: enable VTune support by default
After significant work in the `ittapi-rs` crate, this dependency should
build without issue on Wasmtime's supported operating systems: Windows,
Linux, and macOS. The difference in the release binary is <20KB, so this
change makes `vtune` a default build feature. This change upgrades
`ittapi-rs` to v0.2.0 and updates the documentation.
* review: add configuration for defaults in more places
* review: remove OS conditional compilation, add architecture
* review: do not default vtune feature in wasmtime-jit
* Fix typo
* Move vmoffset field size and field name together
The previous code was quite confusing about what applied to which field.
The new code also makes it easier to move fields around and insert and
delete fields.
* Move builtin_functions before all variable sized fields
This allows the offset to be calculated at compile time
* Add cadd and cmul convenience functions
* Remove comment
* Change fields! syntax as per review
* Add implicit u32::from to fields!
This commit fixes the spectests fuzz target to set a lower bound on the
arbitrary pooling allocator configurations of 10 memory pages so that the limit
doesn't interfere with what's required in the spec tests.
Per-`Store` allocations are already limited with the `StoreLimits`
structure while fuzzing to ensure fuzz targets don't allocate more than
1GB of memory, but the `instantiate-many` fuzzer created many separate
stores which each had their own limit, meaning that the 2GB limit of
fuzzing could be pretty easily reached.
This commit fixes the issue by making `StoreLimits` a shareable type via
`Rc` to ensure the same limits can be applied to all stores created
within a fuzz run, globally limiting the memory even across stores to 1GB.
In #3800 I added support to consume fuzz input as selection of whether
or not target features should be enabled. This was done in a
platform-specific manner, however, which means that I can no longer
reliably take the fuzz reproducer cases from oss-fuzz and reproduce them
locally on an aarch64 machine. This commit fixes this problem by
unconditionally pulling bytes from the input for fuzz features,
irrespective of the host platform. Features are then discarded if
they're not applicable.