This was pioneered in bytecodealliance/wasm-tools#1562 and looks to be
working well so I've copied over similar metadata for Wasmtime. I've
seen the `cargo binstall` method of installing binaries to be somewhat
common and we've already got all the necessary binaries so it seemed
nice to add support. Wasmtime doesn't name the artifacts after Rust
target names so some manual configuration is required here, but
otherwise this hopefully isn't too costly to maintain.
* wasi-http: add the port to authority when opening a TCP connection
* Ignore test on riscv64 and s390x
---------
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
* Implement Component::define_unknown_imports_as_traps
This will search through a component's imports, find any imports
that have not yet been defined in the linker, and add a definition
which will trap upon being called.
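As a rough illustration of the search-and-stub step described above, here is a minimal sketch that uses a plain `HashMap` in place of the real linker. The `Definition` enum and the free function are hypothetical stand-ins for illustration only, not Wasmtime's actual types or signatures.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a linker definition; not Wasmtime's real type.
#[derive(Debug, Clone, PartialEq)]
enum Definition {
    /// Provided by the embedder.
    Host(&'static str),
    /// Placeholder that traps when called.
    Trap,
}

/// Adds a trapping definition for every import that is not yet defined.
/// Existing definitions are left untouched. Returns the names that were
/// stubbed, in the order they were encountered.
fn define_unknown_imports_as_traps(
    linker: &mut HashMap<String, Definition>,
    imports: &[&str],
) -> Vec<String> {
    let mut stubbed = Vec::new();
    for name in imports {
        if !linker.contains_key(*name) {
            linker.insert(name.to_string(), Definition::Trap);
            stubbed.push(name.to_string());
        }
    }
    stubbed
}
```

Imports that already have a definition keep it, so calling this last acts as a catch-all for whatever remains undefined.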
* Address PR feedback
Signed-off-by: Ryan Levick <ryan.levick@fermyon.com>
* Small refactoring
Signed-off-by: Ryan Levick <ryan.levick@fermyon.com>
---------
Signed-off-by: Ryan Levick <ryan.levick@fermyon.com>
* Implement runtime::Module::function_locations_with_names()
Map the iterator returned by Module::function_locations() to another
one that returns a 3-tuple containing the function name, the offset,
and the length of each function defined in this particular module.
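The mapping described above could look roughly like the following sketch. The `Module` struct here is a hypothetical stand-in with two plain vectors, and treating the name as optional is an assumption (functions without a name entry yield `None`); the real iterator format may differ.

```rust
/// Hypothetical stand-in for a compiled module's function table.
struct Module {
    /// (offset, length) of each defined function, in definition order.
    locations: Vec<(usize, usize)>,
    /// Optional names, indexed the same way as `locations`.
    names: Vec<Option<String>>,
}

impl Module {
    /// Yields the (offset, length) pair of each defined function.
    fn function_locations(&self) -> impl Iterator<Item = (usize, usize)> + '_ {
        self.locations.iter().copied()
    }

    /// Maps `function_locations()` to (name, offset, length) 3-tuples.
    fn function_locations_with_names(
        &self,
    ) -> impl Iterator<Item = (Option<&str>, usize, usize)> + '_ {
        self.function_locations()
            .enumerate()
            .map(move |(i, (offset, len))| {
                // Assumption: a missing or absent name entry becomes `None`.
                let name = self.names.get(i).and_then(|n| n.as_deref());
                (name, offset, len)
            })
    }
}
```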
* Show function names in "explore" instead of just the indices
* Address review: Change iterator format
* Address review: use the new iterator struct
* Address review comments
* it's tracing-subscriber now, not pretty_env_logger
* point to tracing-subscriber's docs on filter directives
* correct invocation of wasmtime in the example
* Add release binaries for x86_64-musl
This was requested in bytecodealliance/wasmtime-py#237 and shouldn't
cost us too much in terms of CI resources and maintenance overhead.
* Fix combining rustflags
prtest:full
* Update wit-bindgen
This commit updates wit-bindgen to 0.25 and applies some "extra
trickery" to work around the now-default providing of the realloc
symbol.
* Add audits
We intend to use this when computing liveness of GC references in
`cranelift-frontend` to manually construct safepoints and ultimately remove
`r{32,64}` reference types from CLIF, `cranelift-codegen`, and `regalloc2`.
Co-authored-by: Trevor Elliott <telliott@fastly.com>
* cranelift: Always consider sret arguments used
In #8438 we stopped emitting register bindings for unused arguments,
based on the use-counts from `compute_use_states`. However, that doesn't
count the use of the struct-return argument that's automatically added
after lowering when the `rets` instruction is generated in the epilogue.
As a result, using a struct-return argument caused register allocation
to panic due to the VReg not being defined anywhere.
This commit adds a use to the struct-return argument so that it's always
available in the epilogue.
Fixes #8659
* Review comments
* Add Cranelift and Winch features to the C API
This commit adds `cranelift` and `winch` features to the C API and
enables them by default. This means that the C API can now be built
without compiler support to only support loading precompiled binaries.
Closes #7349
* Fix doc link
* More doc fixes
* Add more doc input dirs
This commit adds the `winch` feature to the default feature set of the
`wasmtime-cli` package meaning that the `wasmtime` CLI will, by default,
have support for the Winch compiler.
* Refactor installation of C API and features supported
This commit overhauls and refactors the management of the building of
the C API. Instead of this being script-based it's now done entirely
through CMake and CMake is the primary focus for building the C API. For
example building the C API release artifacts is now done through CMake
instead of through a shell script's `cargo build` and manually moving
artifacts.
The benefits that this brings are:
* The C API now properly hides symbols in its header files that weren't
enabled at build time. This is done through a new build-time generated
`conf.h` templated on a `conf.h.in` file in the source tree.
* The C API's project now supports enabling/disabling Cargo features to
have finer-grained support over what's included (plus auto-management
of the header file).
* Building the C API and managing it is now exclusively done through
CMake. For example invoking `doxygen` now lives in CMake, installation
lives there, etc.
The `CMakeLists.txt` file for the C API is overhauled in the process of
doing this. The build now primarily matches on the Rust target being
built rather than the host target. Additionally installation will now
install both the static and shared libraries instead of just one.
Additionally during this refactoring various bits and pieces of
Android-specific code were all removed. Management of the C toolchain
feels best left in scope of the caller (e.g. configuring `CC_*` env vars
and such) rather than here.
prtest:full
* Don't use `option` for optional strings
* Invert release build check
Also adjust some indentation
* Fix more indentation
* Remove no-longer-used variable
* Reduce duplication in feature macro
This commit enables the `Func::new` constructor and related other
functions when `cranelift` and `winch` features are both disabled,
meaning this is now available in compiler-less builds. This builds on
the support of #8629.
* Update the frame layout comment
* Remove more references to nominal SP
* Remove the nominal_sp_offset from backend emit states
* Continue removing references to the nominal sp
* Remove nominal-sp from the aarch64 backend
* Remove nominal-sp from the s390x backend
* Remove nominal-sp from the riscv64 backend
* Remove old comment
* Remove the native ABI calling convention from Wasmtime
This commit proposes removing the "native abi" calling convention used
in Wasmtime. For background this ABI dates back to the origins of
Wasmtime. Originally Wasmtime only had `Func::call` and eventually I
added `TypedFunc` with `TypedFunc::call` and `Func::wrap` for a faster
path. At the time given the state of trampolines it was easiest to call
WebAssembly code directly without any trampolines using the native ABI
that wasm used at the time. This is the original source of the native
ABI and it's persisted over time under the assumption that it's faster
than the array ABI due to keeping arguments in registers rather than
spilling them to the stack.
Over time, however, this design decision of using the native ABI has not
aged well. Trampolines have changed quite a lot in the meantime and it's
no longer possible for the host to call wasm without a trampoline, for
example. Compilations nowadays maintain both native and array
trampolines for wasm functions in addition to host functions. There's a
large split between `Func::new` and `Func::wrap`. Overall, there's quite
a lot of weight that we're pulling for the design decision of using the
native ABI.
Functionally this hasn't ever really been the end of the world.
Trampolines aren't a known issue in terms of performance or code size.
There's no known faster way to invoke WebAssembly from the host (or
vice-versa). One major downside of this design, however, is that
`Func::new` requires Cranelift as a backend to exist. This is due to the
fact that it needs to synthesize various entries in the matrix of ABIs
we have that aren't available at any other time. While this is itself
not the worst of issues it means that the C API cannot be built without
a compiler because the C API does not have access to `Func::wrap`.
Overall I'd like to reevaluate given where Wasmtime is today whether it
makes sense to keep the native ABI trampolines. Sure they're supposed to
be fast, but are they really that much faster than the array-call ABI as
an alternative? This commit is intended to measure this.
This commit removes the native ABI calling convention entirely. For
example `VMFuncRef` is now one pointer smaller. All of `TypedFunc` now
uses `*mut ValRaw` for loads/stores rather than dealing with ABI
business. The benchmarks with this PR are:
* `sync/no-hook/core - host-to-wasm - typed - nop` - 5% faster
* `sync/no-hook/core - host-to-wasm - typed - nop-params-and-results` - 10% slower
* `sync/no-hook/core - wasm-to-host - typed - nop` - no change
* `sync/no-hook/core - wasm-to-host - typed - nop-params-and-results` - 7% faster
These numbers are a bit surprising as I would have suspected no change
in both "nop" benchmarks as well as both being slower in the
params-and-results benchmarks. Regardless it is apparent that this is
not a major change in terms of performance given Wasmtime's current
state. In general my hunch is that there are more expensive sources of
overhead than reads/writes from the stack when dealing with wasm values
(e.g. trap handling, store management, etc).
Overall this commit feels like a large simplification of what we
currently do in `TypedFunc`:
* The number of ABIs that Wasmtime deals with is reduced by one. ABIs
are pretty much always tricky and having fewer moving parts should
help improve the understandability of the system.
* All of the `WasmTy` trait methods and `TypedFunc` infrastructure are
simplified. Traits now work with simple `load`/`store` methods rather
than various other flavors of conversion.
* The multi-return-value handling of the native ABI is all gone now
which gave rise to significant complexity within Wasmtime's Cranelift
translation layer in addition to the `TypedFunc` backing traits.
* This aligns components and core wasm where components always use the
array ABI and now core wasm additionally will always use the array ABI
when communicating with the host.
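To make the `load`/`store` shape concrete, here is a minimal sketch of an array-style call. `RawSlot` and `SlotTy` are hypothetical, simplified stand-ins (a single 64-bit cell and a two-method trait); they are not the actual `ValRaw` or `WasmTy` definitions.

```rust
/// Simplified stand-in for a raw wasm value slot: one 64-bit cell.
/// (Hypothetical; the real `ValRaw` has more variants.)
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct RawSlot(u64);

/// Hypothetical trait mirroring the "simple load/store" shape.
trait SlotTy: Sized {
    fn store(self, slot: &mut RawSlot);
    fn load(slot: &RawSlot) -> Self;
}

impl SlotTy for i32 {
    fn store(self, slot: &mut RawSlot) {
        // Keep only the low 32 bits; the upper half of the cell is zeroed.
        slot.0 = self as u32 as u64;
    }
    fn load(slot: &RawSlot) -> Self {
        slot.0 as u32 as i32
    }
}

impl SlotTy for i64 {
    fn store(self, slot: &mut RawSlot) {
        slot.0 = self as u64;
    }
    fn load(slot: &RawSlot) -> Self {
        slot.0 as i64
    }
}

/// A callee using the array convention: parameters are read from the front
/// of the array and the result is written back over the same storage.
fn add_array_call(slots: &mut [RawSlot]) {
    let a = i32::load(&slots[0]);
    let b = i32::load(&slots[1]);
    a.wrapping_add(b).store(&mut slots[0]);
}

/// A typed wrapper on the caller side: store params, call, load the result.
fn call_add(a: i32, b: i32) -> i32 {
    let mut slots = [RawSlot::default(); 2];
    a.store(&mut slots[0]);
    b.store(&mut slots[1]);
    add_array_call(&mut slots);
    i32::load(&slots[0])
}
```

Every value travels through memory rather than registers here, which is the cost the benchmarks above were measuring.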
I'll note that this still leaves a major ABI "complexity" in that
native functions do not have a wasm ABI function pointer until
they're "attached" to a `Store` with a `Module`. That's required to
avoid needing Cranelift for creating host functions and that property is
still true today. This is a bit simpler to understand though now that
`Func::new` and `Func::wrap` are treated uniformly rather than one being
special-cased.
* Fix miri unsafety
prtest:full
* Use bytes for maximum size of linear memory with pooling
This commit changes configuration of the pooling allocator to use a
byte-based unit rather than a page based unit. The previous
`PoolingAllocatorConfig::memory_pages` configuration option configures
the maximum size that a linear memory may grow to at runtime. This is an
important factor in calculation of stripes for MPK and is also a
coarse-grained knob apart from `StoreLimiter` to limit memory
consumption. This configuration option has been renamed to
`max_memory_size` and documented that it's in terms of bytes rather than
pages as before.
Additionally, the documented constraint that `max_memory_size` must be
smaller than `static_memory_bound` is now enforced as a minor clean-up
as part of this PR.
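For anyone migrating a page-based setting, the unit conversion and the enforced constraint can be sketched as below. Both helper functions are hypothetical and only illustrate the arithmetic; treating an equal value as acceptable is an assumption, as is expressing `static_memory_bound` in bytes for the comparison.

```rust
/// Size of a WebAssembly page in bytes (64 KiB).
const WASM_PAGE_SIZE: u64 = 64 * 1024;

/// Converts an old page-based limit to the equivalent byte-based limit.
/// Returns `None` if the multiplication overflows.
fn pages_to_bytes(pages: u64) -> Option<u64> {
    pages.checked_mul(WASM_PAGE_SIZE)
}

/// Hypothetical validation of the documented constraint. Both arguments are
/// in bytes. Assumption: equality is accepted, only exceeding is an error.
fn validate_limits(max_memory_size: u64, static_memory_bound: u64) -> Result<(), String> {
    if max_memory_size > static_memory_bound {
        return Err(format!(
            "max_memory_size ({max_memory_size} bytes) exceeds static_memory_bound \
             ({static_memory_bound} bytes)"
        ));
    }
    Ok(())
}
```

For example, a previous setting of 160 pages corresponds to 10 MiB.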
* Review comments
* Fix benchmark build
* cranelift: expand umbrella crate with more crates
* Break the dependency cycle between cranelift-jit and cranelift
---------
Co-authored-by: Trevor Elliott <telliott@fastly.com>
* gen_nominal_sp_adj now returns a smallvec
* Remove the virtual sp offset from the x64 backend
* Remove the virtual sp offset from the aarch64 backend
* Remove the virtual sp offset from the riscv64 backend
* Remove the virtual sp offset from the s390x backend
* Remove gen_nominal_sp_adj, and argument area management functions
* Remove get_virtual_sp_offset_from_state
* Code review suggestions
* Use WASM function names in compiled objects
Instead of generating symbol names in the format
"wasm[$MODULE_ID]::function[$FUNCTION_INDEX]", generate (if possible)
something more readable, such as "wasm[$MODULE_ID]::$FUNCTION_NAME".
This helps when debugging or profiling the generated code.
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
* Ensure symbol names are cleaned up and have function indexes
Filter symbol names to include only characters that are usually used
for function names, and that might be produced by name mangling.
Replace everything else with a question mark (and all repeated question
marks by a single one), and then truncate to a length of 96 characters.
This should be enough not only to avoid passing user-controlled strings
to tools such as "perf" and "objdump", but also to make it easier to
disambiguate symbols that might have the same name but different
indices.
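The cleaning steps above can be sketched as follows. The exact set of allowed characters and the final symbol layout are assumptions made for illustration (ASCII alphanumerics plus `_` and `:`, and a `wasm[..]::function[..]::name` layout); only the question-mark replacement, the collapsing of repeats, and the 96-character limit come from the description.

```rust
/// Maximum length of the cleaned name, per the description above.
const MAX_SYMBOL_LEN: usize = 96;

/// Assumed allow-list: ASCII alphanumerics plus `_` and `:`.
fn is_allowed(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_' || c == ':'
}

/// Replaces disallowed characters with `?`, collapses runs of `?` into one,
/// and truncates the result to `MAX_SYMBOL_LEN` characters.
fn clean_symbol(name: &str) -> String {
    let mut out = String::with_capacity(name.len().min(MAX_SYMBOL_LEN));
    let mut last_was_placeholder = false;
    for c in name.chars() {
        // Every pushed character is ASCII, so byte length == char count.
        if out.len() >= MAX_SYMBOL_LEN {
            break;
        }
        if is_allowed(c) {
            out.push(c);
            last_was_placeholder = false;
        } else if !last_was_placeholder {
            out.push('?');
            last_was_placeholder = true;
        }
    }
    out
}

/// Builds a symbol that keeps the function index alongside the cleaned name.
/// The layout is an assumption for illustration.
fn symbol_name(module_id: usize, func_index: usize, name: Option<&str>) -> String {
    match name {
        Some(n) if !n.is_empty() => {
            format!("wasm[{module_id}]::function[{func_index}]::{}", clean_symbol(n))
        }
        _ => format!("wasm[{module_id}]::function[{func_index}]"),
    }
}
```

Keeping the index in the symbol means two functions whose names clean to the same string still end up with distinct symbols.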
* Make symbol cleaning slightly more efficient
* Update symbol names to be closer to what tests expect
* Ensure only alphanumeric ASCII characters are allowed in a symbol name
* Ensure sliced symbol name is within its bounds
* Update test expectations after adding function name to symbol name
---------
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
* Cranelift: add alignment parameter to stack slots.
Fixes #6716.
Currently, stack slots are aligned only to a machine-word boundary.
This is insufficient for some use-cases: for example, storing SIMD data
or structs that require a larger alignment.
This PR adds a parameter to the `StackSlotData` to specify alignment,
and the associated logic to the CLIF parser and printer. It updates the
shared ABI code to compute the stackslot layout taking the alignment
into account. In order to ensure the alignment is always a power of two,
it is stored as a shift amount (log2 of actual alignment) in the IR.
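A minimal sketch of a layout computation that takes a log2-encoded alignment into account is shown below. The struct and function names are hypothetical, and giving every slot at least word alignment is an assumption carried over from the previous behavior; this is not the actual shared ABI code.

```rust
/// Hypothetical, simplified stack slot description.
#[derive(Clone, Copy, Debug)]
struct SlotData {
    /// Size of the slot in bytes.
    size: u32,
    /// log2 of the required alignment in bytes, so the alignment is
    /// a power of two by construction.
    align_shift: u8,
}

impl SlotData {
    /// Alignment in bytes, decoded from the shift amount.
    fn alignment(self) -> u32 {
        1u32 << self.align_shift
    }
}

/// Rounds `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: u32, align: u32) -> u32 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Assigns an offset to each slot in order and returns the offsets together
/// with the total size, rounded up to a whole number of machine words.
fn layout(slots: &[SlotData], word_bytes: u32) -> (Vec<u32>, u32) {
    let mut offsets = Vec::with_capacity(slots.len());
    let mut next = 0u32;
    for slot in slots {
        // Assumption: slots never get less than word alignment.
        let align = slot.alignment().max(word_bytes);
        next = align_up(next, align);
        offsets.push(next);
        next += slot.size;
    }
    (offsets, align_up(next, word_bytes))
}
```

Storing the shift rather than the byte count means no separate power-of-two validation is needed when a slot is created or parsed.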
* Apply suggestions from code review
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* Update filetest.
* Update alignment to ValRaw vector.
* Fix printer test.
* cargo-fmt from suggestion update.
---------
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
This fixes an accidental regression from #8616 where page alignment was
implicitly happening due to how configuration was processed but it
wasn't re-added in the refactoring.
The egraph pass was already doing this, when it ran, and it never adds
any aliases. So do it slightly earlier and unconditionally, and avoid
needing to resolve any aliases during lowering.
Consumption of non-allocatable operands was added in #5253 and #5132,
and removed in #8524 and following PRs. Now they are not only ignored by
regalloc2, but the placeholders that it leaves in the allocation results
are also ignored by Cranelift. So let's stop adding them to the operands
list entirely.
This commit builds on the support from #8448 to remove all blanket impls
from the WASI crates and instead replace them with concrete impls. This
is slightly functionally different from before where impls are now on
trait objects meaning dynamic dispatch is involved where previously
static dispatch was used. That being said the perf hit here is expected
to be negligible-to-nonexistent since the implementations are large
enough that the dynamic dispatch won't be the hot path.
The motivations for this commit are:
* Removes the need for an odd `skip_mut_forwarding_impls` option - but
this'll be left for a bit in case others need it.
* Improves incremental compile time of these crates where the crates
themselves now contain all object code for all of WASI instead of
forcing the final consumer to codegen everything (although there's
still a significant amount monomorphized).
* Improves future compatibility with refactorings of
bindgen-generated-traits and such because blanket impls are pretty
hard to work around where concrete impls are easier to reason about
(and document).
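The difference between the two styles can be sketched with a pair of toy traits. `View`, `Host`, and `MyCtx` are hypothetical names used only for illustration; they are not the traits generated for the WASI crates.

```rust
/// Hypothetical "view" trait that an embedder implements to expose its state.
trait View {
    fn table_len(&self) -> usize;
}

/// Hypothetical generated host trait.
trait Host {
    fn describe(&mut self) -> String;
}

// Before (blanket impl, monomorphized for every embedder type `T`):
//
//     impl<T: View + ?Sized> Host for T { ... }
//
// After: a single concrete impl on the trait object, compiled once in the
// crate that defines it and reached through dynamic dispatch.
impl Host for dyn View + '_ {
    fn describe(&mut self) -> String {
        format!("table with {} entries", self.table_len())
    }
}

/// An embedder's context type.
struct MyCtx {
    entries: Vec<u32>,
}

impl View for MyCtx {
    fn table_len(&self) -> usize {
        self.entries.len()
    }
}

/// Embedders coerce their context to the trait object to reach the impl.
fn describe_ctx(ctx: &mut MyCtx) -> String {
    let view: &mut dyn View = ctx;
    view.describe()
}
```

The body of `describe` is generated once regardless of how many embedder types exist, which is where the incremental compile time benefit comes from.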
The latter is what Wasmtime uses today but it pulls in parsers for all
object formats supported by `object`. In the context of Wasmtime,
however, we know that all objects produced are 64-bit ELF files so
there's no need to pull in, for example, a COFF parser as that'll always
return an error anyway. This commit switches uses of the `object::File`
convenience to `ElfFile64` instead.
* Change `Tunables::static_memory_bound` to bytes
This commit changes the wasm-page-sized `static_memory_bound` field to
instead be a byte-defined unit rather than a page-defined unit. To
accomplish this the field is renamed to `static_memory_reservation` and
all references are updated. This builds on the support from #8608 to
remove another page-based variable from the internals of Wasmtime.
* Fix tests
* Test that wasi file streams can handle read(0)
* Zero-sized reads don't fail for file streams
* Accidentally removed the `read(0)` when refactoring the test
* Allow env/args/preopens to exceed 64k in size
This commit fixes an issue with the wasip1 adapter published with
Wasmtime which currently limits the size of environment variables,
arguments, and preopens to not exceed 64k. This bug comes from the fact
that we more-or-less forgot to account for this when designing the
adapter initially. The adapter allocates a single WebAssembly page for
itself but does not have a means of allocating much more than that. It's
technically possible to continue to call `memory.grow` or possibly
`cabi_realloc` from the original main module but it's pretty awkward.
The solution in this commit is to take an alternative approach to how
these properties are all processed. Previously arguments/env
vars/preopens were all allocated once within the adapter and stored
statically. This means that after startup they're copied from where they
reside in-memory, but the downside is that we have to have enough memory
to hold everything. This commit instead tries to "stream" the items so
they're never held entirely within the adapter itself.
The general idea in this commit is to use the "align" parameter to
`cabi_import_realloc` to figure out what's being allocated and route the
allocation to the destination. For example an allocation with alignment
1 is a string and can go directly into a user-supplied pointer where an
allocation with alignment 4 is a pointer-based allocation which must be
stored within the adapter, but only temporarily.
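The routing decision can be sketched as a small match on the alignment. The `Route` enum and function name are hypothetical, and rejecting other alignments with `None` is an assumption; the adapter's real handling of unexpected alignments is not described here.

```rust
/// Where an allocation requested through the realloc hook should land.
#[derive(Debug, PartialEq)]
enum Route {
    /// String data (alignment 1): write directly into the caller's buffer.
    UserBuffer,
    /// Pointer-based data (alignment 4): hold temporarily inside the adapter.
    AdapterScratch,
}

/// Picks a destination based only on the requested alignment.
/// Assumption: any other alignment is unexpected and reported as `None`.
fn route_allocation(align: usize) -> Option<Route> {
    match align {
        1 => Some(Route::UserBuffer),
        4 => Some(Route::AdapterScratch),
        _ => None,
    }
}
```

Because string bytes bypass the adapter entirely, only the pointer arrays count against the adapter's single page.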
With this redesign it's now possible for the sum total of
args/envs/preopens to exceed 64k. The new limitation is that the
longest string plus the size of the largest of these arrays must be
less than 64k. This should be a more reasonable limit than before since
any one individual argument/env var is unlikely to exceed 64k (or get
close).
Closes #8556
* Comment descriptors are closed
* Update crates/wasi-preview1-component-adapter/src/descriptors.rs
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* Turn down process limits for macOS
Looks like a 1M env block is a bit too large.
---------
Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
* wasmtime: Make table lazy-init configurable
Lazy initialization of tables has trade-offs that we haven't explored in
a while. Making it configurable makes it easier to test the effects of
these trade-offs on a variety of WebAssembly programs, and allows
embedders to decide whether the trade-offs are worthwhile for their use
cases.
* Review comments
This commit aims to address #8607 by dynamically determining whether the
pooling allocator should be used rather than unconditionally using it.
It looks like some systems don't have enough virtual memory to support
the default configuration settings so this should help `wasmtime serve`
work on those systems.
Closes #8607