* Allow 64-bit vectors and implement for interpreter
The AArch64 backend already supports 64-bit vectors; this simply allows
instructions to make use of that.
Implemented support for 64-bit vectors within the interpreter to allow
interpret runtests to use them.
Copyright (c) 2022 Arm Limited
* Disable 64-bit SIMD `iaddpairwise` tests on s390x
Copyright (c) 2022 Arm Limited
* [AArch64] Port SIMD narrowing to ISLE
Fvdemote, snarrow, unarrow and uunarrow.
Also refactor the aarch64 instructions descriptions to parameterize
on ScalarSize instead of using different opcodes.
The zero_value pure constructor has been introduced and used by the
integer narrow operations and it replaces, and extends, the compare
zero patterns.
Copright (c) 2022, Arm Limited.
* use short 'if' patterns
This enables more runtests to be executed on s390x. Doing so
uncovered a two back-end bugs, which are fixed as well:
- The result of cls was always off by one.
- The result of popcnt.i16 has uninitialized high bits.
In addition, I found a bug in the load-op-store.clif test case:
v3 = heap_addr.i64 heap0, v1, 4
v4 = iconst.i64 42
store.i32 v4, v3
This was clearly intended to perform a 32-bit store, but
actually performs a 64-bit store (it seems the type annotation
of the store opcode is ignored, and the type of the operand
is used instead). That bug did not show any noticable symptoms
on little-endian architectures, but broke on big-endian.
* support dynamic function calls in component model
This addresses #4310, introducing a new `component::values::Val` type for
representing component values dynamically, as well as `component::types::Type`
for representing the corresponding interface types. It also adds a `call` method
to `component::func::Func`, which takes a slice of `Val`s as parameters and
returns a `Result<Val>` representing the result.
Note that I've moved `post_return` and `call_raw` from `TypedFunc` to `Func`
since there was nothing specific to `TypedFunc` about them, and I wanted to
reuse them. The code in both is unchanged beyond the trivial tweaks to make
them fit in their new home.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* order variants and match cases more consistently
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* implement lift for String, Box<str>, etc.
This also removes the redundant `store` parameter from `Type::load`.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* implement code review feedback
This fixes a few issues:
- Bad offset calculation when lowering
- Missing variant padding
- Style issues regarding `types::Handle`
- Missed opportunities to reuse `Lift` and `Lower` impls
It also adds forwarding `Lift` impls for `Box<[T]>`, `Vec<T>`, etc.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* move `new_*` methods to specific `types` structs
Per review feedback, I've moved `Type::new_record` to `Record::new_val` and
added a `Type::unwrap_record` method; likewise for the other kinds of types.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* make tuple, option, and expected type comparisons recursive
These types should compare as equal across component boundaries as long as their
type parameters are equal.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* improve error diagnostic in `Type::check`
We now distinguish between more failure cases to provide an informative error
message.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* address review feedback
- Remove `WasmStr::to_str_from_memory` and `WasmList::get_from_memory`
- add `try_new` methods to various `values` types
- avoid using `ExactSizeIterator::len` where we can't trust it
- fix over-constrained bounds on forwarded `ComponentType` impls
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* rearrange code per review feedback
- Move functions from `types` to `values` module so we can make certain struct fields private
- Rename `try_new` to just `new`
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* remove special-case equality test for tuples, options, and expecteds
Instead, I've added a FIXME comment and will open an issue to do recursive
structural equality testing.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* cranelift: Restrict `br_table` to `i32` indices
In #4498 it was proposed that we should only accept `i32` indices
to `br_table`. The rationale for this is that larger types lead the
users to a false sense of flexibility (since we don't support jump
tables larger than u32's), and narrower types are not well tested
paths that would be safer if we removed them.
* cranelift: Reduce directly from i128 to i32 in Switch
Converted the existing implementations for the following opcodes to ISLE
on AArch64:
- `sqrt`
- `fneg`
- `fabs`
- `fpromote`
- `fdemote`
- `ceil`
- `floor`
- `trunc`
- `nearest`
Copyright (c) 2022 Arm Limited
In #4502 we discovered a bug in the switch api where it would emit
`icmp_imm`'s with types that were not able to fully represent the
destination index.
We now reject these inputs. The index val must always have a
type that is capable of addressing the entire range of inputs.
* Add cmake compatibility to c-api
* Add CMake documentation to wasmtime.h
* Add CMake instructions in examples
* Modify CI for CMake support
* Use correct rust in CI
* Trigger build
* Refactor run-examples
* Reintroduce example_to_run in run-examples
* Replace run-examples crate with cmake
* Fix markdown formatting in examples readme
* Fix cmake test quotes
* Build rust wasm before cmake tests
* Pass CTEST_OUTPUT_ON_FAILURE
* Another cmake test
* Handle os differences in cmake test
* Fix bugs in memory and multimemory examples
I noticed that `TableOp::insert` had assertions that `num_params` and
`table_size` were greater than 0, but no assert for `num_globals`. These
asserts couldn't be hit because the `*_RANGE` constants were all set to
a minimum of 1.
But the only reason I can see to prohibit 0-sized tables, locals, or
globals, was because indexes into those spaces were generated with the
`%` operator. Allowing 0-sized spaces requires not generating the
corresponding instructions at all when there are no valid indexes.
So I pushed the final selection of which table/local/global to access
earlier, to the moment when we're picking which TableOps to run. Then,
instead of generating a random u8 or u32 and taking the remainder to get
it into the right range, I can just ask `arbitrary` to generate a number
in the right range to begin with.
So this now explores some size-0 corners that it didn't before, and it
doesn't require reasoning about whether remainder can divide by zero.
Also I think it uses fewer bits of the `Unstructured` input to produce
the same cases, and I hope that lets libFuzzer more quickly find bits it
can mutate to get to novel coverage paths.
On s390x, we do not have a frame pointer that can be used to chain
stack frames for easy unwinding. Instead, our ABI defines a stack
"backchain" mechanism that can be used to the same effect.
This PR uses that backchain mechanism to implement the new
preserve_frame_pointers flags introduced here:
https://github.com/bytecodealliance/wasmtime/pull/4469
This includes some changes from @bnjbvr to the trace-logging/annotation
to reduce overhead when logging is enabled but only non-RA2 subsystems
are at `Trace` level.
* Components: ignore type exports (for now).
This commit updates component translation to ignore type exports for now.
Components generated with `wit-component` contain type exports to give names to
types used within the component's functions based on the component's wit
definition.
The intention is to allow bindings to be generated with meaningful names
directly from a component. In the future, type exports (and imports) may be
used for more than this purpose to support things like resource types.
This commit effectively ignores type exports when translating the component as
they are not useful to executing a component at this time.
Closes#4415.
* Code review feedback.
* fuzzgen: Add float support
Add support for generating floats and some float instructions.
* fuzzgen: Enable NaN Canonicalization
Both IEEE754 and the Wasm spec are somewhat loose about what is allowed
to be returned from NaN producing operations. And in practice this changes
from X86 to Aarch64 and others. Even in the same host machine, the
interpreter may produce a code sequence different from cranelift that
generates different NaN's but produces legal results according to the spec.
These differences cause spurious failures in the fuzzer. To fix this
we enable the NaN Canonicalization pass that replaces any NaN's produced
with a single fixed canonical NaN value.
* fuzzgen: Use `MultiAry` when inserting opcodes
This deduplicates a few inserters!
* Skip new `table_ops` test under emulation
When emulating we already have to disable most pooling-allocator related
tests so this commit carries over that logic to the new fuzz test which
may run some configurations with the pooling allocator depending on the
random input.
* Fix panics in s390x codegen related to aliases
This commit fixes an issue introduced as part of the fix for
GHSA-5fhj-g3p3-pq9g. The `reftyped_vregs` list given to `regalloc2` is
not allowed to have duplicates in it and while the list originally
doesn't have duplicates once aliases are applied the list may have
duplicates. The fix here is to perform another pass to remove duplicates
after the aliases have been processed.
* Improve cranelift disassembly of stack maps
Print out extra information about stack maps such as their contents and
other related metadata available. Additionally also print out addresses
in hex to line up with the disassembly otherwise printed as well.
* Improve the `table_ops` fuzzer
* Generate more instructions by default
* Fix negative indices appearing in `table.{get,set}`
* Assert that the traps generated are expected to prevent accidental
other errors reporting a fuzzing success.
* Fix `reftype_vregs` reported to `regalloc2`
This fixes a mistake in the register allocation of Cranelift functions
where functions using reference-typed arguments incorrectly report which
virtual registers are reference-typed values if there are vreg aliases
in play. The fix here is to apply the vreg aliases to the final list of
reftyped regs which is eventually passed to `regalloc2`.
The main consequence of this fix is that functions which previously
accidentally didn't have correct stack maps should now have the missing
stack maps.
* Add a test that `table_ops` gc's eventually
* Add a comment about new alias resolution
* Update crates/fuzzing/src/oracles.rs
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
* Add some comments
Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>
With branch protections enabled that would otherwise mean that the PR
cannot be landed since CI is now required to run. These date-update PRs
typically come at odd off-hours for Wasmtime anyway so it should be fine
to run CI.
Preserving frame pointers -- even inside leaf functions -- makes it easy to
capture the stack of a running program, without requiring any side tables or
metadata (like `.eh_frame` sections). Many sampling profilers and similar tools
walk frame pointers to capture stacks. Enabling this option will play nice with
those tools.
These were for x86 (32-bit) where the ISA didn't have instructions for these
things, but now that we don't support that, and always have SSE2 for x86_64, we
never need or use these libcalls anymore.
Converted the existing implementations for the following Opcodes to ISLE on AArch64:
- `fadd`
- `fsub`
- `fmul`
- `fdiv`
- `fmin`
- `fmax`
- `fmin_pseudo`
- `fmax_pseudo`
Copyright (c) 2022 Arm Limited
This adds full support for all Cranelift SIMD instructions
to the s390x target. Everything is matched fully via ISLE.
In addition to adding support for many new instructions,
and the lower.isle code to match all SIMD IR patterns,
this patch also adds ABI support for vector types.
In particular, we now need to handle the fact that
vector registers 8 .. 15 are partially callee-saved,
i.e. the high parts of those registers (which correspond
to the old floating-poing registers) are callee-saved,
but the low parts are not. This is the exact same situation
that we already have on AArch64, and so this patch uses the
same solution (the is_included_in_clobbers callback).
The bulk of the changes are platform-specific, but there are
a few exceptions:
- Added ISLE extractors for the Immediate and Constant types,
to enable matching the vconst and swizzle instructions.
- Added a missing accessor for call_conv to ABISig.
- Fixed endian conversion for vector types in data_value.rs
to enable their use in runtests on the big-endian platforms.
- Enabled (nearly) all SIMD runtests on s390x. [ Two test cases
remain disabled due to vector shift count semantics, see below. ]
- Enabled all Wasmtime SIMD tests on s390x.
There are three minor issues, called out via FIXMEs below,
which should be addressed in the future, but should not be
blockers to getting this patch merged. I've opened the
following issues to track them:
- Vector shift count semantics
https://github.com/bytecodealliance/wasmtime/issues/4424
- is_included_in_clobbers vs. link register
https://github.com/bytecodealliance/wasmtime/issues/4425
- gen_constant callback
https://github.com/bytecodealliance/wasmtime/issues/4426
All tests, including all newly enabled SIMD tests, pass
on both z14 and z15 architectures.
* Implement `iabs` in ISLE (AArch64)
Converts the existing implementation of `iabs` for AArch64 into ISLE,
and fixes support for `iabs` on scalar values.
Copyright (c) 2022 Arm Limited.
* Improve scalar `iabs` implementation.
Also introduces `CSNeg` instruction.
Copyright (c) 2022 Arm Limited
* Convert `scalar_to_vector` to ISLE (AArch64)
Converted the exisiting implementation of `scalar_to_vector` for AArch64 to
ISLE.
Copyright (c) 2022 Arm Limited
* Add support for floats and fix FpuExtend
- Added rules to cover `f32 -> f32x4` and `f64 -> f64x2` for
`scalar_to_vector`
- Added tests for `scalar_to_vector` on floats.
- Corrected an invalid instruction emitted by `FpuExtend` on 64-bit
values.
Copyright (c) 2022 Arm Limited
This commit augments the current translation phase of components with
extra machinery to track the type information of component items such as
instances, components, and functions. The end goal of this commit is to
enable the `Lower` instruction to know the type of the component
function being lowered. Currently during the inlining pass where
component fusion is detected the type of the lifted function is known,
but to implement fusion entirely the type of the lowered function must
be known. Note that these two types are expected to be different to
allow for the subtyping rules specified by the component model.
For now nothing is actually done with this information other than noting
its presence in the face of a lifted-then-lowered function. My hope
though was to split this out for a separate review to avoid making a
future component-adapter-compiler-containing-PR too large.
This moves them into a new `wasmtime-asm-macros` crate that can be used not just
from the `wasmtime-fibers` crate but also from other crates (e.g. we will need
them in https://github.com/bytecodealliance/wasmtime/pull/4431).
This commit fixes an issue with the initialization of element segments
when one of the elements in the element segment is `ref.func null`.
Previously the contents of a table were accidentally initialized with
the raw value of the `*mut VMCallerCheckedAnyfunc` which bypassed the
"this is initialized" encoding of function table entries that Wasmtime
uses for lazy table initialization. The fix here was to ensure that the
encoded form is used.
The impact of this issue is that a module could panic at runtime when
accessing a table element that was initialized with an element segment
containing a `ref.null func` entry. This only happens with imported
tables in a WebAssembly module where the table itself was defined on the
host. If the table was defined in another wasm module or in the local
wasm module this bug would not occur. Additionally this bug requires
enabling the reference types proposal for WebAssembly (which is enabled
by default) due to the usage of encodings for null funcrefs in element
segments.
When parsing isa specific values we were accidentally discarding the
value of the flag, and treating it always as a boolean flag.
This would cause a `clif-util` invocation such as
`cargo run -- compile -D --set has_sse41=false --target x86_64 test.clif`
to be interpreted as `--set has_sse41` and enable that feature instead
of disabling it.
Rather than sometimes using `file-per-thread-logger`.
Also remove the debug CLI flags, so that we can always just define
`RUST_LOG=...` to get logging and don't need to also do other things.