* wasi-common: need FileEntry to track the mode with which a file was opened
* add a test for read, write access modes in path_open
* a directory needs to report full set of rights in fdstat
implementations such as wasi-libc use these as a mask for any rights
they request for a new fd when invoking path_open
* preview2: mask file perms. and we really need to enforce them on stream creation
which will mean editing the wit.
* filesystem.wit: make {read, write, append}-via-stream fallible with an error-code
in order to fail appropriately when a file is not open for reading or
writing.
* filesystem.wit: update comments
* preview2 impls: fix error handling for {read,write,append}_via_stream
* wasi-common: normalize wrong access mode error on windows to badf
* remove kubkob from labeler
I appreciate his work on this stuff in the past but he hasnt been around
for a few years now
* preview 2 filesystem: allow stat on any opened fd
since file perms is now more accurately the file's access mode, a file
open for writing was having problems here.
in reality i dont think it makes sense to restrict this based on the
perms. if a file is opened, you should be able to stat it.
this fixes the host adapter's path_link test, which was broken by prior
changes. additionally, this fix lets the path_open_dirfd_not_dir test
pass.
* fix the directory base & inheriting rights
in order to work with wasi-testsuite, it needs to be possible to
path_open(dirfd, ".", ...) with the same rights reported in the
fdstat of that dirfd. When we report the Rights::all() set, both
FD_READ and FD_WRITE are set in the base rights, which results in
unix rejecting an openat2(dirfd, ".", O_RDWR) with EISDIR.
By not having the FD_READ and FD_WRITE rights present in the base
rights, the open syscall defaults to O_RDONLY, which is the only
access mode allowed for opening directories.
* path_open of a directory with read and write succeeds on windows
* Update Wasmtime for upcoming WIT changes
This PR integrates bytecodealliance/wasm-tools#1027 into Wasmtime. The
main changes here are:
* WIT syntax is updated with WebAssembly/component-model#193
* Generated bindings in the `bindgen!` macro have been updated to
reflect the new structure of WIT.
* The accepted component model binary format has been updated to account
for changes.
This PR disables wasi-http tests and the on-by-default feature because
the WIT syntax has been updated but the submodule containing the WITs
has not been updated yet so there's no way to get that building
temporarily. Once that's updated then this can be reenabled.
* Update wasmtime-wasi crate with new WIT
* Add wit-bindgen override for the updated version
* Officially disable wasi-http tests/building
* Move test-reactor WIT into the main WIT files
Don't store duplicates with the rest of the WASI WIT files we have.
* Remove adapter's copy of WIT files
* Disable default features for wit-bindgen
* Plumb disabling wasi-http tests a bit more
* Fix reactor tests and adapter build
* Remove no-longer-needed feature
* Update adapter verification script
* Back out some wasi-http hacks
* Update vet and some dependency sources
* Move where wit-bindgen comes from
Make it a more "official" location which is also less likely to be
accidentally deleted in the future.
* Don't document wasi-http-tests
* Add adapter build to dependency for the entire CI build
* Build adapter on PR CI when it changes
* Update adapter verification for new stdin/stdout/stderr
* Update reactor build script
* Reduce some duplication with some shared variables
* Rename adapters to `wasi_snapshot_preview1.{reactor,command}.wasm`
* Fix upload pattern
* winch(fuzz): Refactor Winch's fuzzing
This change is a follow-up to the discussion in
https://github.com/bytecodealliance/wasmtime/pull/6281.
The most notable characteristic of this change is that it enables
`winch` by default in the fuzzers. If compilation time is a big enough
concern I can add the cargo feature back. I opted to enable `winch` by
default for several reasons:
* It substantially reduces the `cfg` complexity -- at first I thought
I had covered all the places in which a `cfg` check would be needed,
but then I realized that I missed the Cranelift specific compiler
flags.
* It's the fastest route to enable winch by default in the fuzzers,
which we want to do eventually -- the only change we'd need at that
point would be to get rid of the winch-specific environment variable.
* We can get rid of the winch-specific checks in CI for fuzzing
* Implement Arbitraty for CompilerStrategy
Unconditionally return `Cranelift` for the `Arbitrary` implementation of
`CompilerStrategy`. This ensures that `Cranelift` is used as the
compiler for all the targets unless explicitly requested otherwise. As
of this change, only the differential target overrides the
`CompilerStrategy`
this is causing a link error because pulldown_cmark is available both in
the rust_wasi_markdown_parser deps directory and, now, in the root of
the project as well.
Currently all the jobs in GitHub Actions are configured to cancel the
entire run if any of them fail. This is similar to `fail-fast: true` but
that can't be used because the jobs aren't all part of the same matrix.
The original goal for this was to save CI time on the merge queue where
if something failed it should get out of the merge queue quickly instead
of waiting for all other jobs to succeed and then report the one quick
failure.
This seems to run into an issue, though, where if a Windows builder
fails then running cancellation will instead mark the Windows builder as
cancelled instead of failed. This can be confusing for test failures
because nothing looks obvious and each log needs to be manually pored
over.
To help address this the commit here disables auto-cancellation of the
"test" matrix of jobs, re-enabling `fail-fast: true` at the same time.
This way test if a test job fails it won't actually cancel everything
else so everything else will run to completion, but given that tests are
typically the longest part of the build that's hopefully ok.
* Make Wasmtime compatible with Stacked Borrows in MIRI
The fact that Wasmtime executes correctly under Tree Borrows but not
Stacked Borrows is a bit suspect and given what I've since learned about
the aliasing models I wanted to give it a stab to get things working
with Stacked Borrows. It turns out that this wasn't all that difficult,
but required two underlying changes:
* First the implementation of `Instance::vmctx` is now specially crafted
in an intentional way to preserve the provenance of the returned
pointer. This way all `&Instance` pointers will return a `VMContext`
pointer with the same provenance and acquiring the pointer won't
accidentally invalidate all prior pointers.
* Second the conversion from `VMContext` to `Instance` has been updated
to work with provenance and such. Previously the conversion looked
like `&mut VMContext -> &mut Instance`, but I think this didn't play
well with MIRI because `&mut VMContext` has no provenance over any
data since it's zero-sized. Instead now the conversion is from `*mut
VMContext` to `&mut Instance` where we know that `*mut VMContext` has
provenance over the entire instance allocation. This shuffled a fair
bit around to handle the new closure-based API to prevent escaping
pointers, but otherwise no major change other than the structure and
the types in play.
This commit additionally picks up a dependency on the `sptr` crate which
is a crate for prototyping strict-provenance APIs in Rust. This is I
believe intended to be upstreamed into Rust one day (it's in the
standard library as a Nightly-only API right now) but in the meantime
this is a stable alternative.
* Clean up manual `unsafe impl Send` impls
This commit adds a new wrapper type `SendSyncPtr<T>` which automatically
impls the `Send` and `Sync` traits based on the `T` type contained.
Otherwise it works similarly to `NonNull<T>`. This helps clean up a
number of manual annotations of `unsafe impl {Send,Sync} for ...`
throughout the runtime.
* Remove pointer-to-integer casts with tables
In an effort to enable MIRI's "strict provenance" mode this commit
removes the integer-to-pointer casts in the runtime `Table`
implementation for Wasmtime. Most of the bits were already there to
track all this, so this commit plumbed around the various pointer types
and with the help of the `sptr` crate preserves the provenance of all
related pointers.
* Remove integer-to-pointer casts in CoW management
The `MemoryImageSlot` type stored a `base: usize` field mostly because I
was too lazy to have a `Send`/`Sync` type as a pointer, so this commit
updates it to use `SendSyncPtr<u8>` and then plumbs the pointer-ness
throughout the implementation. This removes all integer-to-pointer casts
and has pointers stores as actual pointers when they're at rest.
* Remove pointer-to-integer casts in "raw" representations
This commit changes the "raw" representation of `Func` and `ExternRef`
to a `*mut c_void` instead of the previous `usize`. This is done to
satisfy MIRI's requirements with strict provenance, properly marking the
intermediate value as a pointer rather than round-tripping through
integers.
* Minor remaining cleanups
* Switch to Stacked Borrows for MIRI on CI
Additionally enable the strict-provenance features to force warnings
emitted today to become errors.
* Fix a typo
* Replace a negative offset with `sub`
* Comment the sentinel value
* Use NonNull::dangling
In #6338 that PR is bouncing on CI due to Windows running out of disk
space. Locally a `./ci/run-tests.sh` compile produces a ~13G build
directory. Testing on CI it looks like Windows builders have ~13G of
free space on them for GitHub Actions. I think something in that PR has
tipped us just over the edge of requiring too much space, although I'm
not sure exactly what.
To address the issue this commit disables DWARF debug information
entirely on all builders on CI. Not only should this save a sliver of
compile time but this reduces a local build directory to ~4.7G, over a
50% reduction. That should keep us fitting in CI for more time to come
hopefully.
* Run some tests in MIRI on CI
This commit is an implementation of getting at least chunks of Wasmtime
to run in MIRI on CI. The full test suite is not possible to run in MIRI
because MIRI cannot run Cranelift-produced code at runtime (aka it
doesn't support JITs). Running MIRI, however, is still quite valuable if
we can manage it because it would have trivially detected
GHSA-ch89-5g45-qwc7, our most recent security advisory. The goal of this
PR is to select a subset of the test suite to execute in CI under MIRI
and boost our confidence in the copious amount of `unsafe` code in
Wasmtime's runtime.
Under MIRI's default settings, which is to use the [Stacked
Borrows][stacked] model, much of the code in `Instance` and `VMContext`
is considered invalid. Under the optional [Tree Borrows][tree] model,
however, this same code is accepted. After some [extremely helpful
discussion][discuss] on the Rust Zulip my current conclusion is that
what we're doing is not fundamentally un-sound but we need to model it
in a different way. This PR, however, uses the Tree Borrows model for
MIRI to get something onto CI sooner rather than later, and I hope to
follow this up with something that passed Stacked Borrows. Additionally
that'll hopefully make this diff smaller and easier to digest.
Given all that, the end result of this PR is to get 131 separate unit
tests executing on CI. These unit tests largely exercise the embedding
API where wasm function compilation is not involved. Some tests compile
wasm functions but don't run them, but compiling wasm through Cranelift
in MIRI is so slow that it doesn't seem worth it at this time. This does
mean that there's a pretty big hole in MIRI's test coverage, but that's
to be expected as we're a JIT compiler after all.
To get tests working in MIRI this PR uses a number of strategies:
* When platform-specific code is involved there's now `#[cfg(miri)]` for
MIRI's version. For example there's a custom-built "mmap" just for
MIRI now. Many of these are simple noops, some are `unimplemented!()`
as they shouldn't be reached, and some are slightly nontrivial
implementations such as mmaps and trap handling (for native-to-native
function calls).
* Many test modules are simply excluded via `#![cfg(not(miri))]` at the
top of the file. This excludes the entire module's worth of tests from
MIRI. Other modules have `#[cfg_attr(miri, ignore)]` annotations to
ignore tests by default on MIRI. The latter form is used in modules
where some tests work and some don't. This means that all future test
additions will need to be effectively annotated whether they work in
MIRI or not. My hope though is that there's enough precedent in the
test suite of what to do to not cause too much burden.
* A number of locations are fixed with respect to MIRI's analysis. For
example `ComponentInstance`, the component equivalent of
`wasmtime_runtime::Instance`, was actually left out from the fix for
the CVE by accident. MIRI dutifully highlighted the issues here and
I've fixed them locally. Some locations fixed for MIRI are changed to
something that looks similar but is subtly different. For example
retaining items in a `Store<T>` is now done with a Wasmtime-specific
`StoreBox<T>` type. This is because, with MIRI's analyses, moving a
`Box<T>` invalidates all pointers derived from this `Box<T>`. We don't
want these semantics, so we effectively have a custom `Box<T>` to suit
our needs in this regard.
* Some default configuration is different under MIRI. For example most
linear memories are dynamic with no guards and no space reserved for
growth. Settings such as parallel compilation are disabled. These are
applied to make MIRI "work by default" in more places ideally. Some
tests which perform N iterations of something perform fewer iterations
on MIRI to not take quite so long.
This PR is not intended to be a one-and-done-we-never-look-at-it-again
kind of thing. Instead this is intended to lay the groundwork to
continuously run MIRI in CI to catch any soundness issues. This feels,
to me, overdue given the amount of `unsafe` code inside of Wasmtime. My
hope is that over time we can figure out how to run Wasm in MIRI but
that may take quite some time. Regardless this will be adding nontrivial
maintenance work to contributors to Wasmtime. MIRI will be run on CI for
merges, MIRI will have test failures when everything else passes,
MIRI's errors will need to be deciphered by those who have probably
never run MIRI before, things like that. Despite all this to me it seems
worth the cost at this time. Just getting this running caught two
possible soundness bugs in the component implementation that could have
had a real-world impact in the future!
[stacked]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md
[tree]: https://perso.crans.org/vanille/treebor/
[discuss]: https://rust-lang.zulipchat.com/#narrow/stream/269128-miri/topic/Tree.20vs.20Stacked.20Borrows.20.26.20a.20debugging.20question
* Update alignment comment
Previously an `event` filter was applied to lookup the merge queue's
github run ID but this filter doesn't work after #6288. The filter isn't
strictly necessary, though, so remove it.
Use this to prime caches used by PRs to `main` and additionally the
merge queue used to merge into `main`.
While I'm here additionally update the trigger for merge-queue-based PRs
to use `merge_group:` now that it's been fixed.
Closes#6285
It seems that this fell through given that the incremental cache is
behind a cargo feature. I noticed this while building
`cranelift-codegen` via `cargo build --all-features`.
I decided to add a check in CI to hopefully prevent this in the future,
but I'm happy to remove it / update it if there's a better way or another way.
* winch(fuzz): Initial support for differential fuzzing
This commit introduces initial support for differential fuzzing for Winch. In
order to fuzz winch, this change introduces the `winch` cargo feature. When the
`winch` cargo feature is enabled the differential fuzz target uses `wasmi` as
the differential engine and `wasm-smith` and `single-inst` as the module sources.
The intention behind this change is to have a *local* approach for fuzzing and
verifying programs generated by Winch and to have an initial implementation that
will allow us to eventually enable this change by default. Currently it's not
worth it to enable this change by default given all the filtering that needs to
happen to ensure that the generated modules are supported by Winch.
It's worth noting that the Wasm filtering code will be temporary, until Winch
reaches feature parity in terms of Wasm operators.
* Check build targets with the `winch` feature flag
* Rename fuzz target feature to `fuzz-winch`
* cranelift-native: Move riscv to separate module
* cranelift-native: Read /proc/cpuinfo to parse RISC-V extensions
* ci: Add QEMU cpuinfo emulation patch
This patch emulates the /proc/cpuinfo interface for RISC-V. This
allows us to do feature detection for the RISC-V backend.
It has been queued for QEMU 8.1 so we should remove it as soon as
that is available.
* ci: Enable QEMU RISC-V extensions
* cranelift-native: Cleanup ISA string parsing
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
* cranelift-native: Rework `/proc/cpuinfo` parsing
Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
---------
Co-authored-by: Jamey Sharp <jsharp@fastly.com>
In #6089, I accidentally left in a change that pegged the
`install-openvino-action` to a commit instead of the latest released
version. This change uses the latest released version.
(Run all CI actions: prtest:full)
* ci: unpin the wasi-nn tasks from an older Ubuntu
Previously, OpenVINO's lack of APT packages for Ubuntu 22.04 (`jammy`)
prevented us from upgrading the GitHub runner to use `ubuntu-latest`. I
updated the `install-openvino-action` to substitute in the `focal`
packages in this case (this is what the OpenVINO team considers the fix)
so this pin should no longer be necessary. Fixes#5408.
(Run all CI actions: prtest:full)
* vet: audit the openvino version bump
Overall, I'm just trying to make these bits of documentation reflect our
process as it stands today. There are some specific changes I want to
draw attention to though.
Asking new contributors to pick a reviewer is a waste of time for two
reasons: Only people with write access to the repository are allowed to
pick reviewers, and new contributors have no idea who would be a good
reviewer for their PR anyway. So I'm deleting all mention of that. We
now auto-assign reviewers instead.
By the time someone is opening a PR, asking them to open an issue just
makes extra work for everyone. They've already picked an approach
without discussing it; we might as well look at what they did. We may
then have to ask them to take a different approach, but at that point,
asking them to open an issue won't save them any effort.
I removed mention of tests from the pull request template. There are
many things we'd like to see in a PR, and we may have to ask for them
during review if the contributor doesn't follow our development process
documentation. But I think the only crucial information for starting a
review is the two questions I'm leaving in the template: why do you want
this, and where can I find more context?
The code of conduct link still had the branch name as `master`, which is
a hint at how long it's been since anyone reviewed it.
* Integrate experimental HTTP into wasmtime.
* Reset Cargo.lock
* Switch to bail!, plumb options partially.
* Implement timeouts.
* Remove generated files & wasm, add Makefile
* Remove generated code textfile
* Update crates/wasi-http/Cargo.toml
Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>
* Update crates/wasi-http/Cargo.toml
Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>
* Extract streams from request/response.
* Fix read for len < buffer length.
* Formatting.
* types impl: swap todos for traps
* streams_impl: idioms, and swap todos for traps
* component impl: idioms, swap all unwraps for traps, swap all todos for traps
* http impl: idiom
* Remove an unnecessary mut.
* Remove an unsupported function.
* Switch to the tokio runtime for the HTTP request.
* Add a rust example.
* Update to latest wit definition
* Remove example code.
* wip: start writing a http test...
* finish writing the outbound request example
havent executed it yet
* better debug output
* wasi-http: some stubs required for rust rewrite of the example
* add wasi_http tests to test-programs
* CI: run the http tests
* Fix some warnings.
* bump new deps to latest releases (#3)
* Add tests for wasi-http to test-programs (#2)
* wip: start writing a http test...
* finish writing the outbound request example
havent executed it yet
* better debug output
* wasi-http: some stubs required for rust rewrite of the example
* add wasi_http tests to test-programs
* CI: run the http tests
* bump new deps to latest releases
h2 0.3.16
http 0.2.9
mio 0.8.6
openssl 0.10.48
openssl-sys 0.9.83
tokio 1.26.0
---------
Co-authored-by: Brendan Burns <bburns@microsoft.com>
* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs
* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs
* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs
* wasi-http: fix cargo.toml file and publish script to work together (#4)
unfortunately, the publish script doesn't use a proper toml parser (in
order to not have any dependencies), so the whitespace has to be the
trivial expected case.
then, add wasi-http to the list of crates to publish.
* Update crates/test-programs/build.rs
* Switch to rustls
* Cleanups.
* Merge switch to rustls.
* Formatting
* Remove libssl install
* Fix tests.
* Rename wasi-http -> wasmtime-wasi-http
* prtest:full
Conditionalize TLS on riscv64gc.
* prtest:full
Fix formatting, also disable tls on s390x
* prtest:full
Add a path parameter to wit-bindgen, remove symlink.
* prtest:full
Fix tests for places where SSL isn't supported.
* Update crates/wasi-http/Cargo.toml
---------
Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>
Co-authored-by: Pat Hickey <phickey@fastly.com>
Co-authored-by: Pat Hickey <pat@moreproductive.org>
* Adding in trampoline compiling method for ISA
* Adding support for indirect call to memory address
* Refactoring frame to externalize defined locals, so it removes WASM depedencies in trampoline case
* Adding initial version of trampoline for testing
* Refactoring trampoline to be re-used by other architectures
* Initial wiring for winch with wasmtime
* Add a Wasmtime CLI option to select `winch`
This is effectively an option to select the `Strategy` enumeration.
* Implement `Compiler::compile_function` for Winch
Hook this into the `TargetIsa::compile_function` hook as well. Currently
this doesn't take into account `Tunables`, but that's left as a TODO for
later.
* Filling out Winch append_code method
* Adding back in changes from previous branch
Most of these are a WIP. It's missing trampolines for x64, but a basic
one exists for aarch64. It's missing the handling of arguments that
exist on the stack.
It currently imports `cranelift_wasm::WasmFuncType` since it's what's
passed to the `Compiler` trait. It's a bit awkward to use in the
`winch_codegen` crate since it mostly operates on `wasmparser` types.
I've had to hack in a conversion to get things working. Long term, I'm
not sure it's wise to rely on this type but it seems like it's easier on
the Cranelift side when creating the stub IR.
* Small API changes to make integration easier
* Adding in new FuncEnv, only a stub for now
* Removing unneeded parts of the old PoC, and refactoring trampoline code
* Moving FuncEnv into a separate file
* More comments for trampolines
* Adding in winch integration tests for first pass
* Using new addressing method to fix stack pointer error
* Adding test for stack arguments
* Only run tests on x86 for now, it's more complete for winch
* Add in missing documentation after rebase
* Updating based on feedback in draft PR
* Fixing formatting on doc comment for argv register
* Running formatting
* Lock updates, and turning on winch feature flags during tests
* Updating configuration with comments to no longer gate Strategy enum
* Using the winch-environ FuncEnv, but it required changing the sig
* Proper comment formatting
* Removing wasmtime-winch from dev-dependencies, adding the winch feature makes this not necessary
* Update doc attr to include winch check
* Adding winch feature to doc generation, which seems to fix the feature error in CI
* Add the `component-model` feature to the cargo doc invocation in CI
To match the metadata used by the docs.rs invocation when building docs.
* Add a comment clarifying the usage of `component-model` for docs.rs
* Correctly order wasmtime-winch and winch-environ in the publish script
* Ensure x86 test dependencies are included in cfg(target_arch)
* Further constrain Winch tests to x86_64 _and_ unix
---------
Co-authored-by: Alex Crichton <alex@alexcrichton.com>
Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com>
There was a missing rest field access. In addition createTag doesn't actually create a tag, it creates a tag object. A tag object is an object which references a commit or other kind of object and has various kinds of metadata. You need to store it in a reference stored in refs/tags/ to actually show as tag in the git ui. The code to update the tag however creates a lightweight tag (which is a file in refs/tags/ which directly references a commit rather than a tag object) as such do the same when creating the initial dev tag by using createRef with a commit id as object sha.
See also https://git-scm.com/book/en/v2/Git-Internals-Git-References for the difference between a lightweight tag and an annotated tag.
* Speed up index fetches on CI
Use the `sparse` protocol from Rust 1.68.0 which should shave a minute
or two off most steps on CI.
* Update nightly toolchains in CI
prtest:full
* Fix date
Aside from a few new features (notably automatic registry suggestions), this
release removes the need to import description for criteria that are not
directly used, and adds an explicit version to the cargo-vet instance.
Noted in #5954 this'll report `cargo vet` status checks on PRs that
modify the `supply-chain` directory in addition to `Cargo.lock`
modifications that already happen.
I saw some PRs fail this step earlier today due to rate limits but it
ended up not failing the entire PR's CI due to it not being listed in
the final set of dependencies, so add it there.
* Don't run LLDB tests on PRs
These take an extra minute or so, so only run them on the full test
suite of a merge instead of on all PRs as well.
* Add a test for the x64 isa files
This guarantees that if cranelift's x64 backend is modified that the
tests will be run on a PR, even if other backends were also modified.
These mostly only validate changes to `Cargo.lock` so skip these checks
by default on PRs which generally never need to trigger them. If
`Cargo.lock` changes, however, then run them for PRs.
GitHub recently made its merge queue feature available for use in public
repositories owned by organizations meaning that the Wasmtime repository
is a candidate for using this. GitHub's Merge Queue feature is a system
that's similar to Rust's bors integration where PRs are tested before
merging and only passing PRs are merged. This implements the "not rocket
science" rule where the `main` branch of Wasmtime, for example, is
always tested and passes CI. This is in contrast to our current
implementation of CI where PRs are merged when they pass their own CI,
but the code that was tested is not guaranteed to be the state of `main`
when the PR is merged, meaning that we're at risk now of a failing
`main` branch despite all merged PRs being green. While this has
happened with Wasmtime this is not a common occurrence, however.
The main motivation, instead, to use GitHub's Merge Queue feature is
that it will enable Wasmtime to greatly reduce the amount of CI running
on PRs themselves. Currently the full test suite runs on every push to
every PR, meaning that our workers on GitHub Actions are frequently
clogged throughout weekdays and PRs can take quite some time to come
back with a successful run. Through the use of a Merge Queue, however,
we're able to configure only a small handful of checks to run on PRs
while deferring the main body of checks to happening on the
merge-via-the-queue itself. This is hoped to free up capacity on CI and
overall improve CI times for Wasmtime and Cranelift developers.
The implementation of all of this required quite a lot of plumbing and
retooling of our CI. I've been testing this in an [external
repository][testrepo] and I think everything is working now. A list of
changes made in this PR are:
* The `build.yml` workflow is merged back into the `main.yml` workflow
as the original reason to split it out is not longer applicable (it'll
run on all merges). This was also done to fit in the dependency graph
of jobs of one workflow.
* Publication of the `gh-pages` branch, the `dev` tag artifacts, and
release artifacts have been moved to a separate
`publish-artifacts.yml` workflow. This workflow runs on all pushes to
`main` and all tags. This workflow no longer actually preforms any
builds, however, and relies on a merge queue or similar being used for
branches/tags where artifacts are downloaded from the workflow run to
be uploaded. For pushes to `main` this works because a merge queue is
run meaning that by the time the push happens all artifacts are ready.
For release branches this is handled by..
* The `push-tag.yml` workflow is subsumed by the `main.yml` workflow. CI
for a tag being pushed will upload artifacts to a release in GitHub,
meaning that all builds must finish first for the commit. The
`main.yml` workflow at the end now scans commits for the preexisting
magical marker and pushes a tag if necessary.
* CI is currently a flat list of "run all these jobs" and this is now
rearchitected to a "fan out" approach where some jobs run to determine
the next jobs to run which then get "joined" into a finish step. The
purpose for this is somewhat nuanced and this has implications for CI
runtime as well. The Merge Queue feature requires branches to be
protected with "these checks must pass" and then the same checks are
gates both to enter the merge queue as well as pass the merge queue.
The saving grace, however, is that a "skipped" check counts as
passing, meaning checks can be skipped on PRs but run to completion on
the merge queue. A problem with this though is the build matrix used
for tests where PRs want to only run one element of the build matrix
ideally but there's no means on GitHub Actions right now for the
skipped entries to show up as skipped easily (or not that I know of).
This means that the "join" step serves the purpose of being the single
gate for both PR and merge queue CI and there's just more inputs to it
for merge queue CI. The major consequence of this decision is that
GitHub's actions scheduling doesn't work out well here. Jobs are
scheduled in a FIFO order meaning that the job for "ok complete the CI
run" is queued up after everything else has completed, possibly
after lots of other CI requests in the middle for other PRs. The hope
here is that by using a merge queue we can keep CI relatively under
control and this won't affect merge times too much.
* All jobs in the `main.yml` workflow will not automatically cancel the
entire run if they fail. Previously this fail-fast behavior was only
part of the matrix runs (and just for that matrix), but this is
required to make the merge queue expedient. The gate of the merge
queue is the final "join" step which is only executed once all
dependencies have finished. This means, for example, that if rustfmt
fails quickly then the tests which take longer might run for quite
awhile before the join step reports failure, meaning that the PR sits
in the queue for longer than needed being tested when we know it's
already going to fail. By having all jobs cancel the run this means
that failures immediately bail out and mark the whole job as
cancelled.
* A new "determine" CI job was added to determine what CI actually needs
to run. This is a "choke point" which is scheduled at the start of CI
that quickly figures out what else needs to be run. This notably
indicates whether large swaths of ci (the `run-full` flag) like the
build matrix are executed. Additionally this dynamically calculates a
matrix of tests to run based on a new `./ci/build-test-matrix.js`
script. Various inputs are considered for this such as:
1. All pushes, meaning merge queue branches or release-branch merges,
will run full CI.
2. PRs to release branches will run full CI.
3. PRs to `main`, the most common, determine what to run based on
what's modified and what's in the commit message.
Some examples for (3) above are if modifications are made to
`cranelift/codegen/src/isa/*` then that corresponding builder is
executed on CI. If the `crates/c-api` directory is modified then the
CMake-based tests are run on PRs but are otherwise skipped.
Annotations in commit messages such as `prtest:*` can be used to
explicitly request testing.
Before this PR merges to `main` would perform two full runs of CI: one
on the PR itself and one on the merge to `main`. Note that the one as a
merge to `main` was quite frequently cancelled due to a merge happening
later. Additionally before this PR there was always the risk of a bad
merge where what was merged ended up creating a `main` that failed CI to
to a non-code-related merge conflict.
After this PR merges to `main` will perform one full run of CI, the one
as part of the merge queue. PRs themselves will perform one test job
most of the time otherwise. The `main` branch is additionally always
guaranteed to pass tests via the merge queue feature.
For release branches, before this PR merges would perform two full
builds - one for the PR and one for the merge. A third build was then
required for the release tag itself. This is now cut down to two full
builds, one for the PR and one for the merge. The reason for this is
that the merge queue feature currently can't be used for our
wildcard-based `release-*` branch protections. It is now possible,
however, to turn on required CI checks for the `release-*` branch PRs so
we can at least have a "hit the button and forget" strategy for merging
PRs now.
Note that this change to CI is not without its risks. The Merge Queue
feature is still in beta and is quite new for GitHub. One bug that
Trevor and I uncovered is that if a PR is being tested in the merge
queue and a contributor pushes to their PR then the PR isn't removed
from the merge queue but is instead merged when CI is successful, losing
the changes that the contributor pushed (what's merged is what was
tested). We suspect that GitHub will fix this, however.
Additionally though there's the risk that this may increase merge time
for PRs to Wasmtime in practice. The Merge Queue feature has the ability
to "batch" PRs together for a merge but this is only done if concurrent
builds are allowed. This means that if 5 PRs are batched together then 5
separate merges would be created for the stack of 5 PRs. If the CI for
all 5 merged together passes then everything is merged, otherwise a PR
is kicked out. We can't easily do this, however, since a major purpose
for the merge queue for us would be to cut down on usage of CI builders
meaning the max concurrency would be set to 1 meaning that only one PR
at a time will be merged. This means PRs may sit in the queue for awhile
since previously many `main`-based builds are cancelled due to
subsequent merges of other PRs, but now they must all run to 100%
completion.
[testrepo]: https://github.com/bytecodealliance/wasmtime-merge-queue-testing
* Remove trailing whitespace in `lower.isle` files
* Legalize the `band_not` instruction into simpler form
This commit legalizes the `band_not` instruction into `band`-of-`bnot`,
or two instructions. This is intended to assist with egraph-based
optimizations where the `band_not` instruction doesn't have to be
specifically included in other bit-operation-patterns.
Lowerings of the `band_not` instruction have been moved to a
specialization of the `band` instruction.
* Legalize `bor_not` into components
Same as prior commit, but for the `bor_not` instruction.
* Legalize bxor_not into bxor-of-bnot
Same as prior commits. I think this also ended up fixing a bug in the
s390x backend where `bxor_not x y` was actually translated as `bnot
(bxor x y)` by accident given the test update changes.
* Simplify not-fused operands for riscv64
Looks like some delegated-to rules have special-cases for "if this
feature is enabled use the fused instruction" so move the clause for
testing the feature up to the lowering phase to help trigger other rules
if the feature isn't enabled. This should make the riscv64 backend more
consistent with how other backends are implemented.
* Remove B{and,or,xor}Not from cost of egraph metrics
These shouldn't ever reach egraphs now that they're legalized away.
* Add an egraph optimization for `x^-1 => ~x`
This adds a simplification node to translate xor-against-minus-1 to a
`bnot` instruction. This helps trigger various other optimizations in
the egraph implementation and also various backend lowering rules for
instructions. This is chiefly useful as wasm doesn't have a `bnot`
equivalent, so it's encoded as `x^-1`.
* Add a wasm test for end-to-end bitwise lowerings
Test that end-to-end various optimizations are being applied for input
wasm modules.
* Specifically don't self-update rustup on CI
I forget why this was here originally, but this is failing on Windows
CI. In general there's no need to update rustup, so leave it as-is.
* Cleanup some aarch64 lowering rules
Previously a 32/64 split was necessary due to the `ALUOp` being different
but that's been refactored away no so there's no longer any need for
duplicate rules.
* Narrow a x64 lowering rule
This previously made more sense when it was `band_not` and rarely used,
but be more specific in the type-filter on this rule that it's only
applicable to SIMD types with lanes.
* Simplify xor-against-minus-1 rule
No need to have the commutative version since constants are already
shuffled right for egraphs
* Optimize band-of-bnot when bnot is on the left
Use some more rules in the egraph algebraic optimizations to
canonicalize band/bor/bxor with a `bnot` operand to put the operand on
the right. That way the lowerings in the backends only have to list the
rule once, with the operand on the right, to optimize both styles of
input.
* Add commutative lowering rules
* Update cranelift/codegen/src/isa/x64/lower.isle
Co-authored-by: Jamey Sharp <jamey@minilop.net>
---------
Co-authored-by: Jamey Sharp <jamey@minilop.net>
This fixes the build issue identified in #5664 at the toolchain level
rather than working around it in our own build. The next step in fixing
this will be to remove the nightly override in the future when the
toolchain becomes stable.