cranelift

Author	SHA1	Message	Date
wasmtime-publish	a528e0383e	Release Wasmtime 2.0.2 (#5218 ) * Release Wasmtime 2.0.2 [automatically-tag-and-release-this-commit] * Add release notes for 2.0.2 * Fix an invalid zero-sized call to mmap * Fix windows tests Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com> Co-authored-by: Alex Crichton <alex@alexcrichton.com>	2 years ago
Alex Crichton	087d9d7bec	Merge pull request from GHSA-h84q-m8rr-3v9q The Rust definition was previously performing a 4-byte write when the C API was declared as taking a 1-byte buffer.	2 years ago
Alex Crichton	e60c374290	Merge pull request from GHSA-44mr-8vmm-wjhg This ensures that memories, even with zero contents, still have the necessary virtual mappings as required by the code generator to report out-of-bounds reads/writes.	2 years ago
Alex Crichton	2614f2e9d2	Merge pull request from GHSA-wh6w-3828-g9qf This is a minimal fix for the release branch to fix the issue of having a memory slot get reused between a module with an image and one without.	2 years ago
Alex Crichton	96ae44ac8f	[2.0.0] Fix CI builds (#5219 ) * Fix instruction size test for Rust 1.65.0 (#5188) Looks like Rust generously shrank our `enum` in 1.65.0, so update the test assertion to pass CI. * Use an alternate doxygen download link (#5150) * Use an alternate doxygen download link Looks like doxygen.nl is down otherwise. * Update link	2 years ago
Alex Crichton	516b9592d7	Fix the `push-tag` workflow again (#5144 ) [automatically-tag-and-release-this-commit]	2 years ago
wasmtime-publish	817e08dc5b	Release Wasmtime 2.0.1 (#5138 ) [automatically-tag-and-release-this-commit] Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2 years ago
Alex Crichton	2ebb8a53f0	Prepare a 2.0.1 release of Wasmtime (#5136 ) * Fix push tag workflow (#5082) This commit fixes the `push-tag.yml` workflow to work with the new `Cargo.toml` manifest since workspace inheritance was added. This additionally fixes some warnings coming up on CI about our usage of deprecated features on github actions. * Reduce warnings on CI from GitHub Actions (#5083) * Upgrade our github actions to "node16" Each github actions run has a lot of warnings about using node12 so this upgrades our repository to using node16. I'm hoping no other changes are needed and I suspect other actions we're using are on node12 and will need further updates, but this should help pin down what's remaining. * Update `actions/checkout` workflow to `v3` * Update to `actions/cache@v3` * Update to `actions/upload-artifact@v3` * Drop usage of `actions-rs/toolchain` * Update to `actions/setup-python@v4` * Update mdbook version * Add `package-lock.json` for `github-release` action (#5091) A local github action we have has been broken for about a month now meaning that the `dev` tag isn't getting updated or getting new releases. This appears to be due to the publication of new versions of these dependencies which are running into issues using one another. I think I've figured out versions that work and have added a `package-lock.json` to ensure we keep using the same versions. * More fixes for publish action (#5110) Looks like #5091 wasn't enough and some of the APIs needed updating with changes made in the meantime. I've updated the action here and additionally made a separate change where the release isn't continually created and deleted but instead left alone and only the tag is updated. This should work for the `dev` release and avoids deleting/recreating on each PR, sending out notifications for new releases. * Add missing `Win32_Foundation` feature This is necessary for the `wasmtime-runtime` crate to compile on Windows. 
* Add a note for the 2.0.1 release * Remove rayon dependency of cranelift-isle (#5101) Using rayon adds a lot of dependencies to Cranelift. The total unparallelized time the code that uses rayon takes is less than half a second and it runs at compile time, so there is pretty much no benefit to parallelizing it. * Add a note about rayon removal Co-authored-by: Christopher Serr <christopher.serr@gmail.com> Co-authored-by: bjorn3 <17426603+bjorn3@users.noreply.github.com>	2 years ago
wasmtime-publish	ff8c568eee	Release Wasmtime 2.0.0 (#5081 ) [automatically-tag-and-release-this-commit] Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>	2 years ago
Alex Crichton	0da585ac4f	Add release notes for 2.0.0 (#5048 ) (#5069 ) * Add release notes for 2.0.0 * Review comments	2 years ago
Chris Fallin	24eca3bc2a	Cherry-pick to 2.0 release branch: "Fix StructReturn handling: properly mark the clobber, and offset actual rets. (#5023 )" (#5029 ) * Fix StructReturn handling: properly mark the clobber, and offset actual rets. (#5023) * Fix StructReturn handling: properly mark the clobber, and offset actual rets. The legalization of `StructReturn` was causing issues in the new call-handling code: the `StructReturn` ret was included in the `SigData` as if it were an actual CLIF-level return value, but it is not. Prior to using regalloc constraints for return values, we unconditionally included rax (or the architecture's usual return register) as a def, so it would be properly handled as "clobbered" by the regalloc. With the new scheme, we include defs on the call only for CLIF-level outputs. Callees with `StructReturn` args were thus not known to clobber the return-value register, and values might be corrupted. This PR updates the code to include a `StructReturn` ret as a clobber rather than a returned value in the relevant spots. I observed it causing saves/restores of rax in some CLIF that @bjorn3 provided me, but I was having difficulty minimizing this into a test-case that I would be comfortable including as a precise-output case (including the whole thing verbatim would lock down a bunch of other irrelevant details and cause test-update noise later). If we can find a more minimized example I'm happy to include it as a filetest. Fixes #5018. * Disable wasi-nn CI tests due to breakage (404'ing package repository). In #5023 we are seeing a failing CI job (see [1]); after four attempted restarts, it 404's each time when trying to download OpenVino from the Intel apt mirrors. This PR temporarily removes the wasi-nn CI job from our CI configuration so that we have green CI and can merge other work. [1] https://github.com/bytecodealliance/wasmtime/actions/runs/3200861896/jobs/5228903240	2 years ago
Trevor Elliott	e63771f2d9	More consistent use of `add_inst` (#5012 ) Use the InstId returned by add_inst rather than creating it eagerly, when possible.	2 years ago
Trevor Elliott	a209cb63f5	ISLE: Enable the overlap checker (#5011 ) This PR turns the overlap checker on by default, requiring the use of priorities to resolve overlap between rules.	2 years ago
Alex Crichton	2607590d8c	Update the wasm-tools family of crates (#5010 ) * Update the wasm-tools family of crates Only minor updates here, mostly internal changes and no binary-related changes today. * Fix test expectation	2 years ago
Nathaniel McCallum	d986b3cbc2	feat: improve wasi_common::ErrorKind derives (#5006) Besides deriving the standard traits (Copy, Clone, PartialEq and Eq), we also mark the enum as non-exhaustive so that errors can be added in the future without breaking the API. Signed-off-by: Nathaniel McCallum <nathaniel@profian.com>	2 years ago
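The derive-plus-`#[non_exhaustive]` pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual `wasi_common` definition: the variant names and the `describe` helper are hypothetical, and `Debug` is derived here only so the sketch is easy to test.

```rust
/// Hypothetical stand-in for `wasi_common::ErrorKind`.
/// The variant names are illustrative only and not taken from the crate.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[non_exhaustive]
pub enum ErrorKind {
    NotFound,
    PermissionDenied,
}

/// Hypothetical helper showing a match over the enum.
///
/// Inside the defining crate the match may be exhaustive. Code in *other*
/// crates must add a wildcard arm (`_ => ...`) because of `#[non_exhaustive]`,
/// which is what lets new variants be added without a breaking change.
pub fn describe(kind: ErrorKind) -> &'static str {
    match kind {
        ErrorKind::NotFound => "not found",
        ErrorKind::PermissionDenied => "permission denied",
    }
}
```

Because the enum is `Copy`, a value can be passed to `describe` and still be compared afterwards without cloning.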
Alex Crichton	b07e619055	Drop a few crates from our dependency graph (#5009 ) A minor update of a few other crates drops `semver` and `rustc_version` from `Cargo.lock`. I've audited the deltas in versions for the other crates here as well and they all look good.	2 years ago
yuyang	07584f6ac8	fix issue 4996. (#5003 )	2 years ago
Johnnie Birch	f2105f95e9	Update perf action to work for wasmtime forks (#5007 )	2 years ago
Trevor Elliott	c9ff14e00b	Resolve overlap in the s390x backend (#5002 ) Resolve overlap in the s390x backend by adding rule priorities to disambiguate rule order.	2 years ago
Jamey Sharp	d35c508436	cranelift-frontend: Replace Vecs with ListPools (#5001 ) * Elide redundant sentinel values The `undef_variables` lists were a binding from Variable to Value, but the Values were always equal to a suffix of the block's parameters. So instead of storing another copy, we can just get them back from the block parameters. According to DHAT, this decreases total memory allocated and number of bytes written, and increases number of bytes read and instructions retired, but all by small fractions of a percent. According to hyperfine, main is "1.00 ± 0.01 times faster". * Use entity_impl for cranelift_frontend::Variable Instead of hand-coding essentially the same thing. * Keep undefined variables in a ListPool According to DHAT, this improves every measure of performance (instructions retired, total memory allocated, max heap size, bytes read, and bytes written), although by fractions of a percent. According to hyperfine the difference is nearly zero, but on Spidermonkey this branch is "1.01 ± 0.00 times faster" than main. * Elide redundant block IDs In a list of predecessors, we previously kept both the jump instruction that points to the current block, and the block where that instruction resides. But we can look up the block from the instruction as long as we have access to the current Layout, which we do everywhere that it was necessary. So don't store the block, just store the instruction. * Keep predecessor definitions in a ListPool * Make append_jump_argument independent of self This makes it easier to reason about borrow-checking issues. * Reuse `results` instead of re-doing variable lookup This eliminates three array lookups per predecessor by hanging on to the results of earlier steps a little longer. This only works now because I previously removed the need to borrow all of `self`, which otherwise prevented keeping a borrow of self.results alive. 
I had experimented with using `Vec::split_off` to copy the relevant chunk of results to a temporary heap allocation, but the extra allocation and copy was measurably slower. So it's important that this is just a borrow. * Cache single-predecessor block ID when sealing Of the code in cranelift_frontend, `use_var` is the second-hottest path, sitting close behind the `build` function that's used when inserting every new instruction. This makes sense given that the operands of a new instruction usually need to be looked up immediately before building the instruction. So making the single-predecessor loops in `find_var` and `use_var_local` do fewer memory accesses and execute fewer instructions turns out to have a measurable effect. It's still only a small fraction of a percent overall since cranelift-frontend is only a few percent of total runtime. This patch keeps a block ID in the SSABlockData, which is None unless both the block is sealed and it has exactly one predecessor. Doing so avoids two array lookups on each iteration of the two loops. According to DHAT, compared with main, at this point this PR uses 0.3% less memory at max heap, reads 0.6% fewer bytes, and writes 0.2% fewer bytes. According to Hyperfine, this PR is "1.01 ± 0.01 times faster" than main when compiling Spidermonkey. On the other hand, Sightglass says main is 1.01x faster than this PR on the same benchmark by CPU cycles. In short, actual effects are too small to measure reliably.	2 years ago
bjorn3	f1fce6c60d	Support writing riscv64 object files (#4995 )	2 years ago
Alex Crichton	7bab5c1b28	Consolidate module definition in `wasmtime-jit` (#5000 ) Minor thing I noticed from #4990 but I stylistically prefer to keep the `mod foo;` definitions canonicalized to one location to emphasize how multiple targets can use the same definition.	2 years ago
Yuyi Wang	6bcc430855	Initial work to build for Windows ARM64 (#4990 ) * Make wasmtime build for windows-aarch64 * Add check for win arm64 build. * Fix checks for winarm64 key in workflows. * Add target in windows arm64 build. * Add tracking issue for Windows ARM64 trap handling	2 years ago
Jamey Sharp	3fa545bd89	Cleanup cranelift-frontend (#4989 ) * cranelift-wasm: Assume block is reachable In handling the WebAssembly "end" operator, cranelift-wasm had logic to skip generating a jump instruction if the block was both unreachable and "pristine", meaning no instructions had been added. However, `translate_operator` checks first that `state.reachable` is true, so this logic only runs when cranelift-wasm believes that the current block _is_ reachable. Therefore the condition should always be true, whether the block is pristine or not. I've left a debug_assert in case `state.reachable` ever doesn't agree with `builder.is_unreachable()`, but the assert doesn't fail in any of the tests. We'll see if fuzzing finds something. Anyway, outside of cranelift-frontend, this eliminates the only use of `is_pristine()`, and there were no uses of `is_filled()`. So I've made both of those private. They're now only used in a nearby debug assert. * cranelift-frontend: Clarify pristine/filled states There was a comment here saying "A filled block cannot be pristine." Given that the intent was for those two states to be mutually exclusive, I've replaced the two booleans with a three-state enum. I also replaced all reads of these two flags with method calls. In all but one case these are only checked in debug assertions, so I don't even care whether they get inlined. They're easier to read, and this will make it easier to replace their implementations, which I hope to do soon. Finally, I replaced all assignments to either flag with an appropriate assignment of the corresponding enum state. Keep in mind this correspondence between the new enum and the old flags: - Empty: pristine true, filled false - Partial: pristine false, filled false - Filled: pristine false, filled true Every existing update to these flags could only move to a later state. (For example, Partial couldn't go back to Empty.) 
In the old flags that meant that pristine could only go from true to false, and filled could only go from false to true. `fill_current_block` was a weird case because at first glance it looks like it could allow both pristine and filled to be true at the same time. However, it's only called from `FuncInstBuilder::build`, which calls `ensure_inserted_block` before doing anything else, and _that_ cleared the pristine flag. Similarly, `handle_ssa_side_effects` looks like it could allow both pristine and filled to be true for anything in `split_blocks_created`. However, those blocks are created by SSABuilder, so their BlockData is not initialized by `create_block`, and instead uses BlockData::default. The `Default` implementation here previously set both flags false, while `create_block` would instead set pristine to true. So these split blocks were correctly set to the Filled state, and after this patch they are still set correctly. * cranelift-frontend: Separate SSA and user block params Previously there was a `user_param_count` field in BlockData, used purely to debug-assert that no user parameters are added to a block after `use_var` adds SSA parameters. Instead, this patch enforces a strict phase separation between the period after a block is created when user parameters can be added to it, and the period when `use_var` may be called and instructions may be added. I'm assuming that calls to `use_var` are _always_ followed by inserting one or more instructions into the block. (If you don't want to insert an instruction, why do you need to know where instructions in this block would get variable definitions from?) This patch has no visible effect for callers which follow that rule. However, it was previously legal to call `use_var`, then append a block parameter before adding instructions, so long as `use_var` didn't actually need to add a block parameter. That could only happen if the current block is sealed and has exactly one predecessor. 
So anyone who was counting on this behavior was playing a dangerous game anyway. * cranelift-frontend: Defer initializing block data Every reference to the func_ctx.status SecondaryMap will automatically create the appropriate entries on-demand, with the sole exception of `finalize`. In that function, debug assertions use SecondaryMap::keys to find out which blocks need to be checked. However, those assertions always succeed for blocks which never had any instructions added. So it's okay to skip them for blocks which aren't touched after `create_block`.	2 years ago
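The three-state enum that replaces the `pristine`/`filled` booleans might look roughly like the sketch below. The names `BlockStatus` and `advance_to` are assumptions made for illustration; only the Empty/Partial/Filled correspondence and the "states only move forward" rule come from the commit message.

```rust
/// Sketch of a three-state replacement for two mutually exclusive booleans.
/// Declaration order matters: the derived `Ord` gives Empty < Partial < Filled.
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum BlockStatus {
    /// Old flags: pristine = true, filled = false.
    Empty,
    /// Old flags: pristine = false, filled = false.
    Partial,
    /// Old flags: pristine = false, filled = true.
    Filled,
}

impl BlockStatus {
    /// Equivalent of reading the old `pristine` flag.
    pub fn is_pristine(self) -> bool {
        self == BlockStatus::Empty
    }

    /// Equivalent of reading the old `filled` flag.
    pub fn is_filled(self) -> bool {
        self == BlockStatus::Filled
    }

    /// Hypothetical helper: move to a later (or equal) state.
    /// A block never goes back to an earlier state.
    pub fn advance_to(&mut self, next: BlockStatus) {
        debug_assert!(next >= *self, "block status may only move forward");
        *self = next;
    }
}
```

With this encoding the impossible combination (pristine and filled both true) simply cannot be represented.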
Trevor Elliott	ab4be2bdd1	ISLE: Resolve overlaps in the aarch64 backend (#4988 )	2 years ago
Trevor Elliott	c1d6ca48a7	ISLE: Resolve overlap in the riscv64 backend (#4982 ) Resolve overlap in the RiscV64 backend by adding priorities to rules. Additionally, one test updated as a result of this work, as a peephole optimization for addition with immediates fires now.	2 years ago
Jamey Sharp	77ab99d3b0	cranelift-frontend: SSA-building cleanup (#4984 ) * Cleanups to cranelift-frontend SSA construction * Encode sealed/undef_variables relationship in type A block can't have any undef_variables if it is sealed. It's useful to make that fact explicit in the types so that any time either value is used, it's clear that we should think about the other one too. In addition, encoding this fact in an enum type lets Rust apply an optimization that reduces the size of SSABlockData by 8 bytes, making it fit in a 64-byte cache line. I haven't taken the extra step of making SSABlockData be 64-byte aligned because 1) it doesn't seem to have a performance impact and b) doing so makes other structures quite a bit bigger. * Simplify finish_predecessors_lookup Using Vec::drain is more concise than a combination of iter().rev().take() followed by Vec::truncate. And in this case it doesn't matter what order we examine the results in, because we just want to know if they're all equal, so we might as well iterate forward instead of in reverse. There's no need for the ZeroOneOrMore enum. Instead, there are only two cases: either we have a single value to use for the variable (possibly synthesized as a constant zero), or we need to add a block parameter in every predecessor. Pre-filtering the results iterator to eliminate the sentinel makes it easy to identify how many distinct definitions this variable has. iter.next() indicates if there are any definitions at all, and then iter.all() is a clear way to express that we want to know if the remaining definitions are the same as the first one. * Simplify append_jump_argument * Avoid assigning default() into SecondaryMap This eliminates some redundant reads and writes. * cranelift-frontend: Construct with default() This eliminates a bunch of boilerplate in favor of a built in `derive` macro. Also I'm deleting an import that had the comment "FIXME: Remove in edition2021", which we've been using everywhere since April. 
* Fix tests	2 years ago
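The `Vec::drain` plus `iter.next()` / `iter.all()` approach described for `finish_predecessors_lookup` can be sketched as below. This is a simplified stand-in, not the cranelift-frontend code: `Value`, `SENTINEL`, `Lookup`, and `classify` are hypothetical names, and the sketch assumes the last `num_predecessors` entries of `results` belong to the current lookup.

```rust
/// Hypothetical value handle; `SENTINEL` means "no definition reached here".
pub type Value = u32;
pub const SENTINEL: Value = u32::MAX;

#[derive(Debug, PartialEq, Eq)]
pub enum Lookup {
    /// Only sentinels were seen; the caller would synthesize a constant zero.
    Undefined,
    /// Every real definition is the same value, so it can be used directly.
    Single(Value),
    /// Definitions differ, so a block parameter is needed.
    NeedsBlockParam,
}

/// Remove the last `num_predecessors` results and classify them.
/// Panics if `results` holds fewer than `num_predecessors` entries.
pub fn classify(results: &mut Vec<Value>, num_predecessors: usize) -> Lookup {
    let start = results.len() - num_predecessors;
    // Drain removes the whole tail even if we stop iterating early.
    let mut defs = results.drain(start..).filter(|&v| v != SENTINEL);
    match defs.next() {
        // No non-sentinel definition at all.
        None => Lookup::Undefined,
        // Are all remaining definitions the same as the first one?
        Some(first) => {
            if defs.all(|v| v == first) {
                Lookup::Single(first)
            } else {
                Lookup::NeedsBlockParam
            }
        }
    }
}
```

Iteration order does not matter here because the only question is whether all definitions are equal, which is why a forward `drain` suffices.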
Trevor Elliott	46e42601eb	ISLE: Always default the priority to 0 (#4983 ) * Always default priorities to 0 in the ISLE IR * Fix a hidden overlap in the x64 backend	2 years ago
Trevor Elliott	7d5b2b5bb2	ISLE: Add the overlap_errors pragma (#4981 ) * Add the overlap_errors pragma to ISLE * Enable overlap errors in the x64 backend	2 years ago
Trevor Elliott	db06e4e622	ISLE: Resolve remaining x64 overlap errors (#4977 ) Resolve overlap errors with the x64 backend.	2 years ago
Trevor Elliott	bf2fa40e4e	Stabilize the error output order (#4976 )	2 years ago
Jamey Sharp	6c8620b688	Bypass state machine for single-predecessor chains (#4955 ) In the common case where there is a chain of sealed blocks that each have exactly one predecessor, we can keep track of any sub-sequence of those blocks in O(1) space. So there's no need to use the state machine stack to propagate variable definitions back along the chain. Instead, we can do one loop to find which block to stop at, then either get the variable definition from that block or introduce a block parameter there, and finally do one more loop to update variable definitions in all the intervening blocks. The existing implementation already had to do a graph traversal to propagate variable definitions correctly, so this doesn't visit any more blocks than before. However, this change also makes it possible to integrate cycle detection with the graph traversal. That eliminates the need for the in_predecessor_cycle flags, and any possibility of spiky performance profiles in maintaining those flags. As far as performance goes, this is all pretty much a wash: Changes to CPU time and CPU cycles are within noise, according to hyperfine and Sightglass/perf. But it's a substantially simpler implementation, with fewer invisible interactions between functions.	2 years ago
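The two-loop idea for single-predecessor chains can be illustrated with the simplified sketch below. It is not the cranelift-frontend implementation: `BlockInfo`, `find_in_chain`, and the `HashMap`-based definition table are assumptions for illustration, and the iteration bound is a crude stand-in for the integrated cycle detection the commit describes.

```rust
use std::collections::HashMap;

pub type Block = usize;
pub type Var = u32;
pub type Value = u32;

/// Hypothetical per-block SSA data.
pub struct BlockInfo {
    /// `Some(pred)` only when the block is sealed and has exactly one predecessor.
    pub single_pred: Option<Block>,
    /// Variable definitions known in this block.
    pub defs: HashMap<Var, Value>,
}

/// Walk back along a chain of single-predecessor blocks starting at `start`.
/// Returns the block where the walk stopped and the definition found there, if any.
/// When nothing is found, the caller would introduce a block parameter at that block.
pub fn find_in_chain(blocks: &mut [BlockInfo], start: Block, var: Var) -> (Block, Option<Value>) {
    // First loop: find which block to stop at. Bounding the walk by the number
    // of blocks guarantees termination even if the chain forms a cycle.
    let mut cur = start;
    let mut found = None;
    for _ in 0..blocks.len() {
        if let Some(&v) = blocks[cur].defs.get(&var) {
            found = Some(v);
            break;
        }
        match blocks[cur].single_pred {
            Some(pred) => cur = pred,
            None => break,
        }
    }
    // Second loop: record the definition in every intervening block.
    if let Some(v) = found {
        let mut b = start;
        while b != cur {
            blocks[b].defs.insert(var, v);
            b = blocks[b].single_pred.expect("chain blocks have one predecessor");
        }
    }
    (cur, found)
}
```

Only two cursors (`start` and `cur`) are needed to describe the sub-sequence of the chain, which is the O(1)-space property the commit message refers to.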
Trevor Elliott	2e954668c7	Sort overlap errors by position (#4975 )	2 years ago
Trevor Elliott	faf31f6216	ISLE: Resolve overlap in prelude.isle and x64/inst.isle (#4941) Resolve overlap in the ISLE prelude and the x64 inst module by introducing new types that allow better sharing of extractor results, or falling back on priorities.	2 years ago
Alex Crichton	2ba604e406	Update spec test repo (#4974) * Update spec test repo Our submodule was accidentally reverted to an older commit as part of #4271, and while it could be restored to where it was before, I went ahead and updated it to `main`. * Update ignore directives and test multi-memory * Update riscv ignores	2 years ago
Ulrich Weigand	83bb6fd4c8	s390x: Fix regalloc checker error (#4973 ) For ShiftRR and VecShiftRR, if shift_reg is zero_reg(), the instruction does not actually use any register value. Fixes #4969	2 years ago
yuyang-ok	cdecc858b4	add riscv64 backend for cranelift. (#4271 ) Add a RISC-V 64 (`riscv64`, RV64GC) backend. Co-authored-by: yuyang <756445638@qq.com> Co-authored-by: Chris Fallin <chris@cfallin.org> Co-authored-by: Afonso Bordado <afonsobordado@az8.co>	2 years ago
Jamey Sharp	9715d91c50	Resolve aliases before checking for unique values (#4966 ) At control-flow join points, cranelift-frontend's SSA builder currently checks to see if only one definition of a variable reaches the current block. If so, it can eliminate the corresponding block parameter and use the original def directly. It implements this by turning the block parameter into an alias for the original value. However, it didn't resolve aliases during this check, except after it had already determined that there was only one definition. Resolving aliases first instead allows it to detect that more block parameters are redundant. And as more block parameters get converted to aliases, later blocks can see common definitions from further away, so this has a compounding effect. This also merges a special case, where there's exactly one unique non-sentinel definition but it's actually an alias for the sentinel, into the general case where all definitions are from the sentinel. As a result there's only one case that has to introduce a definition of the variable to zero. According to `valgrind --tool=dhat`, this is a significant memory savings. 
On the pulldown-cmark benchmark from Sightglass: - 15.3% (1.9MiB) less memory allocated at maximum heap - 4.1% (6.7MiB) less memory allocated in total - 9.8% (57MiB) fewer bytes read - 12.6% (36MiB) fewer bytes written - 5.4% fewer instructions retired - 1.04x faster by instructions retired (per Sightglass/perf) - 1.03x to 1.04x faster by CPU cycles (per Sightglass/perf) - 1.03 ± 0.01 times faster by CPU time (per hyperfine) - 1.04x faster by cache accesses (per Sightglass/perf) On the bz2 benchmark: - 1.06x faster by instructions retired (per Sightglass/perf) - 1.05x faster by CPU cycles (per Sightglass/perf) - 1.04 ± 0.01 times faster by CPU time (per hyperfine) - 1.02x to 1.03x faster by cache accesses (per Sightglass/perf) Even on the largest benchmark in Sightglass (spidermonkey.wasm), this is a measurable improvement: - 1.03x faster by instructions retired (per Sightglass/perf) - 1.02x faster by CPU cycles (per Sightglass/perf) - 1.02 ± 0.00 times faster by CPU time (per hyperfine) There was no significant difference in cache misses for any benchmark, according to Sightglass/perf.	2 years ago
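The "resolve aliases first, then check for a unique definition" idea can be sketched as follows. This is a toy model, not the cranelift-frontend code: the alias table is a plain `HashMap`, the names `resolve` and `unique_def` are hypothetical, and alias chains are assumed to be acyclic.

```rust
use std::collections::HashMap;

pub type Value = u32;

/// Follow alias links until reaching a value that is not an alias.
/// Assumes the alias table contains no cycles.
pub fn resolve(aliases: &HashMap<Value, Value>, mut v: Value) -> Value {
    while let Some(&next) = aliases.get(&v) {
        v = next;
    }
    v
}

/// Return the single definition reaching a join point, if all incoming
/// definitions resolve to the same value. Returns `None` when definitions
/// differ (a block parameter is needed) or when there are none.
pub fn unique_def(aliases: &HashMap<Value, Value>, defs: &[Value]) -> Option<Value> {
    // Resolving *before* comparing lets aliased values count as equal.
    let mut resolved = defs.iter().map(|&v| resolve(aliases, v));
    let first = resolved.next()?;
    if resolved.all(|v| v == first) {
        Some(first)
    } else {
        None
    }
}
```

For example, if value 5 is an alias of value 2, the incoming definitions `[2, 5]` look distinct when compared directly but collapse to the single definition 2 once resolved, so no block parameter is required.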
Alex Crichton	29c7de7340	Update wasm-tools dependencies (#4970 ) * Update wasm-tools dependencies This update brings in a number of features such as: * The component model binary format and AST has been slightly adjusted in a few locations. Names are dropped from parameters/results now in the internal representation since they were not used anyway. At this time the ability to bind a multi-return function has not been exposed. * The `wasmparser` validator pass will now share allocations with prior functions, providing what's probably a very minor speedup for Wasmtime itself. * The text format for many component-related tests now requires named parameters. * Some new relaxed-simd instructions are updated to be ignored. I hope to have a follow-up to expose the multi-return ability to the embedding API of components. * Update audit information for new crates	2 years ago
Afonso Bordado	10deb9b7fe	fuzzgen: Add `fcvt_*` ops (#4958 )	2 years ago
Afonso Bordado	65a3af72c7	fuzzgen: Statistics framework (#4868 ) * cranelift: Add non user trap codes function * cranelift: Add Fuzzgen stats * cranelift: Use `once_cell` and cleanup some stuff * fuzzgen: Remove total_inputs metric * fuzzgen: Filter empty trap codes	2 years ago
Alex Crichton	ee2ef5bdd0	Tidy up some more deps with workspace dependencies (#4965 ) I missed these in #4905 and noticed just now but wanted to handle them to ensure we uniformly handle deps in the workspace.	2 years ago
Alex Crichton	84994203a1	Increase the `sigaltstack` stack size (#4964) This commit updates the `MIN_STACK_SIZE` constant for Unix platforms when allocating a sigaltstack from 16k to 64k. The signal handler captures a wasm `Backtrace` which involves memory allocations and it was recently discovered that, at least in debug mode, jemalloc can take up to 16k of stack space for an allocation. To allow the handler to run, the sigaltstack size is increased here.	2 years ago
Alex Crichton	f12ef84cdc	Remove `handling_trap` variable (#4963 ) This historically was used to guard against recursive faults but later refactorings have made this variable somewhat obsolete. The code that it still protects is not the "meat" of trap handling. Instead the `jmp_buf_if_trap` is changed to be more like "take" so once a "take" succeeds it won't be able to recursively call any more "meat". Overall this shouldn't affect anything, it's just a small internal cleanup.	2 years ago
Alex Crichton	7b311004b5	Leverage Cargo's workspace inheritance feature (#4905) * Leverage Cargo's workspace inheritance feature This commit is an attempt to reduce the complexity of the Cargo manifests in this repository with Cargo's workspace-inheritance feature becoming stable in Rust 1.64.0. This feature allows specifying fields in the root workspace `Cargo.toml` which are then reused throughout the workspace. For example this PR shares definitions such as: * All of the Wasmtime-family of crates now use `version.workspace = true` to have a single location which defines the version number. * All crates use `edition.workspace = true` to have one default edition for the entire workspace. * Common dependencies are listed in `[workspace.dependencies]` to avoid typing the same version number in a lot of different places (e.g. the `wasmparser = "0.89.0"` is now in just one spot). Currently the workspace-inheritance feature doesn't allow having two different versions to inherit, so all of the Cranelift-family of crates still manually specify their version. The inter-crate dependencies, however, are shared amongst the root workspace. This feature can be seen as a method of "preprocessing" of sorts for Cargo manifests. This will help us develop Wasmtime but shouldn't have any actual impact on the published artifacts -- everything's dependency lists are still the same. * Fix wasi-crypto tests	2 years ago
bjorn3	af226d37c2	[AArch64] Fix incorrect regalloc constraints for atomic_cas (#4959 ) * [AArch64] Fix incorrect regalloc constraints for atomic_cas * Update test for latest Cranelift changes	2 years ago
Damian Heaton	3a2b32bf4d	Port branches to ISLE (AArch64) (#4943 ) * Port branches to ISLE (AArch64) Ported the existing implementations of the following opcodes for AArch64 to ISLE: - `Brz` - `Brnz` - `Brif` - `Brff` - `BrIcmp` - `Jump` - `BrTable` Copyright (c) 2022 Arm Limited * Remove dead code Copyright (c) 2022 Arm Limited	2 years ago
Jamey Sharp	11e90049d2	Fix wasmtime-bench-api build (#4956 ) This fixes a compile-time error introduced in #4207. The `?` operator doesn't work inside `Option::map` because it tries to return from the inner closure, not the outer function. Apparently our CI doesn't build wasmtime-bench-api so it didn't catch this issue.	2 years ago
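The pitfall mentioned in this commit, `?` inside an `Option::map` closure, can be reproduced with a small standalone example. The functions below are hypothetical and unrelated to the wasmtime-bench-api code; they show two common ways to restructure such code, without claiming either is the fix that was actually applied.

```rust
use std::num::ParseIntError;

// This version does not compile: `?` tries to return from the closure,
// whose return type is `i32`, not from the enclosing function.
//
// fn parse_opt_broken(s: Option<&str>) -> Result<Option<i32>, ParseIntError> {
//     Ok(s.map(|t| t.parse::<i32>()?))
// }

/// Option 1: let the closure return a `Result`, then flip
/// `Option<Result<T, E>>` into `Result<Option<T>, E>` with `transpose`.
pub fn parse_opt(s: Option<&str>) -> Result<Option<i32>, ParseIntError> {
    s.map(|t| t.parse::<i32>()).transpose()
}

/// Option 2: use `match` so that `?` runs in the outer function body.
pub fn parse_opt_match(s: Option<&str>) -> Result<Option<i32>, ParseIntError> {
    Ok(match s {
        Some(t) => Some(t.parse::<i32>()?),
        None => None,
    })
}
```

Both variants propagate the parse error to the caller and pass `None` through unchanged.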
Jamey Sharp	bd870a9d6c	Shrink all SmallVecs by 8 bytes (#4951 ) We weren't using the "union" cargo feature for the smallvec crate, which reduces the size of a SmallVec by one machine word. This feature requires Rust 1.49 but we already require much newer versions. When using Wasmtime to compile pulldown-cmark from Sightglass, this saves a decent amount of memory allocations and writes. According to `valgrind --tool=dhat`: - 6.2MiB (3.69%) less memory allocated over the program's lifetime - 0.5MiB (4.13%) less memory allocated at maximum heap size - 5.5MiB (1.88%) fewer bytes written to - 0.44% fewer instructions executed Sightglass reports a statistically significant runtime improvement too: compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 24379323.60 ± 20051394.04 (confidence = 99%) shrink-abiarg-0406da67c.so is 1.01x to 1.13x faster than main-be690a468.so! [227506364 355007998.78 423280514] main-be690a468.so [227686018 330628675.18 406025344] shrink-abiarg-0406da67c.so compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm Δ = 360151622.56 ± 278294316.90 (confidence = 99%) shrink-abiarg-0406da67c.so is 1.01x to 1.07x faster than main-be690a468.so! [8709162212 8911001926.44 9535111576] main-be690a468.so [5058015392 8550850303.88 9282148438] shrink-abiarg-0406da67c.so compilation :: cycles :: benchmarks/bz2/benchmark.wasm Δ = 6936570.28 ± 6897696.38 (confidence = 99%) shrink-abiarg-0406da67c.so is 1.00x to 1.08x faster than main-be690a468.so! [155810934 175260571.20 234737344] main-be690a468.so [119128240 168324000.92 257451074] shrink-abiarg-0406da67c.so	2 years ago
Roman Volosatovs	95c74ef246	feat: rely on `tracing-subscriber` in tests (#4950) The `tracing` crate is already used within the codebase; this change lets developers benefit from that functionality when running and debugging tests. Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>	2 years ago

10380 Commits (release-2.0.0)