cranelift

Commit Graph

Author	SHA1	Message	Date
Roman Volosatovs	95c74ef246	feat: rely on `tracing-subscriber` in tests (#4950 ) The `tracing` crate is already used within the codebase; this change allows developers to benefit from that functionality when running and debugging tests. Signed-off-by: Roman Volosatovs <rvolosatovs@riseup.net>	2 years ago
Andrew Brown	5a288c2c40	bench-api: configure WASI modules based on passed flags (#4207 ) * bench-api: configure WASI modules based on passed flags When benchmarking in Sightglass, @brianjjones has found it necessary to enable the wasi-nn module. The current way to do so is to alter the engine build script to pass `--features wasi-nn` so that this crate can run code relying on these imports. This change allows the user to instead pass the WASI modules using the engine flags added in #4096. This could look something like the following in Sightglass: ``` sightglass-cli benchmark ... --engine-flags '--wasi-modules experimental-wasi-nn' ``` * fix: disable wasi-crypto as a default feature	2 years ago
Afonso Bordado	bb6a8a717a	fuzzgen: Avoid `int_divz` traps (#4932 ) * fuzzgen: Insert `int_divz` sequence * fuzzgen: matches!	2 years ago
Jamey Sharp	6e76e925f4	Avoid quadratic behavior in `can_optimize_var_lookup` (#4939 ) * cranelift-frontend: Avoid quadratic behavior Fixes #4923. * Improve comments and debug assertions * Improve comments One thing that's especially neat about this PR is that, unlike the `can_optimize_var_lookup` graph traversal, `update_predecessor_cycle` doesn't need to keep track of all the blocks it has visited in order to detect cycles. However, the reasons why are subtle and need careful documentation. Also neat: We've previously tried keeping either a HashSet or a SecondaryMap around to re-use the same heap allocation for the `visited` set, which needs space linear in the number of blocks. After this PR, we're still using space that's linear in the number of blocks to store the `in_predecessor_cycle` flag, but that flag fits inside existing padding in `SSABlockData`, so it's a net savings in memory consumption. * Avoid quadratic behavior in `update_predecessor_cycle` So far I hadn't really eliminated the quadratic behavior from `can_optimize_var_lookup`. I just moved it to happen when the CFG is modified instead, and switched to indexing directly into the vector of blocks instead of going through a HashSet. I suspect the latter change is always a win, but the former is only an improvement assuming that `use_var` is called more often than `declare_block_predecessor`. But @cfallin pointed out that it feels like we should be able to do better by taking advantage of the knowledge that once a block is sealed, its predecessors can't change any more. That's not completely trivial to do because changes to the property we care about propagate toward successors, and we're only keeping pointers to predecessors. Still, as long as frontends follow the existing recommendation to seal blocks as soon as possible, maintaining a conservative approximation using only local information works fine in practice. This significantly limits the situations where this graph traversal could visit a lot of the CFG. * Review comments	2 years ago
Tobias Bradtke	be690a468d	Fix typo (#4946 )	2 years ago
Chris Fallin	19bd8687ac	Upgrade to regalloc2 0.4.1. (#4945 ) * Upgrade to regalloc2 0.4.1. Incorporates bytecodealliance/regalloc2#85, which fixes a fuzzbug related to constraints and liverange splits. * Add audit of regalloc2 upgrade.	2 years ago
Damian Heaton	3f8cccfb59	Port flag-based ops to ISLE (AArch64) (#4942 ) Ported the existing implementations of the following opcodes for AArch64 to ISLE: - `Trueif` - `Trueff` - `Trapif` - `Trapff` - `Select` - `Selectif` - `SelectifSpectreGuard` Copyright (c) 2022 Arm Limited	2 years ago
Chris Fallin	89abd80c3c	Add the aegraph (acyclic e-graph) implementation crate. (#4909 ) * Add the aegraph (acyclic egraph) implementation crate. * fix crate-dep version for cranelift-entity (rebase error) * Review feedback. * Fix link in Markdown doc comment. * Doc link fix again. * add cranelift-egraph to publish list.	2 years ago
Chris Fallin	b652ce2fb1	ISLE: add support for multi-extractors and multi-constructors. (#4908 ) * ISLE: add support for multi-extractors and multi-constructors. This support allows for rules that process multiple matching values per extractor call on the left-hand side, and as a result, can produce multiple values from the constructor whose body they define. This is useful in situations where we are matching on an input data structure that can have multiple "nodes" for a given value or ID, for example in an e-graph. * Review feedback: all multi-ctors and multi-etors return iterators; no `Vec` case. * Add additional warning suppressions to generated-code toplevels to be consistent with new islec output.	2 years ago
Trevor Elliott	b167172715	Add an overlap checker to ISLE (#4906 ) https://github.com/bytecodealliance/wasmtime/pull/4906 Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2 years ago
Dan Gohman	6f50ddaaf2	Update to cap-std 0.26. (#4940 ) * Update to cap-std 0.26. This is primarily to pull in bytecodealliance/cap-std#271, the fix for #4936, compilation on Rust nightly on Windows. It also updates to rustix 0.35.10, to pull in bytecodealliance/rustix#403, the fix for bytecodealliance/rustix#402, compilation on newer versions of the libc crate, which changed a public function from `unsafe` to safe. Fixes #4936. * Update the system-interface audit for 0.23. * Update the libc supply-chain config version.	2 years ago
Nick Fitzgerald	b2d13ebd46	Revert "Memoize `can_optimize_var_lookup` (#4924 )" (#4937 ) This reverts commit `562bb25360`.	2 years ago
Damian Heaton	352c7595c6	Improve `fcvt_to_{u,s}int_sat` lowering (AArch64) (#4913 ) Improved the instruction lowering for the following opcodes on AArch64, and introduced support for converting to integers less than 32-bits wide as per the docs: - `FcvtToSintSat` - `FcvtToUintSat` Copyright (c) 2022 Arm Limited	2 years ago
Damian Heaton	e786bda002	Vector bitcast support (AArch64 & Interpreter) (#4820 ) * Vector bitcast support (AArch64 & Interpreter) Implemented support for `bitcast` on vector values for AArch64 and the interpreter. Also corrected the verifier to ensure that the size, in bits, of the input and output types match for a `bitcast`, per the docs. Copyright (c) 2022 Arm Limited * `I128` same-type bitcast support Copyright (c) 2022 Arm Limited * Directly return input for 64-bit GPR<=>GPR bitcast Copyright (c) 2022 Arm Limited	2 years ago
Chris Fallin	05cbd667c7	Cranelift: use regalloc2 constraints on caller side of ABI code. (#4892 ) * Cranelift: use regalloc2 constraints on caller side of ABI code. This PR updates the shared ABI code and backends to use register-operand constraints rather than explicit pinned-vreg moves for register arguments and return values. The s390x backend was not updated, because it has its own implementation of ABI code. Ideally we could converge back to the code shared by x64 and aarch64 (which didn't exist when s390x ported calls to ISLE, so the current situation is understandable, to be clear!). I'll leave this for future work. This PR exposed several places where regalloc2 needed to be a bit more flexible with constraints; it requires regalloc2#74 to be merged and pulled in. * Update to regalloc2 0.3.3. In addition to version bump, this required removing two asserts as `SpillSlot`s no longer carry their class (so we can't assert that they have the correct class). * Review comments. * Filetest updates. * Add cargo-vet audit for regalloc2 0.3.2 -> 0.3.3 upgrade. * Update to regalloc2 0.4.0.	2 years ago
Bryant Luk	8b245178a5	Update Rust lang doc for 1.0.0 dependency (#4935 )	2 years ago
Damian Heaton	cae7c196bb	Interpreter: Implement floating point conversions (#4884 ) * Interpreter: Implement floating point conversions Implemented the following opcodes for the interpreter: - `FcvtToUint` - `FcvtToSint` - `FcvtToUintSat` - `FcvtToSintSat` - `FcvtFromUint` - `FcvtFromSint` - `FcvtLowFromSint` - `FvpromoteLow` - `Fvdemote` Copyright (c) 2022 Arm Limited * Fix `I128` bounds checks for `FcvtTo{U,S}int{_,Sat}` Copyright (c) 2022 Arm Limited * Fix broken test Copyright (c) 2022 Arm Limited	2 years ago
Alex Crichton	63c9e5d46d	Allow empty commits for the release (#4927 ) The release process failed last night due to me filling out the dates in the release notes early (rather than leaving "Unreleased") which meant there were no changes for each commit. Switch to passing `--allow-empty` when making a commit to prevent this.	2 years ago
Johnnie Birch	a434f43d22	Update perf.yml token used to access perf repo (#4919 )	2 years ago
Adam Bratschi-Kaye	562bb25360	Memoize `can_optimize_var_lookup` (#4924 ) * Memoize `can_optimize_var_lookup` `can_optimize_var_lookup` can have quadratic behavior if there is a chain of blocks each containing a `local.get` instruction because each run can walk up the entire chain. This change memoizes the results of `can_optimize_var_lookup` so that we can stop following the chain of predecessors when we hit a block that has previously been handled (making the operation linear again).	2 years ago
Alex Crichton	b8fa068ca8	Limit linear memories when fuzzing with pooling (#4918 ) This commit limits the maximum number of linear memories when the pooling allocator is used to ensure that the virtual memory mapping for the pooling allocator itself can succeed. Currently there are a number of crashes in the differential fuzzer where the pooling allocator can't allocate its mapping because the maximum specified number of linear memories times the number of instances exceeds the address space presumably.	2 years ago
Cheng Shao	f5580954af	Add --disable-parallel-compilation CLI flag (#4911 )	2 years ago
Dan Gohman	cbd2efd236	Optimize the WASI `random_get` implementation. (#4917 ) * Optimize the WASI `random_get` implementation. Use `StdRng` instead of the `OsRng` in the default implementation of `random_get`. This uses a userspace CSPRNG, making `random_get` 3x faster in simple benchmarks. * Update cargo-vet audits for cap-std 0.25.3. * Update all cap-std packages to 0.25.3.	2 years ago
Johnnie Birch	27435ae398	Adds a github action to support x64 performance testing using a sightglass (#4421 ) * Adds a github action to support x64 performance testing using a sightglass This github action allows performance testing using sightglass. The action is triggered either via a workflow dispatch or with the comment '/bench_x64', in a pull request. Once triggered the action will send a request to a private repository that supports using a self-hosted runner to do comparisons of "refs/feature/commit" vs "refs/heads/main" for wasmtime. If the action is triggered via a comment in a pull request (with '/bench_x64') then the commit referenced by the pull request is used for the comparison against refs/heads/main. If triggered via a workflow dispatch the interface will request the commit to compare against refs/heads/main. The results of the performance tests, run via sightglass, will be a table showing a percentage change in clock ticks in various stages required for executing the benchmark, namely instantiation, compilation, and execution. This patch is intended to be just a starting patch with much to tweak and improve. One of the TODOs will be adding support for aarch64 .. currently this patch supports only x64. Note also that the logic for actually doing the comparison and parsing the results occurs with the action associated with the private repo and so this patch itself (though the trigger) is fairly straightforward. * Refactor patch to consolidate all steps to here. * Remove unused code * Removes unused pull_request_review_comment trigger * Match trigger word when contained anywhere in the pull request review message * Remove redundant repo and ref variables for wasmtime_commit * Minor comment update * Remove command to install jq * Remove printing of git config variables being used * Fix token for posting results * Update message explaining pct_change for benchmark results * Revert TOKEN for publish change * Update message explaining results	2 years ago
Afonso Bordado	09f46e351e	fuzzgen: Mostly Forward Branching (#4894 ) * cranelift: Test Forward branching * fuzzgen: Separate terminators * fuzzgen: Avoid generating jumptables if we have no valid targets * fuzzgen: Forward Jump Tables * fuzzgen: Cleanup some feedback Thanks @jameysharp! * fuzzgen: Cleanup block generation Thanks @jameysharp! * fuzzgen: Style Cleanups These were accidentally reverted in a rebase * fuzzgen: Prevent block0 from being targeted for branches * fuzzgen: Add jump tables sorting TODO	2 years ago
Trevor Elliott	9d99eff6f9	Flatten `and` patterns in ISLE (#4915 ) Flatten nested `and` patterns into a single vector in the ISLE front-end.	2 years ago
Afonso Bordado	2db7d7a8e0	fuzzgen: Disable verifier after NaN Canonicalization (#4914 ) * fuzzgen: Disable verifier after NaN Canonicalization We are currently running the verifier twice, once after the NaN canonicalization pass, and again when JIT compiling the code. The verifier first runs in the NaN Canonicalization pass. If it fails it prevents us from getting a nice `cargo fuzz fmt` test case. So disable the verifier there, but ensure it's enabled when JIT compiling. * fuzzgen: Force enable verifier in cranelift-icache This is already the default, but since we no longer run the verifier in `fuzzgen` it's important to ensure that it runs in the fuzz targets.	2 years ago
Afonso Bordado	d0b98aa25f	cranelift: Prepare fuzzgen for AArch64 (#4867 ) * cranelift: Re-enable some shift operations * fuzzgen: Disable Some FloatCC's for AArch64 * cranelift: Disable i128 divs on aarch64 * cranelift: Centralize IntCC selection	2 years ago
Alex Crichton	76c93a3906	Remove a debug utility in the publish script (#4904 ) This was something I used for a one-time bump to 2.0, but is no longer necessary. I didn't mean to commit this but I forgot to back it out.	2 years ago
Damian Heaton	e9b08b856d	Port `icmp` to ISLE (AArch64) (#4898 ) * Port `icmp` to ISLE (AArch64) Ported the existing implementation of `icmp` (and, by extension, the `lower_icmp` function) to ISLE for AArch64. Copyright (c) 2022 Arm Limited * Allow 'producer chains', eliminating `Nop0`s Copyright (c) 2022 Arm Limited	2 years ago
Andrew Brown	c3f8415ac7	fuzz: improve the spec interpreter (#4881 ) * fuzz: improve the API of the `wasm-spec-interpreter` crate This change addresses key parts of #4852 by improving the bindings to the OCaml spec interpreter. The new API allows users to `instantiate` a module, `interpret` named functions on that instance, and `export` globals and memories from that instance. This currently leaves the existing implementation ("instantiate and interpret the first function in a module") present under a new name: `interpret_legacy`. * fuzz: adapt the differential spec engine to the new API This removes the legacy uses in the differential spec engine, replacing them with the new `instantiate`-`interpret`-`export` API from the `wasm-spec-interpreter` crate. * fix: make instance access thread-safe This changes the OCaml-side definition of the instance so that each instance carries round a reference to a "global store" that's specific to that instantiation. Because everything is updated by reference there should be no visible behavioural change on the Rust side, apart from everything suddenly being thread-safe (modulo the fact that access to the OCaml runtime still needs to be locked). This fix will need to be generalised slightly in future if we want to allow multiple modules to be instantiated in the same store. Co-authored-by: conrad-watt <cnrdwtt@gmail.com> Co-authored-by: Alex Crichton <alex@alexcrichton.com>	2 years ago
Trevor Elliott	024cad7e3d	Remove function_alignment from ObjectBuilder (#4888 ) Removes the function_alignment field from ObjectBuilder and ObjectModule. Alignment information is now provided either by the Module trait for minimum function alignment requirements, or on FunctionInfo for function-specific alignment requirements.	2 years ago
Trevor Elliott	ad09c273c6	Don't merge loads for xmm registers (#4891 ) Do not merge loads for xmm registers, as alignment requirements currently aren't satisfied with clif lowered from wasm. Fixes #4890	2 years ago
Afonso Bordado	555309a480	fuzzgen: Continue execution on traps (#4895 )	2 years ago
Afonso Bordado	bb3aae740a	fuzzgen: Panic on failed NaN Canonicalization pass (#4896 ) This should never fail anyway, but it's good to know that we aren't accidentally ignoring an input	2 years ago
Daniel Marin	71fd873946	Fix typo in examples-markdown.md (#4893 )	2 years ago
Chris Fallin	96bfd4e8c0	s390x: update some regalloc metadata to remove use of `reg_mod`. (#4856 ) * s390x: update some regalloc metadata to remove use of `reg_mod`. This is a step toward ultimately removing modify-operands, which along with removal of pinned vregs, lets us move to a completely constraint-based and fully-SSA regalloc input and get some nice advantages eventually. There are still a few uses of `mod` operands and pinned vregs remaining, especially around the "regpair" abstraction. Those proved to be a bit trickier to update though, so will have to be done separately. * Review feedback: restore two-arg pretty-print form. * Review feedback.	2 years ago
Chris Fallin	2986f6b0ff	ABI: implement register arguments with constraints. (#4858 ) * ABI: implement register arguments with constraints. Currently, Cranelift's ABI code emits a sequence of moves from physical registers into vregs at the top of the function body, one for every register-carried argument. For a number of reasons, we want to move to operand constraints instead, and remove the use of explicitly-named "pinned vregs"; this allows for better regalloc in theory, as it removes the need to "reverse-engineer" the sequence of moves. This PR alters the ABI code so that it generates a single "args" pseudo-instruction as the first instruction in the function body. This pseudo-inst defs all register arguments, and constrains them to the appropriate registers at the def-point. Subsequently the regalloc can move them wherever it needs to. Some care was taken not to have this pseudo-inst show up in post-regalloc disassemblies, but the change did cause a general regalloc "shift" in many tests, so the precise-output updates are a bit noisy. Sorry about that! A subsequent PR will handle the other half of the ABI code, namely, the callsite case, with a similar preg-to-constraint conversion. * Update based on review feedback. * Review feedback.	2 years ago
Chris Fallin	13c7846815	Cranelift: add a vreg limit check to correctly return an error on too-large inputs. (#4882 ) Previously, Cranelift panicked (via a panic in regalloc2) when the virtual-register limit of 2M (2^21) was reached. This resulted in a perplexing and unhelpful failure when the user provided a too-large input (such as the Wasm module in #4865). This PR adds an explicit check when allocating vregs that fails with a "code too large" error when the limit is hit, producing output such as (on the minimized testcase from #4865): ``` Error: failed to compile wasm function 3785 at offset 0xa3f3 Caused by: Compilation error: Code for function is too large ``` Fixes #4865.	2 years ago
Alex Crichton	ef5ad26ab2	Update release notes for 1.0 (#4885 )	2 years ago
Anton Kirilov	d8b290898c	Initial forward-edge CFI implementation (#3693 ) * Initial forward-edge CFI implementation Give the user the option to start all basic blocks that are targets of indirect branches with the BTI instruction introduced by the Branch Target Identification extension to the Arm instruction set architecture. Copyright (c) 2022, Arm Limited. * Refactor `from_artifacts` to avoid second `make_executable` (#1) This involves "parsing" twice but this is parsing just the header of an ELF file so it's not a very intensive operation and should be ok to do twice. * Address the code review feedback Copyright (c) 2022, Arm Limited. Co-authored-by: Alex Crichton <alex@alexcrichton.com>	2 years ago
Trevor Elliott	caad14826c	Rework the ISA flag checking extractors for x64 (#4878 ) Using fallible extractors that produce no values for flag checks means that it's not possible to pattern match cases where those flags are false. This change reworks the existing flag-checking extractors to be infallible, returning the flag's boolean value from the context instead.	2 years ago
Andrew Brown	f063082474	x64: remove `Inst::XmmLoadConst` (#4876 ) This is a cherry-pick of a long-ago commit, `2d46637`. The original message reads: > Now that `SyntheticAmode` can refer to constants, there is no longer a > need for a separate instruction format--standard load instructions will > work. Since then, the transition to ISLE and the use of `XmmLoadConst` in many more places makes this change a larger diff than the original. The basic idea is the same, though: the extra indirection of `Inst::XmmLoadConst` is removed and replaced by a direct use of `VCodeConstant` as a `SyntheticAmode`. This has no effect on codegen, but the CLIF output is now clearer in that the actual instruction is displayed (e.g., `movdqu`) instead of a made-up instruction (`load_const`).	2 years ago
Jamey Sharp	e694a6f5d4	Allocate less while constructing cranelift-fuzzgen tests (#4863 ) * Improve panic message if typevar_operand is None * cranelift-fuzzgen: Don't allocate for each choice I don't think the performance of test-case generation is at all important here. I'm actually doing this in preparation for a bigger refactor where I want to be able to borrow the list of valid choices for a given opcode without worrying about lifetimes. * cranelift-fuzzgen: Remove next_func_index It's only used locally within `generate_funcrefs`, so it doesn't need to be in the FunctionBuilder struct. Also there's already a local counter that I think is good enough for this. As far as I know, the function indexes only need to be distinct, not contiguous. * cranelift-fuzzgen: Separate resources from config The function-global variables, blocks, etc that are generated before generating instructions are all owned collections without any lifetime parameters. By contrast, the Unstructured and Config are both borrowed. Separating them will make it easier to borrow from the owned resources.	2 years ago
Afonso Bordado	f57b4412ec	cranelift: Implement missing i128 rotates on AArch64 (#4866 )	2 years ago
Anton Kirilov	dd07e354b4	Cranelift AArch64: Fix the get_return_address lowering (#4851 ) The previous implementation assumed that nothing had clobbered the LR register since the current function had started executing, so it would be incorrect for a non-leaf function, for example, that contains the `get_return_address` operation right after a call. The operation is valid only if the `preserve_frame_pointers` flag is enabled, which implies that the presence of a frame record on the stack is guaranteed. Copyright (c) 2022, Arm Limited.	2 years ago
Afonso Bordado	e977f6a79d	cranelift: Generate Store and Loads in fuzzgen (#4824 )	2 years ago
Jamey Sharp	b8b2fadea8	cranelift-fuzzgen: Consume all trailing fuzz input (#4862 ) But don't keep going once we've consumed it all.	2 years ago
Jamey Sharp	3d6d49daba	cranelift: Remove of/nof overflow flags from icmp (#4879 ) * cranelift: Remove of/nof overflow flags from icmp Neither Wasmtime nor cg-clif use these flags under any circumstances. From discussion on #3060 I see it's long been unclear what purpose these flags served. Fixes #3060, fixes #4406, and fixes #4875... by deleting all the code that could have been buggy. This changes the cranelift-fuzzgen input format by removing some IntCC options, so I've gone ahead and enabled I128 icmp tests at the same time. Since only the of/nof cases were failing before, I expect these to work. * Restore trapif tests It's still useful to validate that iadd_ifcout's iflags result can be forwarded correctly to trapif, and for that purpose it doesn't really matter what condition code is checked.	2 years ago
Andrew Brown	cd982c5a3f	[fuzz] Add SIMD to single-instruction generator (#4778 ) * [fuzz] Add SIMD to single-instruction generator This change extends the single-instruction generator with most of the SIMD instructions. Examples of instructions that were excluded are: all memory-related instructions, any instruction with an immediate. * [fuzz] Generate V128s with known values from each type To better cover the fuzzing search space, `DiffValue` will generate better known values for the `V128` type. First, it uses arbitrary data to select a sub-type (e.g., `I8x16`, `F32x4`, etc.) and then it fills in the bytes by generating biased values for each of the lanes. * [fuzz] Canonicalize NaN values in SIMD lanes This change ports the NaN canonicalization logic from `wasm-smith` ([here]) to the single-instruction generator. [here]: https://github.com/bytecodealliance/wasm-tools/blob/6c127a6/crates/wasm-smith/src/core/code_builder.rs#L927	2 years ago
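
The memoization idea described in commit 562bb25360 (later reverted by b2d13ebd46 and superseded by 6e76e925f4) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cranelift's code: `BlockId`, `ChainLookup`, `reaches_def`, and `total_steps_for_chain` are hypothetical names, and each block is assumed to have at most one predecessor. It shows how remembering the answer for every block walked keeps repeated lookups over a chain linear in total.

```rust
use std::collections::HashMap;

/// Hypothetical block identifier; not Cranelift's `Block` type.
type BlockId = usize;

/// Toy model of a chain of blocks, each with at most one predecessor.
struct ChainLookup {
    /// `pred[b]` is the single predecessor of block `b`, if any.
    pred: Vec<Option<BlockId>>,
    /// `defines[b]` is true when block `b` defines the variable itself.
    defines: Vec<bool>,
    /// Memoized answers for blocks that were already handled.
    memo: HashMap<BlockId, bool>,
    /// Number of blocks actually walked, to make the cost observable.
    steps: usize,
}

impl ChainLookup {
    fn new(pred: Vec<Option<BlockId>>, defines: Vec<bool>) -> Self {
        assert_eq!(pred.len(), defines.len());
        ChainLookup { pred, defines, memo: HashMap::new(), steps: 0 }
    }

    /// Returns true if a definition is reachable by following predecessors.
    fn reaches_def(&mut self, start: BlockId) -> bool {
        let mut walked = Vec::new();
        let mut current = Some(start);
        let mut result = false;
        while let Some(block) = current {
            // Stop at any block handled by an earlier query, or seen earlier
            // in this walk (which means we looped without finding a def).
            if let Some(&known) = self.memo.get(&block) {
                result = known;
                break;
            }
            self.steps += 1;
            // Provisional answer; it doubles as a cycle guard.
            self.memo.insert(block, false);
            walked.push(block);
            if self.defines[block] {
                result = true;
                break;
            }
            current = self.pred[block];
        }
        // Record the final answer for every block walked in this query.
        for block in walked {
            self.memo.insert(block, result);
        }
        result
    }
}

/// Builds a straight chain of `n` blocks with the definition in block 0,
/// queries every block, and returns how many blocks were walked in total.
fn total_steps_for_chain(n: usize) -> usize {
    let pred: Vec<Option<BlockId>> = (0..n).map(|b| b.checked_sub(1)).collect();
    let defines: Vec<bool> = (0..n).map(|b| b == 0).collect();
    let mut lookup = ChainLookup::new(pred, defines);
    for block in (0..n).rev() {
        assert!(lookup.reaches_def(block));
    }
    lookup.steps
}
```

In this model, querying all `n` blocks of a chain walks each block once. Without the `memo` table, the same queries would walk on the order of n²/2 blocks, which is the quadratic behavior the commit message describes.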

10531 Commits (f6ae67f3f0d0e33d13bf55a796442c1dcb0cc067), All Branches