cranelift

Commit Graph

Author	SHA1	Message	Date
Alex Crichton	a0e4bb0190	Prevent virtual memory OOM in spectest fuzzing (#4872 ) This commit hard-codes the pooling allocator's limit of linear memories to 1 when used with fuzzing the spec tests themselves. This prevents the number from being set too high and hitting a virtual-memory-based OOM due to the virtual memory reservation of the pooling allocator being too large.	2 years ago
Alex Crichton	543a487939	Throw out fewer fuzz inputs with differential fuzzer (#4859 ) * Throw out fewer fuzz inputs with differential fuzzer Prior to this commit the differential fuzzer would generate a module and then select an engine to execute the module against Wasmtime. This meant, however, that the candidate list of engines were filtered against the configuration used to generate the module to ensure that the selected engine could run the generated module. This commit inverts this logic and instead selects an engine first, allowing the engine to then tweak the module configuration to ensure that the generated module is compatible with the engine selected. This means that fewer fuzz inputs are discarded because every fuzz input will result in an engine being executed. Internally the engine constructors have all been updated to update the configuration to work instead of filtering the configuration. Some other fixes were applied for the spec interpreter as well to work around #4852 * Fix tests	2 years ago
Jamey Sharp	9856664f1f	Make DataValue, not Ieee32/64, respect IEEE754 (#4860 ) * cranelift-codegen: Remove all uses of DataValue This type is only used by the interpreter, cranelift-fuzzgen, and filetests. I haven't found another convenient crate for those to all depend on where this type can live instead, but this small refactor at least makes it obvious that code generation does not in any way depend on the implementation of this type. * Make DataValue, not Ieee32/64, respect IEEE754 This fixes #4857 by partially reverting #4849. It turns out that Ieee32 and Ieee64 need bitwise equality semantics so they can be used as hash-table keys. Moving the IEEE754 semantics up a layer to DataValue makes sense in conjunction with #4855, where we introduced a DataValue::bitwise_eq alternative implementation of equality for those cases where users of DataValue still want the bitwise equality semantics. * cranelift-interpreter: Use eq/ord from DataValue This fixes #4828, again, now that the comparison operators on DataValue have the right IEEE754 semantics. * Add regression test from issue #4857	2 years ago
Afonso Bordado	7e45cff459	cranelift: Bitwise compare fuzzgen results (#4855 )	2 years ago
Alex Crichton	10dbb19983	Various improvements to differential fuzzing (#4845 ) * Improve wasmi differential fuzzer * Support modules with a `start` function * Implement trap-matching to ensure that wasmi and Wasmtime both report the same flavor of trap. * Support differential fuzzing where no engines match Locally I was attempting to run against just one wasm engine with `ALLOWED_ENGINES=wasmi` but the fuzzer quickly panicked because the generated test case didn't match wasmi's configuration. This commit updates engine-selection in the differential fuzzer to return `None` if no engine is applicable, throwing out the test case. This won't be hit at all with oss-fuzz-based runs but for local runs it'll be useful to have. * Improve proposal support in differential fuzzer * De-prioritize unstable wasm proposals such as multi-memory and memory64 by making them more unlikely with `Unstructured::ratio`. * Allow fuzzing multi-table (reference types) and multi-memory by avoiding setting their maximums to 1 in `set_differential_config`. * Update selection of the pooling strategy to unconditionally support the selected module config rather than the other way around. * Improve handling of traps in differential fuzzing This commit fixes an issue found via local fuzzing where engines were reporting different results but the underlying reason for this was that one engine was hitting stack overflow before the other. To fix the underlying issue I updated the execution to check for stack overflow and, if hit, it discards the entire fuzz test case from then on. The rationale behind this is that each engine can have unique limits for stack overflow. One test case I was looking at for example would stack overflow at less than 1000 frames with epoch interruption enabled but would stack overflow at more than 1000 frames with it disabled. This means that the state after the trap started to diverge and it looked like the engines produced different results. While I was at it I also improved the "function call returned a trap" case to compare traps to make sure the same trap reason popped out. * Fix fuzzer tests	2 years ago
Afonso Bordado	3afb711a51	cranelift: Document Ieee{32,64} implementation (#4854 )	2 years ago
Alex Crichton	b8a68ff86d	Tweak adapter cost of lists (#4853 ) I noticed an oss-fuzz-based timeout that was reported for the `component_api` fuzzer where the adapter module generated takes 1.5 seconds to compile the singular function in release mode (no fuzzing enabled). The test case in question was a deeply recursive list-of-list-of-etc and only one function was generated instead of multiple. I updated the cost of strings/lists to cost more in the approximate cost calculation which now forces the one giant function to get split up and the large function is now split up into multiple smaller function that take milliseconds to compile.	2 years ago
Afonso Bordado	f30a7eb0c9	cranelift: Implement PartialEq on Ieee{32,64} (#4849 ) * cranelift: Add `fcmp` tests Some of these are disabled on aarch64 due to not being implemented yet. * cranelift: Implement float PartialEq for Ieee{32,64} (fixes #4828) Previously `PartialEq` was auto derived. This means that it was implemented in terms of PartialEq in a u32. This is not correct for floats because `NaN != NaN`. PartialOrd was manually implemented in `6d50099816`, but it seems like it was an oversight to leave PartialEq out until now. The test suite depends on the previous behaviour so we adjust it to keep comparing bits instead of floats. * cranelift: Disable `fcmp ord` tests on aarch64 * cranelift: Disable `fcmp ueq` tests on aarch64	2 years ago
Anton Kirilov	48bf078c83	Cranelift AArch64: Fix the atomic memory operations (#4831 ) Previously the implementations of the various atomic memory IR operations ignored the memory operation flags that were passed. Copyright (c) 2022, Arm Limited. Co-authored-by: Chris Fallin <chris@cfallin.org>	2 years ago
Anton Kirilov	d2e19b8d74	Cranelift AArch64: Migrate AMode to ISLE (#4832 ) Copyright (c) 2022, Arm Limited. Co-authored-by: Chris Fallin <chris@cfallin.org>	2 years ago
Chris Fallin	385bd0cbf8	x64: fix CvtFloatToUintSeq: do not clobber src. (#4842 ) This slipped through the regalloc2 operand code update in #4811: the CvtFloatToUintSeq pseudo-instruction actually clobbers its source. It was marked as a "mod" operand in the original and I mistakenly converted it to a "use" as I had not seen the actual clobber. The instruction now takes an extra temp and makes a copy of `src` in the appropriate place. Fixes #4840.	2 years ago
Afonso Bordado	08e7a7f1a0	cranelift: Add inline stack probing for x64 (#4747 ) * cranelift: Add inline stack probe for x64 * cranelift: Cleanups comments Thanks @jameysharp!	2 years ago
Jamey Sharp	84ac24c23d	cranelift: Remove const_addr instruction (fixes #2398 ) (#4843 )	2 years ago
Chris Fallin	ae5fe8a728	aarch64: fix up regalloc2 semantics. (#4830 ) This PR removes all uses of modify-operands in the aarch64 backend, replacing them with reused-input operands instead. This has the nice effect of removing a bunch of move instructions and more clearly representing inputs and outputs. This PR also removes the explicit use of pinned vregs in the aarch64 backend, instead using fixed-register constraints on the operands when insts or pseudo-inst sequences require certain registers. This is the second PR in the regalloc-semantics cleanup series; after the remaining backend (s390x) and the ABI code are cleaned up as well, we'll be able to simplify the regalloc2 frontend.	2 years ago
Andrew Brown	ac2d4c4818	x64: improve tests for `heap_addr` (#4841 ) * x64: improve tests for `heap_addr` This change adds Cranelift `compile` tests for the various cases for `heap_addr`. The idea behind this is to more clearly show what the penalties are for dynamically- vs statically-allocated memory as well as turning Spectre mitigations on and off. * Add test case: "right" size memory with Spectre enabled	2 years ago
Xuran	bca4dae8b0	feat: add a knob for reset stack (#4813 ) * feat: add a knob for reset stack * Touch up documentation of `async_stack_zeroing` Co-authored-by: Alex Crichton <alex@alexcrichton.com>	2 years ago
Nick Fitzgerald	c54d8384ee	Add some more audits for my own crates (#4837 ) Mostly stuff that Firefox is using and asked me to publish audits for, but a couple are in our dep tree as well.	2 years ago
Afonso Bordado	2beaf7352f	cranelift: Test calling across different calling conventions (#4801 ) * cranelift: Test calling across different calling conventions * cranelift: Use `wasmtime_system_v` calling convention for cross cc tests	2 years ago
Alex Crichton	328727644f	Add some audits for some low-hanging fruit (#4836 ) I looked through some of our smaller dependencies to vet them and add an audit for them. These were the ones that were all "obviously correct" to me before I ran out of steam reviewing other crates.	2 years ago
Trevor Elliott	dde2c5a3b6	Align functions according to their ISA's requirements (#4826 ) Add a function_alignment function to the TargetIsa trait, and use it to align functions when generating objects. Additionally, collect the maximum alignment required for pc-relative constants in functions and pass that value out. Use the max of these two values when padding functions for alignment. This fixes a bug on x86_64 where rip-relative loads to sse registers could cause a segfault, as functions weren't always guaranteed to be aligned to 16-byte addresses. Fixes #4812	2 years ago
Nick Fitzgerald	f18a1f1488	Cranelift: Deduplicate ABI signatures during lowering (#4829 ) * Cranelift: Deduplicate ABI signatures during lowering This commit creates the `SigSet` type which interns and deduplicates the ABI signatures that we create from `ir::Signature`s. The ABI signatures are now referred to indirectly via a `Sig` (which is a `cranelift_entity` ID), and we pass around a `SigSet` to anything that needs to access the actual underlying `SigData` (which is what `ABISig` used to be). I had to change a couple methods to return a `SmallInstVec` instead of emitting directly to work around what would otherwise be shared and exclusive borrows of the lowering context overlapping. I don't expect any of these to heap allocate in practice. This does not remove the often-unnecessary allocations caused by `ensure_struct_return_ptr_is_returned`. That is left for follow up work. This also opens the door for further shuffling of signature data into more efficient representations in the future, now that we have `SigSet` to store it all in one place and it is threaded through all the code. We could potentially move each signature's parameter and return vectors into one big vector shared between all signatures, for example, which could cut down on allocations and shrink the size of `SigData` since those `SmallVec`s have pretty large inline capacity. Overall, this refactoring gives a 1-7% speedup for compilation on `pulldown-cmark`: ``` compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm Δ = 8754213.66 ± 7526266.23 (confidence = 99%) dedupe.so is 1.01x to 1.07x faster than main.so! [191003295 234620642.20 280597986] dedupe.so [197626699 243374855.86 321816763] main.so compilation :: cycles :: benchmarks/bz2/benchmark.wasm No difference in performance. [170406200 194299792.68 253001201] dedupe.so [172071888 193230743.11 223608329] main.so compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm No difference in performance. [3870997347 4437735062.59 5216007266] dedupe.so [4019924063 4424595349.24 4965088931] main.so ``` * Use full path instead of import to avoid warnings in some build configurations Warnings will then cause CI to fail. * Move `SigSet` into `VCode`	2 years ago
Alex Crichton	62c5af68b5	components: Limit the recursive size of types in Wasmtime (#4825 ) * components: Limit the recursive size of types in Wasmtime This commit is aimed at fixing #4814 by placing a hard limit on the maximal recursive depth a type may have in the component model. The component model theoretically allows for infinite recursion but many various types of operations within the component model are naturally written as recursion over the structure of a type which can lead to stack overflow with deeply recursive types. Some examples of recursive operations are: * Lifting and lowering a type - currently the recursion here is modeled in Rust directly with `#[derive]` implementations as well as the implementations for the `Val` type. * Compilation of adapter trampolines which iterates over the type structure recursively. * Historically many various calculations like the size of a type, the flattened representation of a type, etc, were all done recursively. Many of these are more efficiently done via other means but it was still natural to implement these recursively initially. By placing a hard limit on type recursion Wasmtime won't be able to load some otherwise-valid modules. The hope, though, is that no human-written program is likely to ever reach this limit. This limit can be revised and/or the locations with recursion revised if it's ever reached. The implementation of this feature is done by generalizing the current flattened-representation calculation which now keeps track of a type's depth and size. The size calculation isn't used just yet but I plan to use it in fixing #4816 and it was natural enough to write here as well. The depth is checked after a type is translated and if it exceeds the maximum then an error is returned. Additionally the `Arbitrary for Type` implementation was updated to prevent generation of a type that's too-recursive. Closes #4814 * Remove unused size calculation * Bump up just under the limit	2 years ago
Alex Crichton	99c6d7c083	components: Improve heuristic for splitting adapters (#4827 ) This commit is a (second?) attempt at improving the generation of adapter modules to avoid excessively large functions for fuzz-generated inputs. The first iteration of adapters simply translated an entire type inline per-function. This proved problematic however since the size of the adapter function was on the order of the overall size of a type, which can be exponential for a type that is otherwise defined in linear size. The second iteration of adapters performed a split where memory-based types would always be translated with individual functions. The theory here was that once a type was memory-based it was large enough to not warrant inline translation in the original function and a separate outlined function could be shared and otherwise used to deduplicate portions of the original giant function. This again proved problematic, however, since the splitting heuristic was quite naive and didn't take into account large stack-based types. This third iteration in this commit replaces the previous system with a similar but slightly more general one. Each adapter function now has a concept of fuel which is decremented each time a layer of a type is translated. When fuel runs out further translations are deferred to outlined functions. The fuel counter should hopefully provide a sort of reasonable upper bound on the size of a function and the outlined functions should ideally provide the ability to be called from multiple places and therefore deduplicate what would otherwise be a massive function. This final iteration is another attempt at guaranteeing that an adapter module is linear in size with respect to the input type section of the original module. Additionally this iteration uniformly handles stack and memory-based translations which means that stack-based translations can't go wild in their function size and memory-based translations may benefit slightly from having at least a little bit of inlining internally. The immediate impact of this is that the `component_api` fuzzer seems to be running at a faster rate than before. Otherwise #4825 is sufficient to invalidate preexisting fuzz-bugs and this PR is hopefully the final nail in the coffin to prevent further timeouts for small inputs cropping up. Closes #4816	2 years ago
Trevor Elliott	fb8b9838fe	Add MInst.XmmUnaryRmRImm to handle rounding instructions (#4823 ) Add a new pseudo-instruction, XmmUnaryRmRImm, to handle instructions like roundss that only use their first register argument for the instruction's result. This has the added benefit of allowing the isle wrappers for those instructions to take an XmmMem argument, allowing for more cases where loads may be merged.	2 years ago
Afonso Bordado	cf7cb10036	cranelift: Add some filetests documentation (#4833 )	2 years ago
Chris Fallin	186c7c3b89	x64: clean up regalloc-related semantics on several instructions. (#4811 ) * x64: clean up regalloc-related semantics on several instructions. This PR removes all uses of "modify" operands on instructions in the x64 backend, and also removes all uses of "pinned vregs", or vregs that are explicitly tied to particular physical registers. In place of both of these mechanisms, which are legacies of the old regalloc design and supported via compatibility code, the backend now uses operand constraints. This is more flexible as it allows the regalloc to see the liveranges and constraints without "reverse-engineering" move instructions. Eventually, after removing all such uses (including in other backends and by the ABI code), we can remove the compatibility code in regalloc2, significantly simplifying its liverange-construction frontend and thus allowing for higher confidence in correctness as well as possibly a bit more compilation speed. Curiously, there are a few extra move instructions now; they are likely poor splitting decisions and I can try to chase these down later. * Fix cranelift-codegen tests. * Review feedback.	2 years ago
Afonso Bordado	3ce3eeb668	cranelift: Register all functions in test file for interpreter (#4817 ) * cranelift: Implement `bnot` in interpreter * cranelift: Register all functions in test file for interpreter * cranelift: Relax signature checking for bools and vectors	2 years ago
Trevor Elliott	da0d8781b5	Add a template for fuzz bugs (#4808 ) Add a template for fuzz bugs Co-authored-by: Jamey Sharp <jsharp@fastly.com>	2 years ago
Nick Fitzgerald	ff0e84ecf4	Wasmtime: fix stack walking across frames from different stores (#4779 ) We were previously implicitly assuming that all Wasm frames in a stack used the same `VMRuntimeLimits` as the previous frame we walked, but this is not true when Wasm in store A calls into the host which then calls into Wasm in store B: \| ... \| \| Host \| \| +-----------------+ \| stack \| Wasm in store A \| \| grows +-----------------+ \| down \| Host \| \| +-----------------+ \| \| Wasm in store B \| V +-----------------+ Trying to walk this stack would previously result in a runtime panic. The solution is to push the maintenance of our list of saved Wasm FP/SP/PC registers that allow us to identify contiguous regions of Wasm frames on the stack deeper into `CallThreadState`. The saved registers list is now maintained whenever updating the `CallThreadState` linked list by making the `CallThreadState::prev` field private and only accessible via a getter and setter, where the setter always maintains our invariants.	2 years ago
Alex Crichton	09c93c70cc	Remove the `ansi_term` transitive dependency (#4822 ) Only used during tests but this resolves #4742 by slimming the dependency tree.	2 years ago
Chris Fallin	1a59b3e6c6	AArch64: port `tls_value` to ISLE. (#4821 )	2 years ago
Trevor Elliott	b033aba61b	Move the nop lowering to ISLE, and remove the final return from lower.rs (#4809 ) Lower nop in ISLE in the x64 backend, and remove the final Ok(()) from the lower function to assert that all cases that aren't handled in ISLE will panic.	2 years ago
Damian Heaton	3d9d759380	Port `fcmp` to ISLE (AArch64) (#4819 ) Ported the existing implementation of `fcmp` for AArch64 to ISLE. This also ports the `lower_vector_comparison` method to ISLE. Copyright (c) 2022 Arm Limited	2 years ago
TheGreatRambler	e910b8fbfb	Android support (#4606 ) * Add android aarch64 support into c-api * Remove target test and clean up CMake script c-api * Deduplicate ExternalProject_Add in c-api Android support	2 years ago
Bobby Holley	52d88facdd	Import cargo-vet audits from Mozilla (#4792 ) * Bump cargo-vet to 0.3. * Add Mozilla as a trusted import for audits.	2 years ago
Chris Fallin	b1fb4d7c35	Fix lowering issue in x64 vany_true: sinking and using original value. (#4815 ) The x64 lowring of `vany_true` both sinks mergeable loads and uses the original register. This PR fixes the lowering to force the value into a register first. Ideally we should solve the issue by catching this in the ISLE type system, as described in #4745, but this resolves the issue for now. Fixes #4807.	2 years ago
Chris Fallin	2b4b257834	Revert "cranelift: Register all functions in test file for interpreter (#4800 )" (#4810 ) This reverts commit `500a9f17be`.	2 years ago
Chris Fallin	955d4e4ba1	AArch64: port load and store operations to ISLE. (#4785 ) This retains `lower_amode` in the handwritten code (@akirilov-arm reports that there is an upcoming patch to port this), but tweaks it slightly to take a `Value` rather than an `Inst`.	2 years ago
Trevor Elliott	5d05d7676f	Improve the `fmt` output of the instantiate fuzz target (#4804 ) Add an Arbitrary instance for the input to the instantiate fuzz target, so that cargo fuzz fmt instantiate <file> produces more meaningful output.	2 years ago
Afonso Bordado	500a9f17be	cranelift: Register all functions in test file for interpreter (#4800 ) * cranelift: Implement `bnot` in interpreter * cranelift: Register all functions in test file for interpreter	2 years ago
Nick Fitzgerald	5392d7cdd7	cranelift: Merge `abi` and `abi_impl` modules (#4805 )	2 years ago
Jamey Sharp	dd81e5a64f	Don't let fuzz targets import `arbitrary` directly (#4806 ) The version of the `arbitrary` crate used in fuzz targets needs to be the same as the version used in `libfuzzer-sys`. That's why the latter crate re-exports the former. But we need to make sure to consistently use the re-exported version. That's most easily done if that's the only version we have available. However, `fuzz/Cargo.toml` declared a direct dependency on `arbitrary`, making it available for import, and leading to that version being used in a couple places. There were two copies of `arbitrary` built before, even though they were the same version: one with the `derive` feature turned on, through the direct dependency, and one with it turned off when imported through `libfuzzer-sys`. So I haven't specifically tested this but fuzzer builds might be slightly faster now. I have not removed the build-dep on `arbitrary`, because `build.rs` is not invoked by libFuzzer and so it doesn't matter what version of `arbitrary` it uses. Our other crates, like `cranelift-fuzzgen` and `wasmtime-fuzzing`, can still accidentally use a different version of `arbitrary` than the fuzz targets which rely on them. This commit only fixes the direct cases within `fuzz/**`.	2 years ago
Jamey Sharp	4882347868	Disable funcref generation for fuzz tests with inputs (#4797 ) This fixes #4757, fixes #4758, and fixes new fuzzbugs that are probably coming after we merged #4667.	2 years ago
Afonso Bordado	07767c3d4a	cranelift: Enable i128 shifts (#4783 )	2 years ago
Afonso Bordado	7663cc1c3d	cranelift: Disable i128 divs on fuzzgen (#4771 )	2 years ago
Afonso Bordado	9a8bd5be02	cranelift: Add LibCalls to the interpreter (#4782 ) * cranelift: Add libcall handlers to interpreter * cranelift: Fuzz IshlI64 libcall * cranelift: Revert back to fuzzing udivi64 * cranelift: Use sdiv as a fuzz libcall * cranelift: Register Sdiv in fuzzgen * cranelift: Add multiple libcalls to fuzzer * cranelift: Register a single libcall handler * cranelift: Simplify args checking in interpreter * cranelift: Remove unused LibCalls * cranelift: Cleanup interpreter libcall types * cranelift: Fix Interpreter Docs	2 years ago
Chris Fallin	a6eb24bd4f	AArch64: port misc ops to ISLE. (#4796 ) * Add some precise-output compile tests for aarch64. * AArch64: port misc ops to ISLE. - get_pinned_reg / set_pinned_reg - bitcast - stack_addr - extractlane - insertlane - vhigh_bits - iadd_ifcout - fcvt_low_from_sint	2 years ago
Johnnie Birch	6368c6b188	Modifies fcvt_to_sint and fcvt_to_unit clif to make scalar only (#4794 ) Closes #4693.	2 years ago
Jamey Sharp	573ae0c60b	cranelift-fuzzgen: use a different namespace (#4795 ) Otherwise I get a panic with "Duplicate function with name u0:1 found!" at fuzz/fuzz_targets/cranelift-fuzzgen.rs:76:10.	2 years ago
Trevor Elliott	25d960f9c4	x64: Lower tlsvalue, sqmul_round_sat, and uunarrow in ISLE (#4793 ) Lower tlsvalue, sqmul_round_sat, and uunarrow in ISLE.	2 years ago

1 2 3 4 5 ...

10380 Commits (release-2.0.0) All Branches Search

10380 Commits (release-2.0.0)

All Branches