cranelift

Author	SHA1	Message	Date
Alex Crichton	434e35c490	Panic on resetting image slots back to anonymous memory (#3841 ) * Panic on resetting image slots back to anonymous memory This commit updates `Drop for MemoryImageSlot` to panic instead of ignoring errors when resetting memory back to a clean slate. On reading some of this code again for a different change I realized that if an error happens in `reset_with_anon_memory` it would be possible, depending on where another error happened, to leak memory from one image to another. For example if `clear_and_remain_ready` failed its `madvise` (for whatever reason) and didn't actually reset any memory, then if `Drop for MemoryImageSlot` also hit an error trying to remap memory (for whatever reason), then nothing about memory has changed and when the `MemoryImageSlot` is recreated it'll think that it's 0-length when actually it's a bit larger and may leak data. I don't think this is a serious problem since we don't know any situation under which the `madvise` would fail and/or the resetting with anonymous memory, but given that these aren't expected to fail I figure it's best to be a bit more defensive here and/or loud about failures. * Update a comment	3 years ago
Sam Parker	5b7df72bce	[AArch64] Merge 32- and 64-bit BitOps (#3840 ) Copyright (c) 2022, Arm Limited.	3 years ago
Sam Parker	d307a4ab9a	[AArch64] Improve AtomicRMWLoop (#3839 ) Add more tests, use accurate disassembly, respect data sizes and simplify the Xchg implementation. Copyright (c) 2022, Arm Limited	3 years ago
bjorn3	141af7523a	Allow cloning DataDescriptor (#3377 )	3 years ago
Alex Crichton	01e567ca05	Downgrade a cpu feature log message (#3842 ) It looks like `error!` is printed by default as it's showing up in oss-fuzz logs, so downgrade this to `warn!` to avoid printing while fuzzing.	3 years ago
Andrew Brown	f87c61176a	x64: port select to ISLE (#3682 ) * x64: port `select` using an FP comparison to ISLE This change includes quite a few interlocking parts, required mainly by the current x64 conventions in ISLE: - it adds a way to emit a `cmove` with multiple OR-ing conditions; because x64 ISLE cannot currently safely emit a comparison followed by several jumps, this adds `MachInst::CmoveOr` and `MachInst::XmmCmoveOr` macro instructions. Unfortunately, these macro instructions hide the multi-instruction sequence in `lower.isle` - to properly keep track of what instructions consume and produce flags, @cfallin added a way to pass around variants of `ConsumesFlags` and `ProducesFlags`--these changes affect all backends - then, to lower the `fcmp + select` CLIF, this change adds several `cmove*_from_values` helpers that perform all of the awkward conversions between `Value`, `ValueReg`, `Reg`, and `Gpr/Xmm`; one upside is that now these lowerings have much-improved documentation explaining why the various `FloatCC` and `CC` choices are made the way they are. Co-authored-by: Chris Fallin <chris@cfallin.org>	3 years ago
Andrew Brown	5a5e401a9c	doc: fix typo (#3838 )	3 years ago
Alex Crichton	bbd4a4a500	Enable copy-on-write heap initialization by default (#3825 ) * Enable copy-on-write heap initialization by default This commit enables the `Config::memfd` feature by default now that it's been fuzzed for a few weeks on oss-fuzz, and will continue to be fuzzed leading up to the next release of Wasmtime in early March. The documentation of the `Config` option has been updated as well as adding a CLI flag to disable the feature. * Remove ubiquitous "memfd" terminology Switch instead to forms of "memory image" or "cow" or some combination thereof. * Update new option names	3 years ago
Alex Crichton	593f8d96aa	Update wasm-{smith,encoder} (#3835 ) Ended up being a routine update but seemed good to go ahead and hook up updates. While I was at it I went ahead and hooked up multi-value swarm fuzzing as well now that wasm-smith implements it.	3 years ago
Alex Crichton	76a90d082a	Only queue up one triage task at a time on CI (#3834 ) Triage is scheduled to run once every 5 minutes but it's often queued up during the day as builders are otherwise occupied with actual CI builds. This can end up in a lot of triage tasks queued up back-to-back. While this doesn't seem to be a huge issue one thing I suspect is that this is perhaps somewhat related to API rate limits getting hit when recent versions were published. In any case there's no need for each and every triage run to do something, it's fine to only have one at a time pending.	3 years ago
Alex Crichton	709f7e0c8a	Enable SSE 4.2 unconditionally (#3833 ) * Enable SSE 4.2 unconditionally Fuzzing over the weekend found that `i64x2` comparison operators require `pcmpgtq` which is an SSE 4.2 instruction. Along the lines of #3816 this commit unconditionally enables and requires SSE 4.2 for compilation and fuzzing. It will no longer be possible to create a compiler for x86_64 with simd enabled if SSE 4.2 is disabled. * Update comment	3 years ago
Chris Fallin	43d31c5bf7	memfd: make "dense image" heuristic limit configurable. (#3831 ) In #3820 we see an issue with the new heuristics that control use of memfd: it's entirely possible for a reasonable Wasm module produced by a snapshotting system to have a relatively sparse heap (less than 50% filled). A system that avoids memfd because of this would have an undesirable performance reduction on such modules. Ultimately we should try to implement a hybrid scheme where we support outlier/leftover initializers, but for now this PR makes the "always allow dense" limit configurable. This way, embedders that want to ensure that memfd is used can do so, if they have other knowledge about the maximum heap size allowed in their system. (Partially addresses #3820 but let's leave it open to track the hybrid idea)	3 years ago
bjorn3	4ed353a7e1	Extract jit_int.rs and most of jitdump_linux.rs for use outside of wasmtime (#2744 ) * Extract gdb jit_int into wasmtime-jit-debug * Move a big chunk of the jitdump code to wasmtime-jit-debug * Fix doc markdown in perf_jitdump.rs	3 years ago
Alex Crichton	2616c28957	Allow failures when uploading release artifacts (#3832 ) Looks like the 0.34.1 release is missing artifacts and some jobs building artifacts ended up being cancelled because of API rate limits being hit on the builders. Artifacts are uploaded to the job, however, which means we can always go back and grab them to upload them, unless the whole job was cancelled. For 0.34.1 it looks like the Linux builder hit an error but its error then subsequently cancelled the Windows builders, so we don't actually have artifacts for Windows for the 0.34.1 release. This will hopefully prevent this from causing further issues in the future where if one builder hits an error while uploading artifacts the others will continue and we can manually upload what's missing if necessary. cc #3812	3 years ago
Andrew Brown	c183e93b80	x64: enable VTune support by default (#3821 ) * x64: enable VTune support by default After significant work in the `ittapi-rs` crate, this dependency should build without issue on Wasmtime's supported operating systems: Windows, Linux, and macOS. The difference in the release binary is <20KB, so this change makes `vtune` a default build feature. This change upgrades `ittapi-rs` to v0.2.0 and updates the documentation. * review: add configuration for defaults in more places * review: remove OS conditional compilation, add architecture * review: do not default vtune feature in wasmtime-jit	3 years ago
bjorn3	bbd52772de	Make VMOffset calculation more readable (#3793 ) * Fix typo * Move vmoffset field size and field name together The previous code was quite confusing about what applied to which field. The new code also makes it easier to move fields around and insert and delete fields. * Move builtin_functions before all variable sized fields This allows the offset to be calculated at compile time * Add cadd and cmul convenience functions * Remove comment * Change fields! syntax as per review * Add implicit u32::from to fields!	3 years ago
Peter Huene	084452acab	Fix max memory pages for spectests fuzz target. (#3829 ) This commit fixes the spectests fuzz target to set a lower bound on the arbitrary pooling allocator configurations of 10 memory pages so that the limit doesn't interfere with what's required in the spec tests.	3 years ago
bjorn3	2ca01ae947	Add a way to define a symbol lookup fn for the JIT (#2745 ) * Couple of cranelift-jit cleanups * Add a way to define a symbol lookup fn for the JIT	3 years ago
Kyle Brown	5ff1ddee5b	Mention --invoke on "CLI Options for `wasmtime`" page (#3828 ) * Document the invoke argument of the run command. * Update docs/cli-options.md Co-authored-by: Kyle Brown <kyleb@liquidrocketry.com> Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>	3 years ago
Chris Fallin	8ab07fe51a	Add Wasmtime meeting minutes for 2022-02-17. (#3826 ) Also move the verified fuzzing interpreter agenda item to today's agenda from Mar 17, since it actually was discussed today.	3 years ago
Alex Crichton	f425eb7ea5	Limit total memory usage in instantiate-many fuzzer (#3823 ) Per-`Store` allocations are already limited with the `StoreLimits` structure while fuzzing to ensure fuzz targets don't allocate more than 1GB of memory, but the `instantiate-many` fuzzer created many separate stores which each had their own limit, meaning that the 2GB limit of fuzzing could be pretty easily reached. This commit fixes the issue by making `StoreLimits` a shareable type via `Rc` to ensure the same limits can be applied to all stores created within a fuzz run, globally limiting the memory even across stores to 1GB.	3 years ago
Alex Crichton	37b0fd482d	Improve platform compatibility of fuzz test cases (#3824 ) In #3800 I added support to consume fuzz input as selection of whether or not target features should be enabled. This was done in a platform-specific manner, however, which means that I can no longer reliably take the fuzz reproducer cases from oss-fuzz and reproduce them locally on an aarch64 machine. This commit fixes this problem by unconditionally pulling bytes from the input for fuzz features, irrespective of the host platform. Features are then discarded if they're not applicable.	3 years ago
Sam Parker	e572198f85	[AArch64] Merge 32- and 64-bit ALUOps (#3802 ) Combine the two opcodes into one and add an OperandSize field to these instructions, as well as an ISLE helper to perform the conversion from Type. This saves us from having to write ISLE helpers to select the correct opcode, based on type, and reduces the amount of code needed for emission. Copyright (c) 2022, Arm Limited.	3 years ago
Alex Crichton	b62fe21914	Update memfd image construction to avoid excessively large images (#3819 ) * Update memfd image construction to avoid excessively large images Previously memfd-based image construction had a hard limit of a 1GB memory image but this meant that tiny wasm modules could allocate up to 1GB of memory which became a bit excessive especially in terms of memory usage during fuzzing. To fix this the conversion to a static memory image has been updated to first do a conversion to paged memory initialization, which is sparse, followed by a second conversion to static memory initialization. The sparse construction for the paged step should make it such that the upper/lower bounds of the initialization image are easily computed, and then afterwards this limit can be checked against some heuristics to determine if we're willing to commit to building up a whole static image for that module. The heuristics have been tweaked from "must be less than 1GB" to one of two conditions must be true: * Either the total memory image size is at most twice the size of the original paged data itself. * Otherwise the memory image size must be smaller than a reasonable threshold, currently 1MB. We'll likely need to tweak this over time and it's still possible to cause a lot of extra memory consumption, but for now this should be enough to appease the fuzzers. Closes #3815 * Review comments	3 years ago
Corey Farwell	9c3d5c7b9f	Remove broken Markdown link (#3822 )	3 years ago
Chris Fallin	1c014d129a	Cranelift: ensure ISA level needed for SIMD is present when SIMD is enabled. (#3816 ) Addresses #3809: when we are asked to create a Cranelift backend with shared flags that indicate support for SIMD, we should check that the ISA level needed for our SIMD lowerings is present.	3 years ago
Peter Huene	ef17a36852	Port fix for `CVE-2022-23636` to `main`. (#3818 ) * Port fix for `CVE-2022-23636` to `main`. This commit ports the fix for `CVE-2022-23636` to `main`, but performs a refactoring that makes it unnecessary for the instance itself to track if it has been initialized; such a change was not targeted enough for a security patch. The pooling allocator will now only initialize an instance if all of its associated resource creation succeeds. If the resource creation fails, no instance is dropped as none was initialized. Also updates `RELEASES.md` to include the related patch releases. * Add `Instance::new_at` to fully initialize an instance. Added `Instance::new_at` to fully initialize an instance at a given address. This will hopefully prevent the possibility that an `Instance` structure doesn't have an initialized `VMContext` when it is dropped.	3 years ago
Chris Fallin	96e32e98f8	Cancel Cranelift meeting on 2022-02-21 (due to US holiday). (#3817 )	3 years ago
Alex Crichton	498c592b19	Unconditionally enable sse3, ssse3, and sse4.1 when fuzzing (#3814 ) * Unconditionally enable sse3, ssse3, and sse4.1 when fuzzing This commit unconditionally enables some x86_64 instructions when fuzzing because the cranelift backend is known to not work if these features are disabled. From discussion on the wasm simd proposal the assumed general baseline for running simd code is SSE4.1 anyway. At this time I haven't added any sort of checks in Wasmtime itself. Wasmtime by default uses the native architecture and when explicitly enabling features this still needs to be explicitly specified. Closes #3809 * Update crates/fuzzing/src/generators.rs Co-authored-by: Andrew Brown <andrew.brown@intel.com> Co-authored-by: Andrew Brown <andrew.brown@intel.com>	3 years ago
Nick Fitzgerald	db9e3ce9d9	CI: fix workflow syntax for PR/issues triage workflow (#3806 )	3 years ago
Nick Fitzgerald	57618f746a	Add messages for config changes (#3803 ) * Automatically label Wasmtime `Config` object changes * Add the "Label Messager" github action This allows us to have custom messages left in comments on issues and pull requests when they get labeled with a specific label. * Add a message for `wasmtime:config`-labeled pull requests * CI: Consolidate issue/PR triage workflows	3 years ago
Peter Huene	6ffcd4ead9	Improve stability for fuzz targets. (#3804 ) This commit improves the stability of the fuzz targets by ensuring the generated configs and modules are congruent, especially when the pooling allocator is being used. For the `differential` target, this means both configurations must use the same allocation strategy for now as one side generates the module that might not be compatible with another arbitrary config now that we fuzz the pooling allocator. These changes also ensure that constraints put on the config are more consistently applied, especially when using a fuel-based timeout.	3 years ago
Alex Crichton	0b4263333b	Fuzz cranelift cpu flag settings with Wasmtime (#3800 ) * Fuzz cranelift cpu flag settings with Wasmtime This commit updates the `Config` fuzz-generator to consume some of the input as configuration settings for codegen flags we pass to cranelift. This should allow for ideally some more coverage where settings are disabled or enabled, ideally finding possible bugs in feature-specific implementations or generic implementations that are rarely used if the feature-specific ones almost always take precedence. The technique used in this commit is to weight selection of codegen settings less frequently than using the native settings. Afterwards each listed feature is individually enabled or disabled depending on the input fuzz data, and if a feature is enabled but the host doesn't actually support it then the fuzz input is rejected with a log message. The goal here is to still have many fuzz inputs accepted but also ensure determinism across hosts. If there's a bug specifically related to enabling a flag then running it on a host without the flag should indicate that the flag isn't supported rather than silently leaving it disabled and reporting the fuzz case a success. * Use built-in `Unstructured::ratio` method * Tweak macro * Bump arbitrary dep version	3 years ago
Cameron Harris	85cf4b042a	Added 'add_fuel' command line option (#3792 ) * Added 'add_fuel' command line option * Added default value to 'add_fuel' config option * Added 'add_fuel' to run command Store instantiation * Added comment * Added warning for add-fuel without consume-fuel * Formatting * Changed out --add-fuel and --consume-fuel to --fuel * Formatting * Update src/lib.rs * Update src/commands/run.rs Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>	3 years ago
Chris Fallin	ca0e8d0a1d	Remove incomplete/unmaintained ARM32 backend (for now). (#3799 ) In #3721, we have been discussing what to do about the ARM32 backend in Cranelift. Currently, this backend supports only 32-bit types, which is insufficient for full Wasm-MVP; it's missing other critical bits, like floating-point support; and it has only ever been exercised, AFAIK, via the filetests for the individual CLIF instructions that are implemented. We were very very thankful for the original contribution of this backend, even in its partial state, and we had hoped at the time that we could eventually mature it in-tree until it supported e.g. Wasm and other use-cases. But that hasn't yet happened -- to the blame of no-one, to be clear, we just haven't had a contributor with sufficient time. Unfortunately, the existence of the backend and lack of active maintainer now potentially pose a bit of a burden as we hope to make continuing changes to the backend framework. For example, the ISLE migration, and the use of regalloc2 that it will allow, would need all of the existing lowering patterns in the hand-written ARM32 backend to be rewritten as ISLE rules. Given that we don't currently have the resources to do this, we think it's probably best if we, sadly, for now remove this partial backend. This is not in any way a statement of what we might accept in the future, though. If, in the future, an ARM32 backend updated to our latest codebase with an active maintainer were to appear, we'd be happy to merge it (and likewise for any other architecture!). But for now, this is probably the best path. Thanks again to the original contributor @jmkrauz and we hope that this work can eventually be brought back and reused if someone has the time to do so!	3 years ago
Nick Fitzgerald	dc86e7a6dc	cranelift: Use GPR newtypes extensively in x64 lowering (#3798 ) We already defined the `Gpr` newtype and used it in a few places, and we already defined the `Xmm` newtype and used it extensively. This finishes the transition to using the newtypes extensively in lowering by making use of `Gpr` in more places. Fixes #3685	3 years ago
Mrmaxmeier	84b9c7bb8a	cranelift/x64: lower min and max for <= `i64` (#3748 ) * cranelift/x64: lower min and max for <= `i64` * cranelift: add runtests for integer min/max	3 years ago
Peter Huene	da539255a5	Use a much lower memory page limit for pooling allocator fuzzing. (#3795 ) This commit makes it such that the pooling allocator will be configured with a much lower upper bound for memory pages, which will greatly reduce the likelihood that the fuzzer memory limits will be hit from having too many memories from too many instances committed.	3 years ago
Conrad Watt	db2fec46bd	Agenda item 03-17 (#3797 )	3 years ago
wackbyte	05ace6c0e2	Fix a typo in `cranelift-frontend`'s docs (#3796 ) Specifically that of `Variable`.	3 years ago
Alex Crichton	b438617e12	Further minor optimizations to instantiation (#3791 ) * Shrink the size of `FuncData` Before this commit on a 64-bit system the `FuncData` type had a size of 88 bytes and after this commit it has a size of 32 bytes. A `FuncData` is required for all host functions in a store, including those inserted from a `Linker` into a store used during linking. This means that instantiation ends up creating a nontrivial number of these types and pushing them into the store. Looking at some profiles there were some surprisingly expensive movements of `FuncData` from the stack to a vector for moves-by-value generated by Rust. Shrinking this type enables more efficient code to be generated and additionally means less storage is needed in a store's function array. For instantiating the spidermonkey and rustpython modules this improves instantiation by 10% since they each import a fair number of host functions and the speedup here is relative to the number of items imported. * Use `ptr::copy_nonoverlapping` during initialization Previously `ptr::copy` was used for copying imports into place which translates to `memmove`, but `ptr::copy_nonoverlapping` can be used here since it's statically known these areas don't overlap. While this doesn't end up having a performance difference it's something I kept noticing while looking at the disassembly of `initialize_vmcontext` so I figured I'd go ahead and implement. * Indirect shared signature ids in the VMContext This commit is a small improvement for the instantiation time of modules by avoiding copying a list of `VMSharedSignatureIndex` entries into each `VMContext`, instead building one inside of a module and sharing that amongst all instances. This involves fewer lookups at instantiation time and less movement of data during instantiation. 
The downside is that type-checks on `call_indirect` now involve an additional load, but I'm assuming that these are somewhat pessimized enough as-is that the runtime impact won't be much there. For instantiation performance this is a 5-10% win with rustpython/spidermonkey instantiation. This should also reduce the size of each `VMContext` for an instantiation since signatures are no longer stored inline but shared amongst all instances with one module. Note that one subtle change here is that the array of `VMSharedSignatureIndex` was previously indexed by `TypeIndex`, and now it's indexed by `SignatureIndex` which is a deduplicated form of `TypeIndex`. This is done because we already had a list of those lying around in `Module`, so it was easier to reuse that than to build a separate array and store it somewhere. * Reserve space in `Store<T>` with `InstancePre` This commit updates the instantiation process to reserve space in a `Store<T>` for the functions that an `InstancePre<T>`, as part of instantiation, will insert into it. Using an `InstancePre<T>` to instantiate allows pre-computing the number of host functions that will be inserted into a store, and by pre-reserving space we can avoid costly reallocations during instantiation by ensuring the function vector has enough space to fit everything during the instantiation process. Overall this makes instantiation of rustpython/spidermonkey about 8% faster locally. * Fix tests * Use checked arithmetic	3 years ago
Alex Crichton	c0c368d151	Use mmap'd `.cwasm` as a source for memory initialization images (#3787 ) Skip memfd creation with precompiled modules This commit updates the memfd support internally to not actually use a memfd if a compiled module originally came from disk via the `wasmtime::Module::deserialize_file` API. In this situation we already have a file descriptor open and there's no need to copy a module's heap image to a new file descriptor. To facilitate a new source of `mmap` the currently-memfd-specific-logic of creating a heap image is generalized to a new form of `MemoryInitialization` which is attempted for all modules at module-compile-time. This means that the serialized artifact to disk will have the memory image in its entirety waiting for us. Furthermore the memory image is ensured to be padded and aligned carefully to the target system's page size, notably meaning that the data section in the final object file is page-aligned and the size of the data section is also page aligned. This means that when a precompiled module is mapped from disk we can reuse the underlying `File` to mmap all initial memory images. This means that the offset-within-the-memory-mapped-file can differ for memfd-vs-not, but that's just another piece of state to track in the memfd implementation. In the limit this waters down the term "memfd" for this technique of quickly initializing memory because we no longer use memfd unconditionally (only when the backing file isn't available). This does however open up an avenue in the future to porting this support to other OSes because while `memfd_create` is Linux-specific both macOS and Windows support mapping a file with copy-on-write. This porting isn't done in this PR and is left for a future refactoring. Closes #3758 * Enable "memfd" support on all unix systems Cordon off the Linux-specific bits and enable the memfd support to compile and run on platforms like macOS which have a Linux-like `mmap`. 
This only works if a module is mapped from a precompiled module file on disk, but that's better than not supporting it at all! * Fix linux compile * Use `Arc<File>` instead of `MmapVecFileBacking` * Use a named struct instead of mysterious tuples * Comment about unsafety in `Module::deserialize_file` * Fix tests * Fix uffd compile * Always align data segments No need to have conditional alignment since their sizes are all aligned anyway * Update comment in build.rs * Use rustix, not `region` * Fix some confusing logic/names around memory indexes These functions all work with memory indexes, not specifically defined memory indexes.	3 years ago
Alex Crichton	1cb08d4e67	Minor instantiation benchmark updates (#3790 ) This commit has a few minor updates and some improvements to the instantiation benchmark harness: * A `once_cell::unsync::Lazy` type is now used to guard creation of modules/engines/etc. This enables running singular benchmarks to be much faster since the benchmark no longer compiles all other benchmarks that are filtered out. Unfortunately I couldn't find a way in criterion to test whether a `BenchmarkId` is filtered out or not so we rely on the runtime laziness to initialize on the first run for benchmarks that do so. * All files located in `benches/instantiation` are now loaded for benchmarking instead of a hardcoded list. This makes it a bit easier to throw files into the directory and have them benchmarked instead of having to recompile when working with new files. * Finally a module deserialization benchmark was added to measure the time it takes to deserialize a precompiled module from disk (inspired by discussion on #3787) While I was at it I also upped some limits to be able to instantiate cfallin's `spidermonkey.wasm`.	3 years ago
Alex Crichton	520a7f26d7	Move function names out of `Module` (#3789 ) * Move function names out of `Module` This commit moves function names in a module out of the `wasmtime_environ::Module` type and into separate sections stored in the final compiled artifact. Spurred on by #3787 to look at module load times I noticed that a huge amount of time was spent in deserializing this map. The `spidermonkey.wasm` file, for example, has a 3MB name section which is a lot of unnecessary data to deserialize at module load time. The names of functions are now split out into their own dedicated section of the compiled artifact and metadata about them is stored in a more compact format at runtime by avoiding a `BTreeMap` and instead using a sorted array. Overall this improves deserialize times by up to 80% for modules with large name sections since the name section is no longer deserialized at load time and it's lazily paged in as names are actually referenced. * Fix a typo * Fix compiled module determinism Need to not only sort afterwards but also first to ensure the data of the name section is consistent.	3 years ago
Peter Huene	41eb225765	Add the instance allocation strategy to generated fuzzing configs. (#3780 ) * Add the instance allocation strategy to generated fuzzing configs. This commit adds support for generating configs with arbitrary instance allocation strategies. With this, the pooling allocator will be fuzzed as part of the existing fuzz targets. * Refine maximum constants for arbitrary module limits. * Add an `instantiate-many` fuzz target. This commit adds a new `instantiate-many` fuzz target that will attempt to instantiate and terminate modules in an arbitrary order. It generates up to 5 modules, from which a random sequence of instances will be created. The primary beneficiary of this fuzz target is the pooling instance allocator. * Allow no aliasing in generated modules when using the pooling allocator. This commit prevents aliases in the generated modules as they might count against the configured import limits of the pooling allocator. As the existing module linking proposal implementation will eventually be deprecated in favor of the component model proposal, it isn't very important that we test aliases in generated modules with the pooling allocator. * Improve distribution of memory config in fuzzing. The previous commit attempted to provide a 32-bit upper bound to 64-bit arbitrary values, which skewed the distribution heavily in favor of the upper bound. This commit removes the constraint and instead uses arbitrary 32-bit values that are converted to 64-bit values in the `Arbitrary` implementation.	3 years ago
Alex Crichton	027dea549a	Fuzz using precompiled modules on CI (#3788 ) In working on #3787 I see now that our coverage of loading precompiled files specifically is somewhat lacking, so this adds a config option to the fuzzers where, if enabled, will round-trip all compiled modules through the filesystem to test out the mmapped-file case.	3 years ago
Dan Gohman	f2bf254a79	Update to cap-std 0.24.1, fixing compilation on Rust nightly. (#3786 ) Other than doc updates, this just contains bytecodealliance/cap-std#235, a fix for compilation errors on Rust nightly that look like this: ``` error[E0308]: mismatched types --> cap-primitives/src/fs/via_parent/rename.rs:22:58 \| 22 \| let (old_dir, old_basename) = open_parent(old_start, &old_path)?; \| ^^^^^^^^^ expected struct `Path`, found opaque type \| ::: cap-primitives/src/rustix/fs/dir_utils.rs:67:48 \| 67 \| pub(crate) fn strip_dir_suffix(path: &Path) -> impl Deref<Target = Path> + '_ { \| ------------------------------ the found opaque type \| = note: expected struct `Path` found opaque type `impl Deref<Target = Path>` ```	3 years ago
Chris Fallin	39a52ceb4f	Implement lazy funcref table and anyfunc initialization. (#3733 ) During instance initialization, we build two sorts of arrays eagerly: - We create an "anyfunc" (a `VMCallerCheckedAnyfunc`) for every function in an instance. - We initialize every element of a funcref table with an initializer to a pointer to one of these anyfuncs. Most instances will not touch (via call_indirect or table.get) all funcref table elements. And most anyfuncs will never be referenced, because most functions are never placed in tables or used with `ref.func`. Thus, both of these initialization tasks are quite wasteful. Profiling shows that a significant fraction of the remaining instance-initialization time after our other recent optimizations is going into these two tasks. This PR implements two basic ideas: - The anyfunc array can be lazily initialized as long as we retain the information needed to do so. For now, in this PR, we just recreate the anyfunc whenever a pointer is taken to it, because doing so is fast enough; in the future we could keep some state to know whether the anyfunc has been written yet and skip this work if redundant. This technique allows us to leave the anyfunc array as uninitialized memory, which can be a significant savings. Filling it with initialized anyfuncs is very expensive, but even zeroing it is expensive: e.g. in a large module, it can be >500KB. - A funcref table can be lazily initialized as long as we retain a link to its corresponding instance and function index for each element. A zero in a table element means "uninitialized", and a slowpath does the initialization. Funcref tables are a little tricky because funcrefs can be null. We need to distinguish "element was initially non-null, but user stored explicit null later" from "element never touched" (ie the lazy init should not blow away an explicitly stored null). 
We solve this by stealing the LSB from every funcref (anyfunc pointer): when the LSB is set, the funcref is initialized and we don't hit the lazy-init slowpath. We insert the bit on storing to the table and mask it off after loading. We do have to set up a precomputed array of `FuncIndex`s for the table in order for this to work. We do this as part of the module compilation. This PR also refactors the way that the runtime crate gains access to information computed during module compilation. Performance effect measured with in-tree benches/instantiation.rs, using SpiderMonkey built for WASI, and with memfd enabled: ``` BEFORE: sequential/default/spidermonkey.wasm time: [68.569 us 68.696 us 68.856 us] sequential/pooling/spidermonkey.wasm time: [69.406 us 69.435 us 69.465 us] parallel/default/spidermonkey.wasm: with 1 background thread time: [69.444 us 69.470 us 69.497 us] parallel/default/spidermonkey.wasm: with 16 background threads time: [183.72 us 184.31 us 184.89 us] parallel/pooling/spidermonkey.wasm: with 1 background thread time: [69.018 us 69.070 us 69.136 us] parallel/pooling/spidermonkey.wasm: with 16 background threads time: [326.81 us 337.32 us 347.01 us] WITH THIS PR: sequential/default/spidermonkey.wasm time: [6.7821 us 6.8096 us 6.8397 us] change: [-90.245% -90.193% -90.142%] (p = 0.00 < 0.05) Performance has improved. sequential/pooling/spidermonkey.wasm time: [3.0410 us 3.0558 us 3.0724 us] change: [-95.566% -95.552% -95.537%] (p = 0.00 < 0.05) Performance has improved. parallel/default/spidermonkey.wasm: with 1 background thread time: [7.2643 us 7.2689 us 7.2735 us] change: [-89.541% -89.533% -89.525%] (p = 0.00 < 0.05) Performance has improved. parallel/default/spidermonkey.wasm: with 16 background threads time: [147.36 us 148.99 us 150.74 us] change: [-18.997% -18.081% -17.285%] (p = 0.00 < 0.05) Performance has improved. 
parallel/pooling/spidermonkey.wasm: with 1 background thread time: [3.1009 us 3.1021 us 3.1033 us] change: [-95.517% -95.511% -95.506%] (p = 0.00 < 0.05) Performance has improved. parallel/pooling/spidermonkey.wasm: with 16 background threads time: [49.449 us 50.475 us 51.540 us] change: [-85.423% -84.964% -84.465%] (p = 0.00 < 0.05) Performance has improved. ``` So an improvement of something like 80-95% for a very large module (7420 functions in its one funcref table, 31928 functions total).	3 years ago
Peter Huene	1b27508a42	Fix incorrect use of `MemoryIndex` in the pooling allocator. (#3782 ) This commit corrects a few places where `MemoryIndex` was used and treated like a `DefinedMemoryIndex` in the pooling instance allocator. When the unstable `multi-memory` proposal is enabled, it is possible to cause a newly allocated instance to use an incorrect base address for any defined memories by having the module being instantiated also import a memory. This requires enabling the unstable `multi-memory` proposal, configuring the use of the pooling instance allocator (not the default), and then configuring the module limits to allow imported memories (also not the default). The fix is to replace all uses of `MemoryIndex` with `DefinedMemoryIndex` in the pooling instance allocator. Several `debug_assert!` have also been updated to `assert!` to sanity check the state of the pooling allocator even in release builds.	3 years ago
Ulrich Weigand	10198553c7	ISLE: Common accessors for some insn data fields (#3781 ) Add accessors to prelude.isle to access data fields of `func_addr` and `symbol_value` instructions. These are based on similar versions I had added to the s390x back-end, but are a bit more straightforward to use. - func_ref_data: Extract SigRef, ExternalName, and RelocDistance fields given a FuncRef. - symbol_value_data: Extract ExternalName, RelocDistance, and offset fields given a GlobalValue representing a Symbol. - reloc_distance_near: Test for RelocDistance::Near. The s390x back-end is changed to use these common versions. Note that this exposed a bug in common isle code: This extractor: (extractor (load_sym inst) (and inst (load _ (def_inst (symbol_value (symbol_value_data _ (reloc_distance_near) offset))) (i64_from_offset (memarg_symbol_offset_sum <offset _))))) would raise an assertion in sema.rs due to a supposed cycle in extractor definitions. But there was no actual cycle, it was simply that the extractor tree refers twice to the `insn_data` extractor (once via the `load` and once via the `symbol_value` extractor). Fixed by checking for pre-existing definitions only along one path in the tree, not across the whole tree.	3 years ago
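
The low-bit tagging scheme described in the message of commit 39a52ceb4f above can be sketched roughly as follows. This is a minimal illustration, not Wasmtime's actual code: `LazyFuncTable`, `INIT_BIT`, `initial`, and `lazy_inits` are hypothetical names, plain `usize` values stand in for anyfunc pointers, and a vector of optional addresses stands in for the precomputed `FuncIndex` array.

```rust
// Hypothetical sketch of a lazily initialized funcref table.
// None of these names come from Wasmtime; they are illustrative only.

/// Bit stolen from every stored funcref to mean "this slot is initialized".
const INIT_BIT: usize = 1;

struct LazyFuncTable {
    /// Raw slot words: 0 means "never touched"; otherwise `funcref | INIT_BIT`.
    slots: Vec<usize>,
    /// Precomputed initial value per element (assumed stand-in for the
    /// per-table `FuncIndex` array). `None` means the element starts as null.
    initial: Vec<Option<usize>>,
    /// Counts how often the slow path ran (for demonstration only).
    lazy_inits: usize,
}

impl LazyFuncTable {
    fn new(initial: Vec<Option<usize>>) -> Self {
        // All slots start as zero, i.e. "uninitialized".
        let slots = vec![0; initial.len()];
        Self { slots, initial, lazy_inits: 0 }
    }

    /// Store a funcref (0 represents null), inserting the tag bit on the way in.
    fn set(&mut self, i: usize, funcref: usize) {
        // The scheme only works if real pointers never use the low bit.
        debug_assert_eq!(funcref & INIT_BIT, 0);
        self.slots[i] = funcref | INIT_BIT;
    }

    /// Load a funcref, running the lazy-init slow path only for untouched slots.
    fn get(&mut self, i: usize) -> usize {
        if self.slots[i] & INIT_BIT == 0 {
            // Slow path: the slot was never written, so materialize its
            // initial value and mark it initialized.
            self.lazy_inits += 1;
            let value = self.initial[i].unwrap_or(0);
            self.slots[i] = value | INIT_BIT;
        }
        // Mask the tag bit off after loading.
        self.slots[i] & !INIT_BIT
    }
}
```

Because an explicitly stored null is kept as `0 | INIT_BIT`, it stays distinguishable from an untouched slot (plain `0`), so a later `get` returns null instead of re-running the initializer.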


9599 Commits (434e35c490cf174cfdd48c6d8010b5ad5f401ac5)
