18 Commits (3176f03ad50e43711e7400f0585ad9b6629c53ca)
Author | SHA1 | Message | Date |
---|---|---|---|
Alex Crichton |
61e11a6c53
|
Remove usage of `BTreeMap` for compiler flags (#7287)
* Remove usage of `BTreeMap` for compiler flags No need for a datastructure here really, a simple list with static strings works alright. * Fix winch compile and a warning * Fix test compile |
1 year ago |
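As a rough illustration of the change described in the commit above, the sketch below keeps compiler flags in a plain list of static string pairs rather than a `BTreeMap`. The `Flags` alias, the `example_flags` and `lookup` helpers, and the flag names are all assumptions made for this example; they are not Wasmtime's actual types or API.

```rust
/// Hypothetical flag list: (name, value) pairs with static lifetimes.
/// Not Wasmtime's real type; it only illustrates "a simple list with static strings".
type Flags = Vec<(&'static str, &'static str)>;

/// Build an example flag list. The flag names are made up for illustration.
fn example_flags() -> Flags {
    vec![
        ("opt_level", "speed"),
        ("enable_verifier", "false"),
    ]
}

/// Linear lookup is enough for a handful of flags; no ordered map is needed.
fn lookup(flags: &[(&'static str, &'static str)], name: &str) -> Option<&'static str> {
    flags.iter().find(|(k, _)| *k == name).map(|(_, v)| *v)
}
```

With only a few entries, a linear scan over a `Vec` is typically as fast as a map lookup and avoids the extra data structure, which is the trade-off the commit message hints at.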
Saúl Cabrera |
4f47f3ecaf
|
winch: Add a subset of known libcalls and improve call emission (#7228)
* winch: Add a subset of known libcalls and improve call emission This change is a follow-up to: - https://github.com/bytecodealliance/wasmtime/pull/7155 - https://github.com/bytecodealliance/wasmtime/pull/7035 One of the objectives of this change is to make it easy to emit function calls at the MacroAssembler layer, for cases in which it's challenging to know ahead-of-time if a particular functionality can be achieved natively (e.g. rounding and SSE4.2). The original implementation of function call emission made this objective difficult to achieve and was also difficult to reason about. I decided to simplify the overall approach to function calls as part of this PR; in essence, the `call` module now exposes a single function `FnCall::emit` which is responsible for gathering the dependencies and orchestrating the emission of the call. This new approach deliberately avoids holding any state regarding the function call for simplicity. This change also standardizes the usage of `Callee` as the main entrypoint for function call emission. As of this change, four `Callee` types exist (`Local`, `Builtin`, `Import`, `FuncRef`); each callee kind is mappable to a `CalleeKind`, which is the materialized version of a callee that Cranelift understands. This change also moves the creation of the `BuiltinFunctions` to the `ISA` level given that they can be safely used across multiple function compilations. Finally, this change also introduces support for some of the "well-known" libcalls and hooks those libcalls at the `MacroAssembler::float_round` callsite. -- prtest:full * Review comments * Remove unnecessary `into_iter` * Fix remaining lifetime parameter names |
1 year ago |
Saúl Cabrera |
4b288ba88d
|
winch(x64): Call indirect (#7100)
* winch(x64): Call indirect This change adds support for the `call_indirect` instruction to Winch. Libcalls are a prerequisite for supporting `call_indirect`, since they are needed to lazily initialize funcrefs. This change adds support for libcalls to Winch by introducing a `BuiltinFunctions` struct similar to Cranelift's `BuiltinFunctionSignatures` struct. In general, libcalls are handled like any other function call, with the only difference that given that not all the information to fulfill the function call might be known up-front, control is given to the caller for finalizing the call. The introduction of function references also involves dealing with pointer-sized loads and stores, so this change also adds the required functionality to `FuncEnv` and `MacroAssembler` to be pointer aware, making it straightforward to derive an `OperandSize` or `WasmType` from the target's pointer size. Finally, given the complexity of the `call_indirect` instruction, this change bundles an improvement to the register allocator, allowing it to track the allocatable vs non-allocatable registers; this is done to avoid any mistakes when allocating or deallocating registers that are not allocatable. -- prtest:full * Address review comments * Fix typos * Better documentation for `new_unchecked` * Introduce `max` for `BitSet` * Make allocatable property `u64` * winch(calls): Overhaul `FnCall` This commit simplifies `FnCall`'s interface making its usage more uniform throughout the compiler. In summary, this change: * Avoids side effects in the `FnCall::new` constructor, and also makes it the only constructor. * Exposes `FnCall::save_live_registers` and `FnCall::calculate_call_stack_space` to calculate the stack space consumed by the call, so that the caller can decide which one to use at callsites depending on their use case. * tests: Fix regset tests |
1 year ago |
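The register-allocator improvement mentioned in the commit above (tracking allocatable versus non-allocatable registers, with the allocatable property stored as a `u64`) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the `RegSet` type, its fields and its methods are hypothetical and are not Winch's actual implementation, and register indices are assumed to be below 64.

```rust
/// Hypothetical register set: bit `i` set in `free` means register `i` is free.
/// `allocatable` records which registers may ever be handed out.
/// Register indices are assumed to be < 64.
struct RegSet {
    free: u64,
    allocatable: u64,
}

impl RegSet {
    /// All allocatable registers start out free.
    fn new(allocatable: u64) -> Self {
        Self { free: allocatable, allocatable }
    }

    /// Take the lowest-numbered free register, if any.
    fn alloc(&mut self) -> Option<u32> {
        if self.free == 0 {
            return None;
        }
        let reg = self.free.trailing_zeros();
        self.free &= !(1u64 << reg);
        Some(reg)
    }

    /// Return a register to the set. Non-allocatable registers are ignored,
    /// so they can never end up in the free pool by mistake.
    fn release(&mut self, reg: u32) {
        let bit = 1u64 << reg;
        if self.allocatable & bit != 0 {
            self.free |= bit;
        }
    }
}
```

Keeping the allocatable mask separate from the free mask means that releasing a reserved register (a frame pointer, say) is a no-op rather than a silent corruption of the pool.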
Nick Fitzgerald |
868f0c381c
|
Wasmtime: Add support for Wasm tail calls (#6774)
* Wasmtime: Add support for Wasm tail calls This adds the `Config::wasm_tail_call` method and `--wasm-features tail-call` CLI flag to enable the Wasm tail calls proposal in Wasmtime. This PR is mostly just plumbing and enabling tests, since all the prerequisite work (Wasmtime trampoline overhauls and Cranelift tail calls) was completed in earlier pull requests. When Wasm tail calls are enabled, Wasm code uses the `tail` calling convention. The `tail` calling convention is known to cause a 1-7% slow down for regular code that isn't using tail calls, which is why it isn't used unconditionally. This involved shepherding `Tunables` through to Wasm signature construction methods. The eventual plan is for the `tail` calling convention to be used unconditionally, but not until the performance regression is addressed. This work is tracked in https://github.com/bytecodealliance/wasmtime/issues/6759 Additionally while our x86-64, aarch64, and riscv64 backends support tail calls, the s390x backend does not support them yet. Attempts to use tail calls on s390x will return errors. Support for s390x is tracked in https://github.com/bytecodealliance/wasmtime/issues/6530 * Store `Tunables` inside the `Compiler` Instead of passing as an argument to every `Compiler` method. * Cranelift: Support "direct" return calls on riscv64 They still use `jalr` instead of `jal` but this allows us to use the `RiscvCall` reloc, which Wasmtime handles. Before we were using `LoadExternalName` which produces an `Abs8` reloc, which Wasmtime intentionally does not handle since that involves patching code at runtime, which makes loading code slower. * Fix tests that assume tail call support on s390x |
1 year ago |
Alex Crichton |
5a6ed0fbd2
|
Implement component model resources in Wasmtime (#6691)
* Fix signatures registered with modules-in-components This commit fixes a minor issue in `FunctionIndices::link_and_append_code` which previously ended up only filling out the `wasm_to_native_trampolines` field for the first module rather than all the modules. Additionally the first module might have too many entries that encompass all modules instead of just its own entries. The fix in this commit is to refactor this logic to ensure that the necessary maps are present for all translations. While technically a bug that can be surfaced through the embedder API it's pretty obscure. The given test here panics beforehand but succeeds afterwards, but this is moreso prep for some future resource-related work where this map will need persisting into the component metadata side of things. * Initial support for resources Lots of bits and pieces squashed into this commit. Much to be done still. * Start supporting destructors * Get some basic drop tests working Also add a test which requires host-defined drop to be called which isn't working. * Fix rebase issue * Fix a failing test * I am zorthax, destroyer of resources * Remove a branch in compiled code No need to check for a null funcref when we already know ahead of time if it's ever going to be null or not. * Fix the test suite * Add embedder API to destroy resources * Add TODO for factc * Fix a warning and leave a comment * Integrate resources into `Type` Plumb around dynamic information about resource types. * Implement `Val::Own` * Implement reentrance check for destructors Implemented both in the raw wasm intrinsic as well as the host. * Use cast instead of transmute * Fill out some cranelift-shared comments * Update codegen for resource.drop shim The MAY_ENTER flag must always be checked, regardless of whether there's an actual destructor or not. 
* Update wasm-tools crates to latest `main` * Update resource.drop binary format * Add some docs * Implement dynamic tracking for borrow resources Not actually hooked up anywhere but this should at least be a first stab at an implementation of the spec. * Remove git overrides * Remove no-longer-needed arms in wit-bindgen * Prepare for mutability in `LiftContext` * Change `&LiftContext` to `&mut LiftContext` * Remove `store: &'a StoreOpaque` from `LiftContext`, instead storing just `memory: &'a [u8]` * Refactor methods to avoid needing the entire `StoreOpaque` This'll enable `LiftContext` to store `&'a mut ResourceTable` in an upcoming commit to refer to the host's resources. * Lowering a borrow is infallible * Use `ResourceAny` for both own/borrow Rename `Val::Own` to `Val::Resource` accordingly. * Initial implementation of borrowed resources Lots of juggling of contexts here and there to try and get everything working but this is hopefully a faithful implementation. Tests not implemented yet and will come next and additionally likely update implementation details as issues are weeded out. * Add a suite of tests for borrowing resources Code coverage was used to ensure that almost all of the various paths through the code are taken to ensure all the basic bases are covered. There's probably still lurking bugs, but this should be a solid enough base to start from hopefully. * Fill in an issue for bindgen todo * Add docs, still more to go * Fill out more documentation * Fill out a test TODO * Update the host `Resource<T>` type * Add docs everywhere * Don't require a `Store` for creating the resource or getting the representation. The latter point is the main refactoring in this commit. This is done in preparation for `bindgen!` to use this type where host bindings generally do not have access to the store. 
* Document `ResourceAny` * Debug assert dtor is non-null * Review comments on loading libcalls * Update some comments * Update a comment * Fix some typos * Add a test that host types are the same when guest types differ * Fix some typos * Thread things through a bit less * Undo CompileKey-related changes * Gate an async function on the async feature * Fix doc links * Skip resources tests in miri They all involve compilation which takes too long and doesn't currently work |
1 year ago |
Luna P-C |
92024ad117
|
Function references (#5288)
* Make wasmtime-types type check * Make wasmtime-environ type check. * Make wasmtime-runtime type check * Make cranelift-wasm type check * Make wasmtime-cranelift type check * Make wasmtime type check * Make wasmtime-wast type check * Make testsuite compile * Address Luna's comments * Restore compatibility with effect-handlers/wasm-tools#func-ref-2 * Add function refs feature flag; support testing * Provide function references support in helpers - Always support Index in blocktypes - Support Index as table type by pretending to be Func - Etc * Implement ref.as_non_null * Add br_on_null * Update Cargo.lock to use wasm-tools with peek This will ultimately be reverted when we refer to wasm-tools#function-references, which doesn't have peek, but does have type annotations on CallRef * Add call_ref * Support typed function references in ref.null * Implement br_on_non_null * Remove extraneous flag; default func refs false * Use IndirectCallToNull trap code for call_ref * Factor common call_indirect / call_ref into a fn * Remove copypasta clippy attribute / format * Add some more tests for typed table instructions There certainly need to be many more, but this at least catches the bugs fixed in the next commit * Fix missing typed cases for table_grow, table_fill * Document trap code; remove answered question * Mark wasm-tools to wasmtime reftype infallible * Fix reversed conditional * Scope externref/funcref shorthands within WasmRefType * Merge with upstream * Make wasmtime compile again * Fix warnings * Remove Bot from the type algebra * Fix table tests. `wast::Cranelift::spec::function_references::table` `wast::Cranelift::spec::function_references::table_pooling` * Fix table{get,set} tests. 
``` wast::Cranelift::misc::function_references::table_get wast::Cranelift::misc::function_references::table_get_pooling wast::Cranelift::misc::function_references::table_set wast::Cranelift::misc::function_references::table_set_pooling ``` * Insert subtype check to fix local_get tests. ``` wast::Cranelift::spec::function_references::local_get wast::Cranelift::spec::function_references::local_get_pooling ``` * Fix compilation of `br_on_non_null`. The branch destinations were the other way round... :-) Fixes the following test failures: ``` wast::Cranelift::spec::function_references::br_on_non_null wast::Cranelift::spec::function_references::br_on_non_null_pooling ``` * Fix ref_as_non_null tests. The test was failing due to the wrong error message being printed. As per upstream folks' suggestion, we were using the trap code `IndirectCallToNull`, but it produces an unexpected error message. This commit reinstates the `NullReference` trap code. It produces the expected error message. We will have to chat with the maintainers upstream about how to handle these "test failures". Fixes the following test failures: ``` wast::Cranelift::spec::function_references::ref_as_non_null wast::Cranelift::spec::function_references::ref_as_non_null_pooling ``` * Fix a call_ref regression. * Fix global tests. Extend `is_matching_assert_invalid_error_message` to circumvent the textual error message failure. Fixes the following test failures: ``` wast::Cranelift::spec::function_references::global wast::Cranelift::spec::function_references::global_pooling ``` * Cargo update * Update * Spell out some cases in match_val * Disgusting hack to subvert limitations of type reconstruction. The function `wasmtime::values::Val::ty()` attempts to reconstruct the type of its underlying value purely based on the shape of the value. With the function references proposal this sort of reconstruction is no longer complete as a source reference type may have been nullable. 
Nullability is not inferrable by looking at the shape of the runtime object alone. Consequently, the runtime cannot reconstruct the type for `Val::FuncRef` and `Val::ExternRef` by looking at their respective shapes. * Address workflows comments. * null reference => null_reference for CLIF parsing compliance. * Delete duplicate-loads-dynamic-memory-egraph (again) * Idiomatic code change. * Nullability subtyping + fix non-null storage check. This commit removes the `hacky_eq` check in `func.rs`. Instead it is replaced by a subtype check. This subtype check occurs in `externals.rs` too. This commit also fixes a bug. Previously, it was possible to store a null reference into a non-null table cell. I have added two new test cases for this bug: one for funcrefs and another for externrefs. * Trigger unimplemented for typed function references. Format values.rs * run cargo fmt * Explicitly match on HeapType::Extern. * Address cranelift-related feedback * Remove PartialEq,Eq from ValType, RefType, HeapType. * Pin wasmparser to a fairly recent commit. * Run cargo fmt * Ignore tail call tests. * Remove garbage * Revert changes to wasmtime public API. * Run cargo fmt * Get more CI passing (#19) * Undo Cargo.lock changes * Fix build of cranelift tests * Implement link-time matches relation. Disable tests failing due to lack of public API support. * Run cargo fmt * Run cargo fmt * Initial implementation of eager table initialization * Tidy up eager table initialisation * Cargo fmt * Ignore type-equivalence test * Replace TODOs with descriptive comments. * Various changes found during review (#21) * Clarify a comment This isn't only used for null references * Resolve a TODO in local init Don't initialize non-nullable locals to null, instead skip initialization entirely and wasm validation will ensure it's always initialized in the scope where it's used. * Clarify a comment and skipping the null check. 
* Remove a stray comment * Change representation of `WasmHeapType` Use a `SignatureIndex` instead of a `u32` which while not 100% correct should be more correct. This additionally renames the `Index` variant to `TypedFunc` to leave space for future types which aren't functions to not all go into an `Index` variant. This required updates to Winch because `wasmtime_environ` types can no longer be converted back to their `wasmparser` equivalents. Additionally this means that all type translation needs to go through some form of context to resolve indices which is now encapsulated in a `TypeConvert` trait implemented in various locations. * Refactor table initialization Reduce some duplication and simplify some data structures to have a more direct form of table initialization and a bit more graceful handling of element-initialized tables. Additionally element-initialized tables are now treated the same as if there's a large element segment initializing them. * Clean up some unrelated changes * Simplify Table bindings slightly * Remove a no-longer-needed TODO * Add a FIXME for `SignatureIndex` in `WasmHeapType` * Add a FIXME for panicking on exposing function-references types * Fix a warning on nightly * Fix tests for winch and cranelift * Cargo fmt * Fix arity mismatch in aarch64/abi --------- Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk> Co-authored-by: Daniel Hillerström <daniel.hillerstrom@huawei.com> Co-authored-by: Alex Crichton <alex@alexcrichton.com> |
1 year ago |
Saúl Cabrera |
afde47c214
|
winch: Drop `FuncEnv` trait (#6443)
This commit is a small cleanup to drop the usage of the `FuncEnv` trait. In https://github.com/bytecodealliance/wasmtime/pull/6358, we agreed on making `winch-codegen` directly depend on `wasmtime-environ`. Introducing a direct relationship between `winch-codegen` and `wasmtime-environ` means that the `FuncEnv` trait is no longer serving its original purpose, and we can drop the usage of the trait and use the types exposed from `winch-codegen` directly instead. Even though this change drops the `FuncEnv` trait, it still keeps a `FuncEnv` struct, which is used during code generation. |
1 year ago |
Saúl Cabrera |
7c6ec0ff1c
|
winch: Append traps to final object (#6387)
This change is a follow-up to https://github.com/bytecodealliance/wasmtime/pull/6298, in which the function traps were not added to the final object. |
2 years ago |
Saúl Cabrera |
20c5836295
|
winch: Implement new trampolines (#6358)
* winch: Implement new trampolines This change is a follow-up to https://github.com/bytecodealliance/wasmtime/pull/6262, in which the new trampolines, described [here](https://github.com/bytecodealliance/rfcs/blob/main/accepted/tail-calls.md#new-trampolines-and-vmcallercheckedanyfunc-changes), were introduced to Wasmtime. This change focuses on the `array-to-wasm`, `native-to-wasm` and `wasm-to-native` trampolines to restore Winch's working state prior to the introduction of the new trampolines. It's worth noting that the new approach for trampolines makes it easier to support the `TypedFunc` API in Winch. Prior to the introduction of the new trampolines, it was not obvious how to approach it. This change also introduces a pinned register that will hold the `VMContext` pointer, which is loaded in the `*-to-wasm` trampolines; the `VMContext` register is a prerequisite to this change to support the `wasm-to-native` trampolines. Lastly, with the introduction of the `VMContext` register and the `wasm-to-native` trampolines, this change also introduces support for calling function imports, which is a variation of the already existing calls to locally defined functions. The other notable piece of this change aside from the trampolines is `winch-codegen`'s dependency on `wasmtime-environ`. Winch is so closely tied to the concepts exposed by the wasmtime crates that it makes sense to tie them together. Even though the separation provides some advantages, like easier testing in some cases, in the long run there's probably going to be less need to test Winch in isolation, and we'd rather rely more on integration-style tests, which require all of Wasmtime's pieces anyway (fuzzing, spec tests, etc). This change doesn't update the existing implementation of `winch_codegen::FuncEnv`, but the intention is to update that part after this change. prtest:full * tests: Ignore miri in Winch integration tests * Remove hardcoded alignment and addend |
2 years ago |
Saúl Cabrera |
1cbca5a5c4
|
winch: Handle relocations and traps (#6298)
* winch: Handle relocations and traps This change introduces handling of traps and relocations in Winch, which was left out in https://github.com/bytecodealliance/wasmtime/pull/6119. In order to do so, this change moves the `CompiledFunction` struct to the `wasmtime-cranelift-shared` crate, allowing Cranelift and Winch to operate on a single, shared representation, following some of the ideas discussed in https://github.com/bytecodealliance/wasmtime/pull/5944. Even though Winch doesn't rely on all the fields of `CompiledFunction`, it eventually will. With the addition of relocations and traps it started to be more evident that even if we wanted to have different representations of a compiled function, they would end up being very similar. This change also consolidates where the `traps` and `relocations` of the `CompiledFunction` get created, by introducing a constructor that operates on a `MachBufferFinalized<Final>`, essentially encapsulating this process in a single place for both Winch and Cranelift. * Rework the shared `CompiledFunction` This commit reworks the shared `CompiledFunction` struct. The compiled function now contains the essential pieces to derive all the information to create the final object file and to derive the debug information for the function. This commit also decouples the dwarf emission process by introducing a `metadata` field in the `CompiledFunction` struct, which is used as the central structure for dwarf emission. |
2 years ago |
Alex Crichton |
fd6cc9a116
|
Slightly shrink compiled wasm modules (#6302)
* Slightly shrink compiled wasm modules This commit shuffles trampolines to the end of a compiled ELF file instead of interspersed throughout. Additionally trampolines are no longer given a higher alignment requirement than is required by the ISA as is given to functions since they're not perf critical. The savings here are quite minor, only 0.3% locally on spidermonkey.wasm. * Fix winch compile * Return a more descriptive `FunctionAlignment` from `TargetIsa` * Push alignment further into Cranelift Remove the need for taking a function's alignment and an ISA's alignment and combining them, instead only using the function's alignment as the source of truth. * Review comments |
2 years ago |
Nick Fitzgerald |
913efdf24d
|
wasmtime: Overhaul trampolines (#6262)
This commit splits `VMCallerCheckedFuncRef::func_ptr` into three new function pointers: `VMCallerCheckedFuncRef::{wasm,array,native}_call`. Each one has a dedicated calling convention, so callers just choose the version that works for them. This is as opposed to the previous behavior where we would chain together many trampolines that converted between calling conventions, sometimes up to four on the way into Wasm and four more on the way back out. See [0] for details. [0] https://github.com/bytecodealliance/rfcs/blob/main/accepted/tail-calls.md#a-review-of-our-existing-trampolines-calling-conventions-and-call-paths Thanks to @bjorn3 for the initial idea of having multiple function pointers for different calling conventions. This is generally a nice ~5-10% speed up to our call benchmarks across the board: both Wasm-to-host and host-to-Wasm. The one exception is typed calls from Wasm to the host, which have a minor regression. We hypothesize that this is because the old hand-written assembly trampolines did not maintain a call frame and do a tail call, but the new Cranelift-generated trampolines do maintain a call frame and do a regular call. The regression is only a couple nanoseconds, which seems well explained by these differences, and ultimately is not a big deal. However, this does lead to a ~5% code size regression for compiled modules. Before, we compiled a trampoline per escaping function's signature and we deduplicated these trampolines by signature. Now we compile two trampolines per escaping function: one for if the host calls via the array calling convention and one for if the host calls via the native calling convention. Additionally, we compile a trampoline for every type in the module, in case there is a native calling convention function from the host that we `call_indirect` of that type. Much of this is in the `.eh_frame` section in the compiled module, because each of our trampolines needs an entry there. 
Note that the `.eh_frame` section is not required for Wasmtime's correctness, and you can disable its generation to shrink compiled module code size; we just emit it to play nice with external unwinders and profilers. We believe there are code size gains available for follow up work to offset this code size regression in the future. Backing up a bit: the reason each Wasm module needs to provide these Wasm-to-native trampolines is because `wasmtime::Func::wrap` and friends allow embedders to create functions even when there is no compiler available, so they cannot bring their own trampoline. Instead the Wasm module has to supply it. This in turn means that we need to look up and patch in these Wasm-to-native trampolines during roughly instantiation time. But instantiation is super hot, and we don't want to add more passes over imports or any extra work on this path. So we integrate with `wasmtime::InstancePre` to patch these trampolines in ahead of time. Co-Authored-By: Jamey Sharp <jsharp@fastly.com> Co-Authored-By: Alex Crichton <alex@alexcrichton.com> prtest:full |
2 years ago |
Kevin Rizzo |
3a92aa3d0a
|
winch: Initial integration with wasmtime (#6119)
* Adding in trampoline compiling method for ISA * Adding support for indirect call to memory address * Refactoring frame to externalize defined locals, so it removes WASM dependencies in trampoline case * Adding initial version of trampoline for testing * Refactoring trampoline to be re-used by other architectures * Initial wiring for winch with wasmtime * Add a Wasmtime CLI option to select `winch` This is effectively an option to select the `Strategy` enumeration. * Implement `Compiler::compile_function` for Winch Hook this into the `TargetIsa::compile_function` hook as well. Currently this doesn't take into account `Tunables`, but that's left as a TODO for later. * Filling out Winch append_code method * Adding back in changes from previous branch Most of these are a WIP. It's missing trampolines for x64, but a basic one exists for aarch64. It's missing the handling of arguments that exist on the stack. It currently imports `cranelift_wasm::WasmFuncType` since it's what's passed to the `Compiler` trait. It's a bit awkward to use in the `winch_codegen` crate since it mostly operates on `wasmparser` types. I've had to hack in a conversion to get things working. Long term, I'm not sure it's wise to rely on this type but it seems like it's easier on the Cranelift side when creating the stub IR. 
* Small API changes to make integration easier * Adding in new FuncEnv, only a stub for now * Removing unneeded parts of the old PoC, and refactoring trampoline code * Moving FuncEnv into a separate file * More comments for trampolines * Adding in winch integration tests for first pass * Using new addressing method to fix stack pointer error * Adding test for stack arguments * Only run tests on x86 for now, it's more complete for winch * Add in missing documentation after rebase * Updating based on feedback in draft PR * Fixing formatting on doc comment for argv register * Running formatting * Lock updates, and turning on winch feature flags during tests * Updating configuration with comments to no longer gate Strategy enum * Using the winch-environ FuncEnv, but it required changing the sig * Proper comment formatting * Removing wasmtime-winch from dev-dependencies, adding the winch feature makes this not necessary * Update doc attr to include winch check * Adding winch feature to doc generation, which seems to fix the feature error in CI * Add the `component-model` feature to the cargo doc invocation in CI To match the metadata used by the docs.rs invocation when building docs. * Add a comment clarifying the usage of `component-model` for docs.rs * Correctly order wasmtime-winch and winch-environ in the publish script * Ensure x86 test dependencies are included in cfg(target_arch) * Further constrain Winch tests to x86_64 _and_ unix --------- Co-authored-by: Alex Crichton <alex@alexcrichton.com> Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com> |
2 years ago |
Kevin Rizzo |
013b35ff32
|
winch: Refactoring wasmtime compiler integration pieces to share more between Cranelift and Winch (#5944)
* Enable the native target by default in winch

  Match cranelift-codegen's build script where if no architecture is explicitly enabled then the host architecture is implicitly enabled.

* Refactor Cranelift's ISA builder to share more with Winch

  This commit refactors the `Builder` type to have a type parameter representing the finished ISA with Cranelift and Winch having their own typedefs for `Builder` to represent their own builders. The intention is to use this shared functionality to produce more shared code between the two codegen backends.

* Moving compiler shared components to a separate crate

* Restore native flag inference in compiler building

  This fixes an oversight from the previous commits to use `cranelift-native` to infer flags for the native host when using default settings with Wasmtime.

* Move `Compiler::page_size_align` into wasmtime-environ

  The `cranelift-codegen` crate doesn't need this and winch wants the same implementation, so shuffle it around so everyone has access to it.

* Fill out `Compiler::{flags, isa_flags}` for Winch

  These are easy enough to plumb through with some shared code for Wasmtime.

* Plumb the `is_branch_protection_enabled` flag for Winch

  Just forwarding an isa-specific setting accessor.

* Moving executable creation to shared compiler crate

* Adding builder back in and removing from shared crate

* Refactoring the shared pieces for the `CompilerBuilder`

  I decided to move a couple things around from Alex's initial changes. Instead of having the shared builder do everything, I went back to having each compiler have a distinct builder implementation. I refactored most of the flag setting logic into a single shared location, so we can still reduce the amount of code duplication. With them being separate, we don't need to maintain things like `LinkOpts` which Winch doesn't currently use. We also have an avenue to error when certain flags are sent to Winch if we don't support them.

  I'm hoping this will make things more maintainable as we build out Winch. I'm still unsure about keeping everything shared in a single crate (`cranelift_shared`). It's starting to feel like this crate is doing too much, which makes it difficult to name. There does seem to be a need for two distinct abstractions: creating the final executable and the handling of shared/ISA flags when building the compiler. I could make them into two separate crates, but there doesn't seem to be enough there yet to justify it.

* Documentation updates, and renaming the finish method

* Adding back in a default temporarily to pass tests, and removing some unused imports

* Fixing winch tests with wrong method name

* Removing unused imports from codegen shared crate

* Apply documentation formatting updates

  Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com>

* Adding back in cranelift_native flag inferring

* Adding new shared crate to publish list

* Adding write feature to pass cargo check

---------

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
Co-authored-by: Saúl Cabrera <saulecabrera@gmail.com> |
2 years ago |
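The idea of a `Builder` parameterized over the finished ISA, with one typedef per backend, can be pictured roughly as follows. This is only a minimal sketch under stated assumptions: `CraneliftIsa`, `WinchIsa`, `build_cranelift`, `build_winch`, and the string-based flag list are hypothetical stand-ins, not the types or functions used in Wasmtime, and the commit itself ultimately went back to separate builders that share flag-handling code.

```rust
/// A builder that is generic over the ISA type it finally produces.
/// Every name in this sketch is hypothetical.
struct Builder<T> {
    triple: String,
    flags: Vec<(String, String)>,
    // Backend-specific constructor invoked by `finish`.
    constructor: fn(&str, &[(String, String)]) -> Result<T, String>,
}

impl<T> Builder<T> {
    fn new(triple: &str, constructor: fn(&str, &[(String, String)]) -> Result<T, String>) -> Self {
        Builder { triple: triple.to_string(), flags: Vec::new(), constructor }
    }

    /// Shared flag-setting logic: both backends record flags the same way.
    fn set(&mut self, name: &str, value: &str) -> &mut Self {
        self.flags.push((name.to_string(), value.to_string()));
        self
    }

    /// Hand the collected settings to the backend-specific constructor.
    fn finish(&self) -> Result<T, String> {
        (self.constructor)(&self.triple, &self.flags)
    }
}

#[derive(Debug, PartialEq)]
struct CraneliftIsa { triple: String, flag_count: usize }

#[derive(Debug, PartialEq)]
struct WinchIsa { triple: String }

// Each backend gets its own typedef over the shared builder.
type CraneliftBuilder = Builder<CraneliftIsa>;
type WinchBuilder = Builder<WinchIsa>;

fn build_cranelift(triple: &str, flags: &[(String, String)]) -> Result<CraneliftIsa, String> {
    Ok(CraneliftIsa { triple: triple.to_string(), flag_count: flags.len() })
}

fn build_winch(triple: &str, flags: &[(String, String)]) -> Result<WinchIsa, String> {
    // Assumption for illustration: this backend rejects every flag it is sent.
    if let Some((name, _)) = flags.first() {
        return Err(format!("unsupported flag: {}", name));
    }
    Ok(WinchIsa { triple: triple.to_string() })
}

fn cranelift_builder(triple: &str) -> CraneliftBuilder {
    Builder::new(triple, build_cranelift)
}

fn winch_builder(triple: &str) -> WinchBuilder {
    Builder::new(triple, build_winch)
}
```

Keeping the constructor backend-specific is what leaves room for one backend to reject flags it does not support while the flag bookkeeping stays shared.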
Saúl Cabrera |
94b51cdb17
|
winch: Use cranelift-codegen x64 backend for emission. (#5581)
This change substitutes the string-based emission mechanism with cranelift-codegen's x64 backend.

This change _does not_:

* Introduce new functionality in terms of supported instructions.
* Change the semantics of the assembler/macroassembler in terms of the logic to emit instructions.

The most notable differences between this change and the previous version are:

* Handling of shared flags and ISA-specific flags, which for now are left with the default value.
* Simplification of instruction emission per operand size: previously the assembler defined different methods depending on the operand size (e.g. `mov` for 64 bits, and `movl` for 32 bits). This change updates that approach so that each assembler method takes an operand size as a parameter, reducing duplication and making the code more concise and easier to integrate with the x64's `Inst` enum.
* Introduction of a disassembler for testing purposes.

As of this change, Winch generates the following code for the following test programs:

```wat
(module
  (export "main" (func $main))
  (func $main (result i32)
    (i32.const 10)
    (i32.const 20)
    i32.add
  ))
```

```asm
   0: 55                        push rbp
   1: 48 89 e5                  mov rbp, rsp
   4: b8 0a 00 00 00            mov eax, 0xa
   9: 83 c0 14                  add eax, 0x14
   c: 5d                        pop rbp
   d: c3                        ret
```

```wat
(module
  (export "main" (func $main))
  (func $main (result i32)
    (local $foo i32)
    (local $bar i32)
    (i32.const 10)
    (local.set $foo)
    (i32.const 20)
    (local.set $bar)
    (local.get $foo)
    (local.get $bar)
    i32.add
  ))
```

```asm
   0: 55                        push rbp
   1: 48 89 e5                  mov rbp, rsp
   4: 48 83 ec 08               sub rsp, 8
   8: 48 c7 04 24 00 00 00 00   mov qword ptr [rsp], 0
  10: b8 0a 00 00 00            mov eax, 0xa
  15: 89 44 24 04               mov dword ptr [rsp + 4], eax
  19: b8 14 00 00 00            mov eax, 0x14
  1e: 89 04 24                  mov dword ptr [rsp], eax
  21: 8b 04 24                  mov eax, dword ptr [rsp]
  24: 8b 4c 24 04               mov ecx, dword ptr [rsp + 4]
  28: 01 c1                     add ecx, eax
  2a: 48 89 c8                  mov rax, rcx
  2d: 48 83 c4 08               add rsp, 8
  31: 5d                        pop rbp
  32: c3                        ret
```

```wat
(module
  (export "main" (func $main))
  (func $main (param i32) (param i32) (result i32)
    (local.get 0)
    (local.get 1)
    i32.add
  ))
```

```asm
   0: 55                        push rbp
   1: 48 89 e5                  mov rbp, rsp
   4: 48 83 ec 08               sub rsp, 8
   8: 89 7c 24 04               mov dword ptr [rsp + 4], edi
   c: 89 34 24                  mov dword ptr [rsp], esi
   f: 8b 04 24                  mov eax, dword ptr [rsp]
  12: 8b 4c 24 04               mov ecx, dword ptr [rsp + 4]
  16: 01 c1                     add ecx, eax
  18: 48 89 c8                  mov rax, rcx
  1b: 48 83 c4 08               add rsp, 8
  1f: 5d                        pop rbp
  20: c3                        ret
```
|
2 years ago |
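The move from one assembler method per operand size to a single method that takes the size as a parameter can be illustrated with a small, self-contained sketch. The `OperandSize`, `Reg`, `Inst`, and `Assembler` types below are hypothetical simplifications written for this note; they are not the definitions from `winch-codegen` or `cranelift-codegen`, and the textual "disassembly" is only a stand-in for real encoding.

```rust
/// Hypothetical operand sizes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OperandSize { S32, S64 }

/// Two registers are enough to illustrate the idea.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Reg { Rax, Rcx }

impl Reg {
    /// The register name depends on the operand size in use.
    fn name(self, size: OperandSize) -> &'static str {
        match (self, size) {
            (Reg::Rax, OperandSize::S64) => "rax",
            (Reg::Rax, OperandSize::S32) => "eax",
            (Reg::Rcx, OperandSize::S64) => "rcx",
            (Reg::Rcx, OperandSize::S32) => "ecx",
        }
    }
}

/// A tiny instruction enum; the size travels with the instruction.
#[derive(Debug, PartialEq, Eq)]
enum Inst {
    MovRR { src: Reg, dst: Reg, size: OperandSize },
    AddRR { src: Reg, dst: Reg, size: OperandSize },
}

#[derive(Default)]
struct Assembler { insts: Vec<Inst> }

impl Assembler {
    /// One `mov` method for every size, instead of `mov` plus `movl`.
    fn mov_rr(&mut self, src: Reg, dst: Reg, size: OperandSize) {
        self.insts.push(Inst::MovRR { src, dst, size });
    }

    /// One `add` method for every size.
    fn add_rr(&mut self, src: Reg, dst: Reg, size: OperandSize) {
        self.insts.push(Inst::AddRR { src, dst, size });
    }

    /// Render the buffered instructions as Intel-syntax text.
    fn disassemble(&self) -> Vec<String> {
        self.insts
            .iter()
            .map(|inst| match inst {
                Inst::MovRR { src, dst, size } => {
                    format!("mov {}, {}", dst.name(*size), src.name(*size))
                }
                Inst::AddRR { src, dst, size } => {
                    format!("add {}, {}", dst.name(*size), src.name(*size))
                }
            })
            .collect()
    }
}
```

Calling `add_rr(Reg::Rax, Reg::Rcx, OperandSize::S32)` followed by `mov_rr(Reg::Rcx, Reg::Rax, OperandSize::S64)` yields `add ecx, eax` and `mov rax, rcx`, the same pair of instructions that ends the listings above.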
Alex Crichton |
cd53bed898
|
Implement AOT compilation for components (#5160)
* Pull `Module` out of `ModuleTextBuilder`

  This commit is the first in what will likely be a number towards preparing for serializing a compiled component to bytes, a precompiled artifact. To that end my rough plan is to merge all of the compiled artifacts for a component into one large object file instead of having lots of separate object files and lots of separate mmaps to manage. To that end I plan on eventually using `ModuleTextBuilder` to build one large text section for all core wasm modules and trampolines, meaning that `ModuleTextBuilder` is no longer specific to one module. I've extracted out functionality such as function name calculation as well as relocation resolving (now a closure passed in) in preparation for this.

  For now this just keeps tests passing, and the trajectory for this should become more clear over the following commits.

* Remove component-specific object emission

  This commit removes the `ComponentCompiler::emit_obj` function in favor of `Compiler::emit_obj`, now renamed `append_code`. This involved significantly refactoring code emission to take a flat list of functions into `append_code` and the caller is responsible for weaving together various "families" of functions and un-weaving them afterwards.

* Consolidate ELF parsing in `CodeMemory`

  This commit moves the ELF file parsing and section iteration from `CompiledModule` into `CodeMemory` so one location keeps track of section ranges and such. This is in preparation for sharing much of this code with components which needs all the same sections to get tracked but won't be using `CompiledModule`. A small side benefit from this is that the section parsing done in `CodeMemory` and `CompiledModule` is no longer duplicated.

* Remove separately tracked traps in components

  Previously components would generate an "always trapping" function and the metadata around which pc was allowed to trap was handled manually for components. With recent refactorings the Wasmtime-standard trap section in object files is now being generated for components as well which means that can be reused instead of custom-tracking this metadata. This commit removes the manual tracking for the `always_trap` functions and plumbs the necessary bits around to make components look more like modules.

* Remove a now-unnecessary `Arc` in `Module`

  Not expected to have any measurable impact on performance, but complexity-wise this should make it a bit easier to understand the internals since there's no longer any need to store this somewhere else than its owner's location.

* Merge compilation artifacts of components

  This commit is a large refactoring of the component compilation process to produce a single artifact instead of multiple binary artifacts. The core wasm compilation process is refactored as well to share as much code as necessary with the component compilation process.

  This method of representing a compiled component necessitated a few medium-sized changes internally within Wasmtime:

  * A new data structure was created, `CodeObject`, which represents metadata about a single compiled artifact. This is then stored as an `Arc` within a component and a module. For `Module` this is always uniquely owned and represents a shuffling around of data from one owner to another. For a `Component`, however, this is shared amongst all loaded modules and the top-level component.

  * The "module registry" which is used for symbolicating backtraces and for trap information has been updated to account for a single region of loaded code holding possibly multiple modules. This involved adding a second-level `BTreeMap` for now. This will likely slow down instantiation slightly but if it poses an issue in the future this should be able to be represented with a more clever data structure.

  This commit additionally solves a number of longstanding issues with components such as compiling only one host-to-wasm trampoline per signature instead of possibly once-per-module. Additionally the `SignatureCollection` registration now happens once-per-component instead of once-per-module-within-a-component.

* Fix compile errors from prior commits

* Support AOT-compiling components

  This commit adds support for AOT-compiled components in the same manner as `Module`, specifically adding:

  * `Engine::precompile_component`
  * `Component::serialize`
  * `Component::deserialize`
  * `Component::deserialize_file`

  Internally the support for components looks quite similar to `Module`. All the prior commits to this made adding the support here (unsurprisingly) easy. Components are represented as a single object file as are modules, and the functions for each module are all piled into the same object file next to each other (as are areas such as data sections). Support was also added here to quickly differentiate compiled components vs compiled modules via the `e_flags` field in the ELF header.

* Prevent serializing exported modules on components

  The current representation of a module within a component means that the implementation of `Module::serialize` will not work if the module is exported from a component. The reason for this is that `serialize` doesn't actually do anything and simply returns the underlying mmap as a list of bytes. The mmap, however, has `.wasmtime.info` describing component metadata as opposed to this module's metadata. While rewriting this section could be implemented it's not so easy to do so and is otherwise seen as not super important of a feature right now anyway.

* Fix windows build

* Fix an unused function warning

* Update crates/environ/src/compilation.rs

  Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com>

Co-authored-by: Nick Fitzgerald <fitzgen@gmail.com> |
2 years ago |
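The two-level lookup described for the "module registry" (a region of loaded code that may hold several modules) might look roughly like the sketch below. It is an illustration only: `Registry`, `LoadedCode`, the inclusive end-address keys, and the string module names are assumptions made for this note, not Wasmtime's actual registry types.

```rust
use std::collections::BTreeMap;

/// One loaded region of code (for example, a whole compiled component).
/// Hypothetical type: the inner map is keyed by each module's last address
/// and stores the module's first address together with its name.
struct LoadedCode {
    start: usize,
    modules: BTreeMap<usize, (usize, &'static str)>,
}

/// Outer level: regions keyed by their last (inclusive) address.
#[derive(Default)]
struct Registry {
    regions: BTreeMap<usize, LoadedCode>,
}

impl Registry {
    /// Register a region `[start, end]` holding modules given as `(start, end, name)`.
    fn register(&mut self, start: usize, end: usize, modules: &[(usize, usize, &'static str)]) {
        let modules: BTreeMap<usize, (usize, &'static str)> =
            modules.iter().map(|&(s, e, name)| (e, (s, name))).collect();
        self.regions.insert(end, LoadedCode { start, modules });
    }

    /// Find the module containing `pc`, if any.
    fn lookup(&self, pc: usize) -> Option<&'static str> {
        // First level: the first region whose end address is >= pc.
        let (_, region) = self.regions.range(pc..).next()?;
        if pc < region.start {
            return None; // pc falls in a gap between regions
        }
        // Second level: the first module in that region whose end is >= pc.
        let (_, &(start, name)) = region.modules.range(pc..).next()?;
        if pc < start {
            return None; // pc is in the region but not inside any module
        }
        Some(name)
    }
}
```

Each lookup costs two ordered-map searches instead of one, which is in line with the commit's note that the extra level may slow things down slightly.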
Saúl Cabrera |
0ca3249afa
|
winch: Add license and update Cargo.toml (#5170)
This commit adds the appropriate license to `winch-codegen` and to `wasmtime-winch`; it also adds the authors entry to `winch-codegen` |
2 years ago |
Saúl Cabrera |
835abbcd11
|
Initial skeleton for Winch (#4907)
* Initial skeleton for Winch

  This commit introduces the initial skeleton for Winch, the "baseline" compiler.

  This skeleton contains mostly setup code for the ISA, ABI, registers, and compilation environment abstractions. It also includes the calculation of function local slots.

  As of this commit, the structure of these abstractions looks like the following:

                          +------------------------+
                          |                        v
      +----------+     +-----+     +-----------+-----+-----------------+
      | Compiler | --> | ISA | --> | Registers | ABI | Compilation Env |
      +----------+     +-----+     +-----------+-----+-----------------+
                          |                              ^
                          +------------------------------+

* Compilation environment will hold a reference to the function data

* Add basic documentation to the ABI trait

* Enable x86 and arm64 in cranelift-codegen

* Add reg_name function for x64

* Introduce the concept of a MacroAssembler and Assembler

  This commit introduces the concept of a MacroAssembler and Assembler. The MacroAssembler trait will provide a sufficiently high-level interface across architectures so that each ISA implementation can use its own low-level Assembler implementation to fulfill the interface. Each Assembler will provide a 1-1 mapping to each ISA instruction.

  As of this commit, only a partial debug implementation is provided for the x64 Assembler.

* Add a newtype over PReg

  Adds a newtype `Reg` over regalloc2::PReg; this ensures that Winch will operate only on the concept of `Reg`. This change is temporary until we have the necessary machinery to share a common Reg abstraction via `cranelift_asm`

* Improvements to local calculation

  - Add `LocalSlot::addressed_from_sp`
  - Use `u32` for local slot and local sizes calculation

* Add helper methods to ABIArg

  Adds helper methods to retrieve register and type information from the argument

* Make locals_size public in frame

* Improve x64 register naming depending on size

* Add new methods to the masm interface

  This commit introduces the ability for the MacroAssembler to reserve stack space, get the address of a given local and perform a stack store based on the concept of `Operand`s.

  There are several motivating factors to introduce the concept of an Operand:

  - Make the translation between Winch and Cranelift easier;
  - Make dispatching from the MacroAssembler to the underlying Assembler easier by minimizing the number of functions that we need to define in order to satisfy the store/load combinations

  This commit also introduces the concept of a memory address, which essentially describes the addressing modes; as of this commit only one addressing mode is supported. We'll also need to verify that this structure will play nicely with arm64.

* Blank masm implementation for arm64

* Implementation of reserve_stack, local_address, store and fp_offset for x64

* Implement function prologue and argument register spilling

* Add structopt and wat

* Fix debug instruction formatting

* Make TargetISA trait publicly accessible

* Modify the MacroAssembler finalize signature to return a slice of strings

* Introduce a simple CLI for Winch

  To be able to compile Wasm programs with Winch independently. Mostly meant for testing / debugging

* Fix bug in x64 assembler mov_rm

* Remove unused import

* Move the stack slot calculation to the Frame

  This commit moves the calculation of the stack slots to the frame handler abstraction and also includes the calculation of the limits for the function defined locals, which will be used to zero the locals that are not associated to function arguments

* Add i32 and i64 constructors to local slots

* Introduce the concept of DefinedLocalsRange

  This commit introduces `DefinedLocalsRange` to track the stack offset at which the function-defined locals start and end; this is later used to zero-out that stack region

* Add constructors for int and float registers

* Add a placeholder stack implementation

* Add a regset abstraction to track register availability

  Adds a bit set abstraction to track register availability for register allocation.

  The bit set has no specific knowledge about physical registers, it works on the register's hardware encoding as the source of truth.

  Each RegSet is expected to be created with the universe of allocatable registers per ISA when starting the compilation of a particular function.

* Add an abstraction over register and immediate

  This is meant to be used as the source for stores.

* Add a way to zero local slots and an initial skeleton of regalloc

  This commit introduces `zero_local_slots` to the MacroAssembler, which ensures that function defined locals are zeroed out when starting the function body.

  The algorithm divides the defined function locals stack range into 8 byte slots and stores a zero at each address. This process relies on register allocation if the number of slots that need to be initialized is greater than 1. In that case, the next available register is requested from the register set and it's used to store a 0, which is then stored at every local slot

* Update to wasmparser 0.92

* Correctly track if the regset has registers available

* Add a result entry to the ABI signature

  This commit introduces ABIResult as part of the ABISignature; this struct will track how function results are stored; initially it will consist of a single register that will be requested from the register allocator at the end of the function, potentially causing a spill

* Move zero local slots and add more granular methods to the masm

  This commit removes zeroing local slots from the MacroAssembler and instead adds more granular methods to it (e.g. `zero`, `add`).

  This allows for better code sharing since most of the work done by the algorithm for zeroing slots will be the same in all targets, except for the binary emission pieces, which is what gets delegated to the masm

* Use wasmparser's visitor API and add initial support for const and add

  This commit adds initial support for the I32Const and I32Add instructions; this involves adding a minimum for register allocation. Note that some regalloc pieces are still incomplete, since for the current set of supported instructions they are not needed.

* Make the ty field public in Local

* Add scratch_reg to the abi

* Add a method to get a particular local from the Frame

* Split the compilation environment abstraction

  This commit splits the compilation environment into two more concise abstractions:

  1. CodeGen: the main abstraction for code generation
  2. CodeGenContext: abstraction that shares the common pieces for compilation; these pieces are shared between the code generator and the register allocator

* Add `push` and `load` to the MacroAssembler

* Remove dead code warnings for unused paths

* Map ISA features to cranelift-codegen ISA features

* Apply formatting

* Fix Cargo.toml after a bad rebase

* Add component-compiler feature

* Use clap instead of structopt

* Add winch to publish.rs script

* Minor formatting

* Add tests to RegSet and fix two bugs when freeing and checking for register availability

* Add tests to Stack

* Free source register after a non-constant i32 add

* Improve comments

  - Remove unneeded comments
  - And improve some of the TODO items

* Update default features

* Drop the ABI generic param and pass the word_size information directly

  To avoid dealing with dead code warnings this commit passes the word size information directly, since it's the only piece of information needed from the ABI by Codegen until now

* Remove dead code

  This piece of code will be put back once we start integrating Winch with Wasmtime

* Remove unused enum variant

  This variant doesn't get constructed; it should be added back once a backend is added and not enabled by default or when Winch gets integrated into Wasmtime

* Fix unused code in regset tests

* Update spec testsuite

* Switch the visitor pattern for a simpler operator match

  This commit removes the usage of wasmparser's visitor pattern and instead defaults to a simpler operator matching approach. This removes the complexity of having to define all the visitor trait functions at once.

* Use wasmparser's Visitor trait with a different macro strategy

  This commit puts back wasmparser's Visitor trait, with a single, simpler macro, only used for unsupported operators.

* Restructure Winch

  This commit restructures Winch's parts. It divides the initial approach into three main crates: `winch-codegen`, `wasmtime-winch` and `winch-tools`.

  `wasmtime-winch` is responsible for the Wasmtime-Winch integration. `winch-codegen` is solely responsible for code generation. `winch-tools` is a CLI tool to compile Wasm programs, mainly for testing purposes.

* Refactor zero local slots

  This commit moves the logic of zeroing local slots from the codegen module into a method with a default implementation in the MacroAssembler trait: `zero_mem_range`.

  The refactored implementation is very similar to the previous implementation with the only difference that it doesn't allocate a general-purpose register; it instead uses the register allocator to retrieve the scratch register and uses this register to unroll the series of zero stores.

* Tie the codegen creation to the ISA ABI

  This commit makes the relationship between the ISA ABI and the codegen explicit. This allows us to pass down ABI-specific bits and pieces to code generation. In this case the only concrete piece that we need is the ABI word size.

* Mark winch as publishable directory

* Revamp winch docs

  This commit ensures that all the code comments in Winch are compliant with the style used in the rest of Wasmtime's codebase.

  It also generally improves the quality of the comments in some modules.

* Panic when using multi-value when the target is aarch64

  Similar to x64, this commit ensures that the abi signature of the current function doesn't use multi-value returns

* Document the usage of directives

* Use endianness instead of endianess in the ISA trait

* Introduce a three-argument form in the MacroAssembler

  This commit introduces the usage of three-argument form for the MacroAssembler interface. This allows for a natural mapping for architectures like aarch64. In the case of x64, the implementation can simply restrict itself by asserting equality of two of the arguments or by defaulting to a different set of instructions.

  As of this commit, the implementation of `add` panics if the destination and the first source arguments are not equal; internally the x64 assembler implementation will ensure that all the allowed combinations of `add` are satisfied. The reason for panicking and not emitting a `mov` followed by an `add` for example is simply because register allocation happens right before calling `add`, which ensures any register-to-register moves, if needed.

  This implementation will evolve in the future and this panic will be lifted if needed.

* Improve the documentation for the MacroAssembler.

  Documents the usage of three-arg form and the intention around the high-level interface.

* Format comments in remaining modules

* Clean up Cargo.toml for winch pieces

  This commit adds missing fields to each of Winch's Cargo.toml.

* Use `ModuleTranslation::get_types()` to derive the function type

* Assert that start range is always word-size aligned |
2 years ago |
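The register-availability bit set mentioned in this commit (a set that knows nothing about physical registers beyond their hardware encoding) can be sketched as below. This is a hedged illustration rather than Winch's `RegSet`: the `u32` backing store, the method names, and the "lowest encoding first" policy are all assumptions made for the example.

```rust
/// Hypothetical register set: bit `i` set means the register whose
/// hardware encoding is `i` is currently available.
#[derive(Debug, Clone, Copy)]
struct RegSet(u32);

impl RegSet {
    /// Create the set from the universe of allocatable registers,
    /// expressed as a bit mask over hardware encodings.
    fn new(allocatable: u32) -> Self {
        RegSet(allocatable)
    }

    /// Take any available register (here: the lowest encoding), if one exists.
    fn any(&mut self) -> Option<u8> {
        if self.0 == 0 {
            return None;
        }
        let enc = self.0.trailing_zeros() as u8;
        self.0 &= !(1u32 << enc);
        Some(enc)
    }

    /// Return a register to the set.
    fn free(&mut self, enc: u8) {
        assert!(enc < 32, "encoding out of range for this sketch");
        self.0 |= 1u32 << enc;
    }

    /// Check whether a specific register is available.
    fn is_available(&self, enc: u8) -> bool {
        enc < 32 && self.0 & (1u32 << enc) != 0
    }
}
```

With encodings 1 and 3 marked allocatable (`RegSet::new(0b1010)`), two calls to `any` hand out 1 and then 3, a third call returns `None`, and `free(1)` makes encoding 1 available again.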