You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

379 lines
12 KiB

Add guard pages to the front of linear memories (#2977) * Add guard pages to the front of linear memories This commit implements a safety feature for Wasmtime to place guard pages before the allocation of all linear memories. Guard pages placed after linear memories are typically present for performance (at least) because it can help elide bounds checks. Guard pages before a linear memory, however, are never strictly needed for performance or features. The intention of a preceding guard page is to help insulate against bugs in Cranelift or other code generators, such as CVE-2021-32629. This commit adds a `Config::guard_before_linear_memory` configuration option, defaulting to `true`, which indicates whether guard pages should be present both before linear memories as well as afterwards. Guard regions continue to be controlled by `{static,dynamic}_memory_guard_size` methods. The implementation here affects both on-demand allocated memories as well as the pooling allocator for memories. For on-demand memories this adjusts the size of the allocation as well as adjusts the calculations for the base pointer of the wasm memory. For the pooling allocator this will place a singular extra guard region at the very start of the allocation for memories. Since linear memories in the pooling allocator are contiguous every memory already had a preceding guard region in memory, it was just the previous memory's guard region afterwards. Only the first memory needed this extra guard. I've attempted to write some tests to help test all this, but this is all somewhat tricky to test because the settings are pretty far away from the actual behavior. I think, though, that the tests added here should help cover various use cases and help us have confidence in tweaking the various `Config` settings beyond their defaults. Note that this also contains a semantic change where `InstanceLimits::memory_reservation_size` has been removed. Instead this field is now inferred from the `static_memory_maximum_size` and guard size settings. This should hopefully remove some duplication in these settings, canonicalizing on the guard-size/static-size settings as the way to control memory sizes and virtual reservations. * Update config docs * Fix a typo * Fix benchmark * Fix wasmtime-runtime tests * Fix some more tests * Try to fix uffd failing test * Review items * Tweak 32-bit defaults Makes the pooling allocator a bit more reasonable by default on 32-bit with these settings.
3 years ago
use anyhow::Result;
use rayon::prelude::*;
use wasmtime::*;
fn module(engine: &Engine) -> Result<Module> {
let mut wat = format!("(module\n");
wat.push_str("(import \"\" \"\" (memory 0))\n");
for i in 0..=33 {
let offset = if i == 0 {
0
} else if i == 33 {
!0
} else {
1u32 << (i - 1)
};
for (width, instr) in [
(1, &["i32.load8_s"][..]),
(2, &["i32.load16_s"]),
(4, &["i32.load" /*, "f32.load"*/]),
(8, &["i64.load" /*, "f64.load"*/]),
#[cfg(not(target_arch = "s390x"))]
Add guard pages to the front of linear memories (#2977) * Add guard pages to the front of linear memories This commit implements a safety feature for Wasmtime to place guard pages before the allocation of all linear memories. Guard pages placed after linear memories are typically present for performance (at least) because it can help elide bounds checks. Guard pages before a linear memory, however, are never strictly needed for performance or features. The intention of a preceding guard page is to help insulate against bugs in Cranelift or other code generators, such as CVE-2021-32629. This commit adds a `Config::guard_before_linear_memory` configuration option, defaulting to `true`, which indicates whether guard pages should be present both before linear memories as well as afterwards. Guard regions continue to be controlled by `{static,dynamic}_memory_guard_size` methods. The implementation here affects both on-demand allocated memories as well as the pooling allocator for memories. For on-demand memories this adjusts the size of the allocation as well as adjusts the calculations for the base pointer of the wasm memory. For the pooling allocator this will place a singular extra guard region at the very start of the allocation for memories. Since linear memories in the pooling allocator are contiguous every memory already had a preceding guard region in memory, it was just the previous memory&#39;s guard region afterwards. Only the first memory needed this extra guard. I&#39;ve attempted to write some tests to help test all this, but this is all somewhat tricky to test because the settings are pretty far away from the actual behavior. I think, though, that the tests added here should help cover various use cases and help us have confidence in tweaking the various `Config` settings beyond their defaults. Note that this also contains a semantic change where `InstanceLimits::memory_reservation_size` has been removed. Instead this field is now inferred from the `static_memory_maximum_size` and guard size settings. This should hopefully remove some duplication in these settings, canonicalizing on the guard-size/static-size settings as the way to control memory sizes and virtual reservations. * Update config docs * Fix a typo * Fix benchmark * Fix wasmtime-runtime tests * Fix some more tests * Try to fix uffd failing test * Review items * Tweak 32-bit defaults Makes the pooling allocator a bit more reasonable by default on 32-bit with these settings.
3 years ago
(16, &["v128.load"]),
]
.iter()
{
for (j, instr) in instr.iter().enumerate() {
wat.push_str(&format!(
"(func (export \"{} {} v{}\") (param i32)\n",
width, offset, j
));
wat.push_str("local.get 0\n");
wat.push_str(instr);
wat.push_str(&format!(" offset={}\n", offset));
wat.push_str("drop\n)");
}
}
}
wat.push_str(")");
Module::new(engine, &wat)
}
struct TestFunc {
width: u32,
offset: u32,
func: TypedFunc<u32, ()>,
}
fn find_funcs(store: &mut Store<()>, instance: &Instance) -> Vec<TestFunc> {
let list = instance
.exports(&mut *store)
.map(|export| {
let name = export.name();
let mut parts = name.split_whitespace();
(
parts.next().unwrap().parse().unwrap(),
parts.next().unwrap().parse().unwrap(),
export.into_func().unwrap(),
)
})
.collect::<Vec<_>>();
list.into_iter()
.map(|(width, offset, func)| TestFunc {
width,
offset,
func: func.typed(&store).unwrap(),
})
.collect()
}
fn test_traps(store: &mut Store<()>, funcs: &[TestFunc], addr: u32, mem: &Memory) {
let mem_size = mem.data_size(&store) as u64;
for func in funcs {
let result = func.func.call(&mut *store, addr);
let base = u64::from(func.offset) + u64::from(addr);
let range = base..base + u64::from(func.width);
if range.start >= mem_size || range.end >= mem_size {
Implement the memory64 proposal in Wasmtime (#3153) * Implement the memory64 proposal in Wasmtime This commit implements the WebAssembly [memory64 proposal][proposal] in both Wasmtime and Cranelift. In terms of work done Cranelift ended up needing very little work here since most of it was already prepared for 64-bit memories at one point or another. Most of the work in Wasmtime is largely refactoring, changing a bunch of `u32` values to something else. A number of internal and public interfaces are changing as a result of this commit, for example: * Acessors on `wasmtime::Memory` that work with pages now all return `u64` unconditionally rather than `u32`. This makes it possible to accommodate 64-bit memories with this API, but we may also want to consider `usize` here at some point since the host can&#39;t grow past `usize`-limited pages anyway. * The `wasmtime::Limits` structure is removed in favor of minimum/maximum methods on table/memory types. * Many libcall intrinsics called by jit code now unconditionally take `u64` arguments instead of `u32`. Return values are `usize`, however, since the return value, if successful, is always bounded by host memory while arguments can come from any guest. * The `heap_addr` clif instruction now takes a 64-bit offset argument instead of a 32-bit one. It turns out that the legalization of `heap_addr` already worked with 64-bit offsets, so this change was fairly trivial to make. * The runtime implementation of mmap-based linear memories has changed to largely work in `usize` quantities in its API and in bytes instead of pages. This simplifies various aspects and reflects that mmap-memories are always bound by `usize` since that&#39;s what the host is using to address things, and additionally most calculations care about bytes rather than pages except for the very edge where we&#39;re going to/from wasm. Overall I&#39;ve tried to minimize the amount of `as` casts as possible, using checked `try_from` and checked arithemtic with either error handling or explicit `unwrap()` calls to tell us about bugs in the future. Most locations have relatively obvious things to do with various implications on various hosts, and I think they should all be roughly of the right shape but time will tell. I mostly relied on the compiler complaining that various types weren&#39;t aligned to figure out type-casting, and I manually audited some of the more obvious locations. I suspect we have a number of hidden locations that will panic on 32-bit hosts if 64-bit modules try to run there, but otherwise I think we should be generally ok (famous last words). In any case I wouldn&#39;t want to enable this by default naturally until we&#39;ve fuzzed it for some time. In terms of the actual underlying implementation, no one should expect memory64 to be all that fast. Right now it&#39;s implemented with &#34;dynamic&#34; heaps which have a few consequences: * All memory accesses are bounds-checked. I&#39;m not sure how aggressively Cranelift tries to optimize out bounds checks, but I suspect not a ton since we haven&#39;t stressed this much historically. * Heaps are always precisely sized. This means that every call to `memory.grow` will incur a `memcpy` of memory from the old heap to the new. We probably want to at least look into `mremap` on Linux and otherwise try to implement schemes where dynamic heaps have some reserved pages to grow into to help amortize the cost of `memory.grow`. The memory64 spec test suite is scheduled to now run on CI, but as with all the other spec test suites it&#39;s really not all that comprehensive. I&#39;ve tried adding more tests for basic things as I&#39;ve had to implement guards for them, but I wouldn&#39;t really consider the testing adequate from just this PR itself. I did try to take care in one test to actually allocate a 4gb+ heap and then avoid running that in the pooling allocator or in emulation because otherwise that may fail or take excessively long. [proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md * Fix some tests * More test fixes * Fix wasmtime tests * Fix doctests * Revert to 32-bit immediate offsets in `heap_addr` This commit updates the generation of addresses in wasm code to always use 32-bit offsets for `heap_addr`, and if the calculated offset is bigger than 32-bits we emit a manual add with an overflow check. * Disable memory64 for spectest fuzzing * Fix wrong offset being added to heap addr * More comments! * Clarify bytes/pages
3 years ago
assert!(
result.is_err(),
"access at {}+{}+{} succeeded but should have failed when memory has {} bytes",
addr,
func.offset,
func.width,
mem_size
);
Add guard pages to the front of linear memories (#2977) * Add guard pages to the front of linear memories This commit implements a safety feature for Wasmtime to place guard pages before the allocation of all linear memories. Guard pages placed after linear memories are typically present for performance (at least) because it can help elide bounds checks. Guard pages before a linear memory, however, are never strictly needed for performance or features. The intention of a preceding guard page is to help insulate against bugs in Cranelift or other code generators, such as CVE-2021-32629. This commit adds a `Config::guard_before_linear_memory` configuration option, defaulting to `true`, which indicates whether guard pages should be present both before linear memories as well as afterwards. Guard regions continue to be controlled by `{static,dynamic}_memory_guard_size` methods. The implementation here affects both on-demand allocated memories as well as the pooling allocator for memories. For on-demand memories this adjusts the size of the allocation as well as adjusts the calculations for the base pointer of the wasm memory. For the pooling allocator this will place a singular extra guard region at the very start of the allocation for memories. Since linear memories in the pooling allocator are contiguous every memory already had a preceding guard region in memory, it was just the previous memory&#39;s guard region afterwards. Only the first memory needed this extra guard. I&#39;ve attempted to write some tests to help test all this, but this is all somewhat tricky to test because the settings are pretty far away from the actual behavior. I think, though, that the tests added here should help cover various use cases and help us have confidence in tweaking the various `Config` settings beyond their defaults. Note that this also contains a semantic change where `InstanceLimits::memory_reservation_size` has been removed. Instead this field is now inferred from the `static_memory_maximum_size` and guard size settings. This should hopefully remove some duplication in these settings, canonicalizing on the guard-size/static-size settings as the way to control memory sizes and virtual reservations. * Update config docs * Fix a typo * Fix benchmark * Fix wasmtime-runtime tests * Fix some more tests * Try to fix uffd failing test * Review items * Tweak 32-bit defaults Makes the pooling allocator a bit more reasonable by default on 32-bit with these settings.
3 years ago
} else {
assert!(result.is_ok());
}
}
}
#[test]
fn offsets_static_dynamic_oh_my() -> Result<()> {
const GB: u64 = 1 << 30;
let mut engines = Vec::new();
let sizes = [0, 1 * GB, 4 * GB];
for &static_memory_maximum_size in sizes.iter() {
for &guard_size in sizes.iter() {
for &guard_before_linear_memory in [true, false].iter() {
let mut config = Config::new();
config.wasm_simd(true);
config.static_memory_maximum_size(static_memory_maximum_size);
config.dynamic_memory_guard_size(guard_size);
config.static_memory_guard_size(guard_size);
config.guard_before_linear_memory(guard_before_linear_memory);
Implement the memory64 proposal in Wasmtime (#3153) * Implement the memory64 proposal in Wasmtime This commit implements the WebAssembly [memory64 proposal][proposal] in both Wasmtime and Cranelift. In terms of work done Cranelift ended up needing very little work here since most of it was already prepared for 64-bit memories at one point or another. Most of the work in Wasmtime is largely refactoring, changing a bunch of `u32` values to something else. A number of internal and public interfaces are changing as a result of this commit, for example: * Acessors on `wasmtime::Memory` that work with pages now all return `u64` unconditionally rather than `u32`. This makes it possible to accommodate 64-bit memories with this API, but we may also want to consider `usize` here at some point since the host can&#39;t grow past `usize`-limited pages anyway. * The `wasmtime::Limits` structure is removed in favor of minimum/maximum methods on table/memory types. * Many libcall intrinsics called by jit code now unconditionally take `u64` arguments instead of `u32`. Return values are `usize`, however, since the return value, if successful, is always bounded by host memory while arguments can come from any guest. * The `heap_addr` clif instruction now takes a 64-bit offset argument instead of a 32-bit one. It turns out that the legalization of `heap_addr` already worked with 64-bit offsets, so this change was fairly trivial to make. * The runtime implementation of mmap-based linear memories has changed to largely work in `usize` quantities in its API and in bytes instead of pages. This simplifies various aspects and reflects that mmap-memories are always bound by `usize` since that&#39;s what the host is using to address things, and additionally most calculations care about bytes rather than pages except for the very edge where we&#39;re going to/from wasm. Overall I&#39;ve tried to minimize the amount of `as` casts as possible, using checked `try_from` and checked arithemtic with either error handling or explicit `unwrap()` calls to tell us about bugs in the future. Most locations have relatively obvious things to do with various implications on various hosts, and I think they should all be roughly of the right shape but time will tell. I mostly relied on the compiler complaining that various types weren&#39;t aligned to figure out type-casting, and I manually audited some of the more obvious locations. I suspect we have a number of hidden locations that will panic on 32-bit hosts if 64-bit modules try to run there, but otherwise I think we should be generally ok (famous last words). In any case I wouldn&#39;t want to enable this by default naturally until we&#39;ve fuzzed it for some time. In terms of the actual underlying implementation, no one should expect memory64 to be all that fast. Right now it&#39;s implemented with &#34;dynamic&#34; heaps which have a few consequences: * All memory accesses are bounds-checked. I&#39;m not sure how aggressively Cranelift tries to optimize out bounds checks, but I suspect not a ton since we haven&#39;t stressed this much historically. * Heaps are always precisely sized. This means that every call to `memory.grow` will incur a `memcpy` of memory from the old heap to the new. We probably want to at least look into `mremap` on Linux and otherwise try to implement schemes where dynamic heaps have some reserved pages to grow into to help amortize the cost of `memory.grow`. The memory64 spec test suite is scheduled to now run on CI, but as with all the other spec test suites it&#39;s really not all that comprehensive. I&#39;ve tried adding more tests for basic things as I&#39;ve had to implement guards for them, but I wouldn&#39;t really consider the testing adequate from just this PR itself. I did try to take care in one test to actually allocate a 4gb+ heap and then avoid running that in the pooling allocator or in emulation because otherwise that may fail or take excessively long. [proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md * Fix some tests * More test fixes * Fix wasmtime tests * Fix doctests * Revert to 32-bit immediate offsets in `heap_addr` This commit updates the generation of addresses in wasm code to always use 32-bit offsets for `heap_addr`, and if the calculated offset is bigger than 32-bits we emit a manual add with an overflow check. * Disable memory64 for spectest fuzzing * Fix wrong offset being added to heap addr * More comments! * Clarify bytes/pages
3 years ago
config.cranelift_debug_verifier(true);
Add guard pages to the front of linear memories (#2977) * Add guard pages to the front of linear memories This commit implements a safety feature for Wasmtime to place guard pages before the allocation of all linear memories. Guard pages placed after linear memories are typically present for performance (at least) because it can help elide bounds checks. Guard pages before a linear memory, however, are never strictly needed for performance or features. The intention of a preceding guard page is to help insulate against bugs in Cranelift or other code generators, such as CVE-2021-32629. This commit adds a `Config::guard_before_linear_memory` configuration option, defaulting to `true`, which indicates whether guard pages should be present both before linear memories as well as afterwards. Guard regions continue to be controlled by `{static,dynamic}_memory_guard_size` methods. The implementation here affects both on-demand allocated memories as well as the pooling allocator for memories. For on-demand memories this adjusts the size of the allocation as well as adjusts the calculations for the base pointer of the wasm memory. For the pooling allocator this will place a singular extra guard region at the very start of the allocation for memories. Since linear memories in the pooling allocator are contiguous every memory already had a preceding guard region in memory, it was just the previous memory&#39;s guard region afterwards. Only the first memory needed this extra guard. I&#39;ve attempted to write some tests to help test all this, but this is all somewhat tricky to test because the settings are pretty far away from the actual behavior. I think, though, that the tests added here should help cover various use cases and help us have confidence in tweaking the various `Config` settings beyond their defaults. Note that this also contains a semantic change where `InstanceLimits::memory_reservation_size` has been removed. Instead this field is now inferred from the `static_memory_maximum_size` and guard size settings. This should hopefully remove some duplication in these settings, canonicalizing on the guard-size/static-size settings as the way to control memory sizes and virtual reservations. * Update config docs * Fix a typo * Fix benchmark * Fix wasmtime-runtime tests * Fix some more tests * Try to fix uffd failing test * Review items * Tweak 32-bit defaults Makes the pooling allocator a bit more reasonable by default on 32-bit with these settings.
3 years ago
engines.push(Engine::new(&config)?);
}
}
}
engines.par_iter().for_each(|engine| {
let module = module(&engine).unwrap();
Implement the memory64 proposal in Wasmtime (#3153) * Implement the memory64 proposal in Wasmtime This commit implements the WebAssembly [memory64 proposal][proposal] in both Wasmtime and Cranelift. In terms of work done Cranelift ended up needing very little work here since most of it was already prepared for 64-bit memories at one point or another. Most of the work in Wasmtime is largely refactoring, changing a bunch of `u32` values to something else. A number of internal and public interfaces are changing as a result of this commit, for example: * Acessors on `wasmtime::Memory` that work with pages now all return `u64` unconditionally rather than `u32`. This makes it possible to accommodate 64-bit memories with this API, but we may also want to consider `usize` here at some point since the host can&#39;t grow past `usize`-limited pages anyway. * The `wasmtime::Limits` structure is removed in favor of minimum/maximum methods on table/memory types. * Many libcall intrinsics called by jit code now unconditionally take `u64` arguments instead of `u32`. Return values are `usize`, however, since the return value, if successful, is always bounded by host memory while arguments can come from any guest. * The `heap_addr` clif instruction now takes a 64-bit offset argument instead of a 32-bit one. It turns out that the legalization of `heap_addr` already worked with 64-bit offsets, so this change was fairly trivial to make. * The runtime implementation of mmap-based linear memories has changed to largely work in `usize` quantities in its API and in bytes instead of pages. This simplifies various aspects and reflects that mmap-memories are always bound by `usize` since that&#39;s what the host is using to address things, and additionally most calculations care about bytes rather than pages except for the very edge where we&#39;re going to/from wasm. Overall I&#39;ve tried to minimize the amount of `as` casts as possible, using checked `try_from` and checked arithemtic with either error handling or explicit `unwrap()` calls to tell us about bugs in the future. Most locations have relatively obvious things to do with various implications on various hosts, and I think they should all be roughly of the right shape but time will tell. I mostly relied on the compiler complaining that various types weren&#39;t aligned to figure out type-casting, and I manually audited some of the more obvious locations. I suspect we have a number of hidden locations that will panic on 32-bit hosts if 64-bit modules try to run there, but otherwise I think we should be generally ok (famous last words). In any case I wouldn&#39;t want to enable this by default naturally until we&#39;ve fuzzed it for some time. In terms of the actual underlying implementation, no one should expect memory64 to be all that fast. Right now it&#39;s implemented with &#34;dynamic&#34; heaps which have a few consequences: * All memory accesses are bounds-checked. I&#39;m not sure how aggressively Cranelift tries to optimize out bounds checks, but I suspect not a ton since we haven&#39;t stressed this much historically. * Heaps are always precisely sized. This means that every call to `memory.grow` will incur a `memcpy` of memory from the old heap to the new. We probably want to at least look into `mremap` on Linux and otherwise try to implement schemes where dynamic heaps have some reserved pages to grow into to help amortize the cost of `memory.grow`. The memory64 spec test suite is scheduled to now run on CI, but as with all the other spec test suites it&#39;s really not all that comprehensive. I&#39;ve tried adding more tests for basic things as I&#39;ve had to implement guards for them, but I wouldn&#39;t really consider the testing adequate from just this PR itself. I did try to take care in one test to actually allocate a 4gb+ heap and then avoid running that in the pooling allocator or in emulation because otherwise that may fail or take excessively long. [proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md * Fix some tests * More test fixes * Fix wasmtime tests * Fix doctests * Revert to 32-bit immediate offsets in `heap_addr` This commit updates the generation of addresses in wasm code to always use 32-bit offsets for `heap_addr`, and if the calculated offset is bigger than 32-bits we emit a manual add with an overflow check. * Disable memory64 for spectest fuzzing * Fix wrong offset being added to heap addr * More comments! * Clarify bytes/pages
3 years ago
for (min, max) in [(1, Some(2)), (1, None)].iter() {
Add guard pages to the front of linear memories (#2977) * Add guard pages to the front of linear memories This commit implements a safety feature for Wasmtime to place guard pages before the allocation of all linear memories. Guard pages placed after linear memories are typically present for performance (at least) because it can help elide bounds checks. Guard pages before a linear memory, however, are never strictly needed for performance or features. The intention of a preceding guard page is to help insulate against bugs in Cranelift or other code generators, such as CVE-2021-32629. This commit adds a `Config::guard_before_linear_memory` configuration option, defaulting to `true`, which indicates whether guard pages should be present both before linear memories as well as afterwards. Guard regions continue to be controlled by `{static,dynamic}_memory_guard_size` methods. The implementation here affects both on-demand allocated memories as well as the pooling allocator for memories. For on-demand memories this adjusts the size of the allocation as well as adjusts the calculations for the base pointer of the wasm memory. For the pooling allocator this will place a singular extra guard region at the very start of the allocation for memories. Since linear memories in the pooling allocator are contiguous every memory already had a preceding guard region in memory, it was just the previous memory&#39;s guard region afterwards. Only the first memory needed this extra guard. I&#39;ve attempted to write some tests to help test all this, but this is all somewhat tricky to test because the settings are pretty far away from the actual behavior. I think, though, that the tests added here should help cover various use cases and help us have confidence in tweaking the various `Config` settings beyond their defaults. Note that this also contains a semantic change where `InstanceLimits::memory_reservation_size` has been removed. Instead this field is now inferred from the `static_memory_maximum_size` and guard size settings. This should hopefully remove some duplication in these settings, canonicalizing on the guard-size/static-size settings as the way to control memory sizes and virtual reservations. * Update config docs * Fix a typo * Fix benchmark * Fix wasmtime-runtime tests * Fix some more tests * Try to fix uffd failing test * Review items * Tweak 32-bit defaults Makes the pooling allocator a bit more reasonable by default on 32-bit with these settings.
3 years ago
let mut store = Store::new(&engine, ());
Implement the memory64 proposal in Wasmtime (#3153) * Implement the memory64 proposal in Wasmtime This commit implements the WebAssembly [memory64 proposal][proposal] in both Wasmtime and Cranelift. In terms of work done Cranelift ended up needing very little work here since most of it was already prepared for 64-bit memories at one point or another. Most of the work in Wasmtime is largely refactoring, changing a bunch of `u32` values to something else. A number of internal and public interfaces are changing as a result of this commit, for example: * Acessors on `wasmtime::Memory` that work with pages now all return `u64` unconditionally rather than `u32`. This makes it possible to accommodate 64-bit memories with this API, but we may also want to consider `usize` here at some point since the host can&#39;t grow past `usize`-limited pages anyway. * The `wasmtime::Limits` structure is removed in favor of minimum/maximum methods on table/memory types. * Many libcall intrinsics called by jit code now unconditionally take `u64` arguments instead of `u32`. Return values are `usize`, however, since the return value, if successful, is always bounded by host memory while arguments can come from any guest. * The `heap_addr` clif instruction now takes a 64-bit offset argument instead of a 32-bit one. It turns out that the legalization of `heap_addr` already worked with 64-bit offsets, so this change was fairly trivial to make. * The runtime implementation of mmap-based linear memories has changed to largely work in `usize` quantities in its API and in bytes instead of pages. This simplifies various aspects and reflects that mmap-memories are always bound by `usize` since that&#39;s what the host is using to address things, and additionally most calculations care about bytes rather than pages except for the very edge where we&#39;re going to/from wasm. Overall I&#39;ve tried to minimize the amount of `as` casts as possible, using checked `try_from` and checked arithemtic with either error handling or explicit `unwrap()` calls to tell us about bugs in the future. Most locations have relatively obvious things to do with various implications on various hosts, and I think they should all be roughly of the right shape but time will tell. I mostly relied on the compiler complaining that various types weren&#39;t aligned to figure out type-casting, and I manually audited some of the more obvious locations. I suspect we have a number of hidden locations that will panic on 32-bit hosts if 64-bit modules try to run there, but otherwise I think we should be generally ok (famous last words). In any case I wouldn&#39;t want to enable this by default naturally until we&#39;ve fuzzed it for some time. In terms of the actual underlying implementation, no one should expect memory64 to be all that fast. Right now it&#39;s implemented with &#34;dynamic&#34; heaps which have a few consequences: * All memory accesses are bounds-checked. I&#39;m not sure how aggressively Cranelift tries to optimize out bounds checks, but I suspect not a ton since we haven&#39;t stressed this much historically. * Heaps are always precisely sized. This means that every call to `memory.grow` will incur a `memcpy` of memory from the old heap to the new. We probably want to at least look into `mremap` on Linux and otherwise try to implement schemes where dynamic heaps have some reserved pages to grow into to help amortize the cost of `memory.grow`. The memory64 spec test suite is scheduled to now run on CI, but as with all the other spec test suites it&#39;s really not all that comprehensive. I&#39;ve tried adding more tests for basic things as I&#39;ve had to implement guards for them, but I wouldn&#39;t really consider the testing adequate from just this PR itself. I did try to take care in one test to actually allocate a 4gb+ heap and then avoid running that in the pooling allocator or in emulation because otherwise that may fail or take excessively long. [proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md * Fix some tests * More test fixes * Fix wasmtime tests * Fix doctests * Revert to 32-bit immediate offsets in `heap_addr` This commit updates the generation of addresses in wasm code to always use 32-bit offsets for `heap_addr`, and if the calculated offset is bigger than 32-bits we emit a manual add with an overflow check. * Disable memory64 for spectest fuzzing * Fix wrong offset being added to heap addr * More comments! * Clarify bytes/pages
3 years ago
let mem = Memory::new(&mut store, MemoryType::new(*min, *max)).unwrap();
Add guard pages to the front of linear memories (#2977) * Add guard pages to the front of linear memories This commit implements a safety feature for Wasmtime to place guard pages before the allocation of all linear memories. Guard pages placed after linear memories are typically present for performance (at least) because it can help elide bounds checks. Guard pages before a linear memory, however, are never strictly needed for performance or features. The intention of a preceding guard page is to help insulate against bugs in Cranelift or other code generators, such as CVE-2021-32629. This commit adds a `Config::guard_before_linear_memory` configuration option, defaulting to `true`, which indicates whether guard pages should be present both before linear memories as well as afterwards. Guard regions continue to be controlled by `{static,dynamic}_memory_guard_size` methods. The implementation here affects both on-demand allocated memories as well as the pooling allocator for memories. For on-demand memories this adjusts the size of the allocation as well as adjusts the calculations for the base pointer of the wasm memory. For the pooling allocator this will place a singular extra guard region at the very start of the allocation for memories. Since linear memories in the pooling allocator are contiguous every memory already had a preceding guard region in memory, it was just the previous memory&#39;s guard region afterwards. Only the first memory needed this extra guard. I&#39;ve attempted to write some tests to help test all this, but this is all somewhat tricky to test because the settings are pretty far away from the actual behavior. I think, though, that the tests added here should help cover various use cases and help us have confidence in tweaking the various `Config` settings beyond their defaults. Note that this also contains a semantic change where `InstanceLimits::memory_reservation_size` has been removed. Instead this field is now inferred from the `static_memory_maximum_size` and guard size settings. This should hopefully remove some duplication in these settings, canonicalizing on the guard-size/static-size settings as the way to control memory sizes and virtual reservations. * Update config docs * Fix a typo * Fix benchmark * Fix wasmtime-runtime tests * Fix some more tests * Try to fix uffd failing test * Review items * Tweak 32-bit defaults Makes the pooling allocator a bit more reasonable by default on 32-bit with these settings.
3 years ago
let instance = Instance::new(&mut store, &module, &[mem.into()]).unwrap();
let funcs = find_funcs(&mut store, &instance);
test_traps(&mut store, &funcs, 0, &mem);
test_traps(&mut store, &funcs, 65536, &mem);
test_traps(&mut store, &funcs, u32::MAX, &mem);
mem.grow(&mut store, 1).unwrap();
test_traps(&mut store, &funcs, 0, &mem);
test_traps(&mut store, &funcs, 65536, &mem);
test_traps(&mut store, &funcs, u32::MAX, &mem);
}
});
Ok(())
}
#[test]
fn guards_present() -> Result<()> {
const GUARD_SIZE: u64 = 65536;
let mut config = Config::new();
config.static_memory_maximum_size(1 << 20);
config.dynamic_memory_guard_size(GUARD_SIZE);
config.static_memory_guard_size(GUARD_SIZE);
config.guard_before_linear_memory(true);
let engine = Engine::new(&config)?;
let mut store = Store::new(&engine, ());
Implement the memory64 proposal in Wasmtime (#3153) * Implement the memory64 proposal in Wasmtime This commit implements the WebAssembly [memory64 proposal][proposal] in both Wasmtime and Cranelift. In terms of work done Cranelift ended up needing very little work here since most of it was already prepared for 64-bit memories at one point or another. Most of the work in Wasmtime is largely refactoring, changing a bunch of `u32` values to something else. A number of internal and public interfaces are changing as a result of this commit, for example: * Acessors on `wasmtime::Memory` that work with pages now all return `u64` unconditionally rather than `u32`. This makes it possible to accommodate 64-bit memories with this API, but we may also want to consider `usize` here at some point since the host can&#39;t grow past `usize`-limited pages anyway. * The `wasmtime::Limits` structure is removed in favor of minimum/maximum methods on table/memory types. * Many libcall intrinsics called by jit code now unconditionally take `u64` arguments instead of `u32`. Return values are `usize`, however, since the return value, if successful, is always bounded by host memory while arguments can come from any guest. * The `heap_addr` clif instruction now takes a 64-bit offset argument instead of a 32-bit one. It turns out that the legalization of `heap_addr` already worked with 64-bit offsets, so this change was fairly trivial to make. * The runtime implementation of mmap-based linear memories has changed to largely work in `usize` quantities in its API and in bytes instead of pages. This simplifies various aspects and reflects that mmap-memories are always bound by `usize` since that&#39;s what the host is using to address things, and additionally most calculations care about bytes rather than pages except for the very edge where we&#39;re going to/from wasm. Overall I&#39;ve tried to minimize the amount of `as` casts as possible, using checked `try_from` and checked arithemtic with either error handling or explicit `unwrap()` calls to tell us about bugs in the future. Most locations have relatively obvious things to do with various implications on various hosts, and I think they should all be roughly of the right shape but time will tell. I mostly relied on the compiler complaining that various types weren&#39;t aligned to figure out type-casting, and I manually audited some of the more obvious locations. I suspect we have a number of hidden locations that will panic on 32-bit hosts if 64-bit modules try to run there, but otherwise I think we should be generally ok (famous last words). In any case I wouldn&#39;t want to enable this by default naturally until we&#39;ve fuzzed it for some time. In terms of the actual underlying implementation, no one should expect memory64 to be all that fast. Right now it&#39;s implemented with &#34;dynamic&#34; heaps which have a few consequences: * All memory accesses are bounds-checked. I&#39;m not sure how aggressively Cranelift tries to optimize out bounds checks, but I suspect not a ton since we haven&#39;t stressed this much historically. * Heaps are always precisely sized. This means that every call to `memory.grow` will incur a `memcpy` of memory from the old heap to the new. We probably want to at least look into `mremap` on Linux and otherwise try to implement schemes where dynamic heaps have some reserved pages to grow into to help amortize the cost of `memory.grow`. The memory64 spec test suite is scheduled to now run on CI, but as with all the other spec test suites it&#39;s really not all that comprehensive. I&#39;ve tried adding more tests for basic things as I&#39;ve had to implement guards for them, but I wouldn&#39;t really consider the testing adequate from just this PR itself. I did try to take care in one test to actually allocate a 4gb+ heap and then avoid running that in the pooling allocator or in emulation because otherwise that may fail or take excessively long. [proposal]: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md * Fix some tests * More test fixes * Fix wasmtime tests * Fix doctests * Revert to 32-bit immediate offsets in `heap_addr` This commit updates the generation of addresses in wasm code to always use 32-bit offsets for `heap_addr`, and if the calculated offset is bigger than 32-bits we emit a manual add with an overflow check. * Disable memory64 for spectest fuzzing * Fix wrong offset being added to heap addr * More comments! * Clarify bytes/pages
3 years ago
let static_mem = Memory::new(&mut store, MemoryType::new(1, Some(2)))?;
let dynamic_mem = Memory::new(&mut store, MemoryType::new(1, None))?;
Add guard pages to the front of linear memories (#2977) * Add guard pages to the front of linear memories This commit implements a safety feature for Wasmtime to place guard pages before the allocation of all linear memories. Guard pages placed after linear memories are typically present for performance (at least) because it can help elide bounds checks. Guard pages before a linear memory, however, are never strictly needed for performance or features. The intention of a preceding guard page is to help insulate against bugs in Cranelift or other code generators, such as CVE-2021-32629. This commit adds a `Config::guard_before_linear_memory` configuration option, defaulting to `true`, which indicates whether guard pages should be present both before linear memories as well as afterwards. Guard regions continue to be controlled by `{static,dynamic}_memory_guard_size` methods. The implementation here affects both on-demand allocated memories as well as the pooling allocator for memories. For on-demand memories this adjusts the size of the allocation as well as adjusts the calculations for the base pointer of the wasm memory. For the pooling allocator this will place a singular extra guard region at the very start of the allocation for memories. Since linear memories in the pooling allocator are contiguous every memory already had a preceding guard region in memory, it was just the previous memory&#39;s guard region afterwards. Only the first memory needed this extra guard. I&#39;ve attempted to write some tests to help test all this, but this is all somewhat tricky to test because the settings are pretty far away from the actual behavior. I think, though, that the tests added here should help cover various use cases and help us have confidence in tweaking the various `Config` settings beyond their defaults. Note that this also contains a semantic change where `InstanceLimits::memory_reservation_size` has been removed. Instead this field is now inferred from the `static_memory_maximum_size` and guard size settings. This should hopefully remove some duplication in these settings, canonicalizing on the guard-size/static-size settings as the way to control memory sizes and virtual reservations. * Update config docs * Fix a typo * Fix benchmark * Fix wasmtime-runtime tests * Fix some more tests * Try to fix uffd failing test * Review items * Tweak 32-bit defaults Makes the pooling allocator a bit more reasonable by default on 32-bit with these settings.
3 years ago
let assert_guards = |store: &Store<()>| unsafe {
// guards before
println!("check pre-static-mem");
assert_faults(static_mem.data_ptr(&store).offset(-(GUARD_SIZE as isize)));
println!("check pre-dynamic-mem");
assert_faults(dynamic_mem.data_ptr(&store).offset(-(GUARD_SIZE as isize)));
// guards after
println!("check post-static-mem");
assert_faults(
static_mem
.data_ptr(&store)
.add(static_mem.data_size(&store)),
);
println!("check post-dynamic-mem");
assert_faults(
dynamic_mem
.data_ptr(&store)
.add(dynamic_mem.data_size(&store)),
);
};
assert_guards(&store);
// static memory should start with the second page unmapped
unsafe {
assert_faults(static_mem.data_ptr(&store).add(65536));
}
println!("growing");
static_mem.grow(&mut store, 1).unwrap();
dynamic_mem.grow(&mut store, 1).unwrap();
assert_guards(&store);
Ok(())
}
#[test]
fn guards_present_pooling() -> Result<()> {
const GUARD_SIZE: u64 = 65536;
let mut config = Config::new();
config.static_memory_maximum_size(1 << 20);
config.dynamic_memory_guard_size(GUARD_SIZE);
config.static_memory_guard_size(GUARD_SIZE);
config.guard_before_linear_memory(true);
config.allocation_strategy(InstanceAllocationStrategy::Pooling {
strategy: PoolingAllocationStrategy::default(),
module_limits: ModuleLimits {
memory_pages: 10,
..ModuleLimits::default()
},
instance_limits: InstanceLimits { count: 2 },
});
let engine = Engine::new(&config)?;
let mut store = Store::new(&engine, ());
let mem1 = {
let m = Module::new(&engine, "(module (memory (export \"\") 1 2))")?;
Instance::new(&mut store, &m, &[])?
.get_memory(&mut store, "")
.unwrap()
};
let mem2 = {
let m = Module::new(&engine, "(module (memory (export \"\") 1))")?;
Instance::new(&mut store, &m, &[])?
.get_memory(&mut store, "")
.unwrap()
};
unsafe fn assert_guards(store: &Store<()>, mem: &Memory) {
// guards before
println!("check pre-mem");
assert_faults(mem.data_ptr(&store).offset(-(GUARD_SIZE as isize)));
// unmapped just after memory
println!("check mem");
assert_faults(mem.data_ptr(&store).add(mem.data_size(&store)));
// guards after memory
println!("check post-mem");
assert_faults(mem.data_ptr(&store).add(1 << 20));
}
unsafe {
assert_guards(&store, &mem1);
assert_guards(&store, &mem2);
println!("growing");
mem1.grow(&mut store, 1).unwrap();
mem2.grow(&mut store, 1).unwrap();
assert_guards(&store, &mem1);
assert_guards(&store, &mem2);
}
Ok(())
}
unsafe fn assert_faults(ptr: *mut u8) {
use std::io::Error;
#[cfg(unix)]
{
// I think things get real weird with uffd since there's a helper thread
// that's not cloned with `fork` below. Just skip this test for uffd
// since it's covered by tests elsewhere.
if cfg!(target_os = "linux") && cfg!(feature = "uffd") {
return;
}
// There's probably a faster way to do this here, but, uh, when in rome?
match libc::fork() {
0 => {
*ptr = 4;
std::process::exit(0);
}
-1 => panic!("failed to fork: {}", Error::last_os_error()),
n => {
let mut status = 0;
assert!(
libc::waitpid(n, &mut status, 0) == n,
"failed to wait: {}",
Error::last_os_error()
);
assert!(libc::WIFSIGNALED(status));
}
}
}
#[cfg(windows)]
{
use winapi::um::memoryapi::*;
use winapi::um::winnt::*;
let mut info = std::mem::MaybeUninit::uninit();
let r = VirtualQuery(
ptr as *const _,
info.as_mut_ptr(),
std::mem::size_of_val(&info),
);
if r == 0 {
panic!("failed to VirtualAlloc: {}", Error::last_os_error());
}
let info = info.assume_init();
assert_eq!(info.AllocationProtect, PAGE_NOACCESS);
}
}
#[test]
fn massive_64_bit_still_limited() -> Result<()> {
// Creating a 64-bit memory which exceeds the limits of the address space
// should still send a request to the `ResourceLimiter` to ensure that it
// gets at least some chance to see that oom was requested.
let mut config = Config::new();
config.wasm_memory64(true);
let engine = Engine::new(&config)?;
let mut store = Store::new(&engine, MyLimiter { hit: false });
store.limiter(|x| x);
let ty = MemoryType::new64(1 << 48, None);
assert!(Memory::new(&mut store, ty).is_err());
assert!(store.data().hit);
return Ok(());
struct MyLimiter {
hit: bool,
}
impl ResourceLimiter for MyLimiter {
fn memory_growing(
&mut self,
_current: usize,
_request: usize,
_max: Option<usize>,
) -> bool {
self.hit = true;
true
}
fn table_growing(&mut self, _current: u32, _request: u32, _max: Option<u32>) -> bool {
unreachable!()
}
}
}
#[test]
fn tiny_static_heap() -> Result<()> {
// The size of the memory in the module below is the exact same size as
// the static memory size limit in the configuration. This is intended to
// specifically test that a load of all the valid addresses of the memory
// all pass bounds-checks in cranelift to help weed out any off-by-one bugs.
let mut config = Config::new();
config.static_memory_maximum_size(65536);
let engine = Engine::new(&config)?;
let mut store = Store::new(&engine, ());
let module = Module::new(
&engine,
r#"
(module
(memory 1 1)
(func (export "run")
(local $i i32)
(loop
(if (i32.eq (local.get $i) (i32.const 65536))
(return))
(drop (i32.load8_u (local.get $i)))
(local.set $i (i32.add (local.get $i) (i32.const 1)))
br 0
)
)
)
"#,
)?;
let i = Instance::new(&mut store, &module, &[])?;
let f = i.get_typed_func::<(), (), _>(&mut store, "run")?;
f.call(&mut store, ())?;
Ok(())
}
#[test]
fn static_forced_max() -> Result<()> {
let mut config = Config::new();
config.static_memory_maximum_size(5 * 65536);
config.static_memory_forced(true);
let engine = Engine::new(&config)?;
let mut store = Store::new(&engine, ());
let mem = Memory::new(&mut store, MemoryType::new(0, None))?;
mem.grow(&mut store, 5).unwrap();
assert!(mem.grow(&mut store, 1).is_err());
Ok(())
}