From 1ec3de1634195a4d4410cc33fdc66c68057e16a3 Mon Sep 17 00:00:00 2001
From: Chris Fallin <chris@cfallin.org>
Date: Sat, 5 Feb 2022 22:45:58 -0800
Subject: [PATCH] Emulate Linux madvise() properly when possible.

Currently madvise() is not emulated for Linux targets because it is not
trivial to emulate when the guest and host page sizes differ -- in this
case, mmap()s are not passed straight through, so the semantics of
various MADV_* flags are not trivial to replicate.

However, if the guest and host are both Linux, and the page sizes are
the same on both ends (which is often the case, e.g. 4KiB for x86-64,
aarch64, s390x, and possibly others), then the mmap()s are in fact
passed straight through. Furthermore, the MADV_* flags are defined in
target-independent headers, so we can pass the base, length, and
`advice` arguments to `madvise()` straight through.

This patch alters the Linux-userspace syscall emulation to do just that,
passing through the `madvise()` calls when possible and returning
`EINVAL` otherwise so the guest is properly informed that the desired
semantics (e.g., MADV_DONTNEED to clear memory) are not available.
---
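Illustrative note (below the `---`, so not part of the commit message):
the following is a minimal standalone sketch of the behaviour described
above, assuming a Linux host. It is not QEMU code. The names
`passthrough_madvise` and `dontneed_zeroes_page` are hypothetical and
exist only for this example; the real change uses `TARGET_PAGE_SIZE`,
`g2h_untagged()` and `get_errno()` inside `do_syscall1()`. The second
helper shows why silently returning 0 is a problem: a guest may rely on
MADV_DONTNEED making a private anonymous page read back as zeroes.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for the patched TARGET_NR_madvise case: forward the
 * call only when the guest page size equals the host page size, otherwise
 * report -EINVAL. Returns 0 or a negative errno value. */
static long passthrough_madvise(long guest_page_size, void *host_addr,
                                size_t len, int advice)
{
    assert(guest_page_size > 0);
    if (guest_page_size != sysconf(_SC_PAGESIZE)) {
        return -EINVAL;
    }
    return madvise(host_addr, len, advice) == 0 ? 0 : -errno;
}

/* Returns 1 if MADV_DONTNEED on a private anonymous page makes the next
 * read see zero, 0 if it does not, and -1 if the mapping could not be
 * created. */
static int dontneed_zeroes_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED) {
        return -1;
    }

    /* volatile so the compiler re-reads memory after the madvise() call */
    volatile unsigned char *p = map;
    p[0] = 0xAB;

    long rc = passthrough_madvise(page, map, (size_t)page, MADV_DONTNEED);
    int zeroed = (rc == 0 && p[0] == 0);

    munmap(map, (size_t)page);
    return zeroed;
}
```

On a Linux host, `dontneed_zeroes_page()` is expected to return 1, while
a guest page size that differs from the host's makes
`passthrough_madvise()` return `-EINVAL` without touching the mapping.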
 linux-user/syscall.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5950222a77..836e39df5f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -11853,12 +11853,22 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
 
 #ifdef TARGET_NR_madvise
     case TARGET_NR_madvise:
-        /* A straight passthrough may not be safe because qemu sometimes
-           turns private file-backed mappings into anonymous mappings.
-           This will break MADV_DONTNEED.
-           This is a hint, so ignoring and returning success is ok. */
-        return 0;
-#endif
+#ifdef __linux__
+        /* If the host is Linux, and the guest and host page sizes are the
+         * same, then mmaps will have been passed through one-to-one, so we can
+         * rely on the madvise semantics of the host. Note that the advice
+         * argument (arg3) is fully architecture-independent. */
+        if (TARGET_PAGE_SIZE == sysconf(_SC_PAGESIZE)) {
+            return get_errno(madvise(g2h_untagged(arg1), (size_t)arg2, (int)arg3));
+        } else {
+            return -TARGET_EINVAL;
+        }
+#else // __linux__
+        /* We will not be able to emulate the Linux-specific semantics, so we
+         * raise an error. */
+        return -TARGET_EINVAL;
+#endif // !__linux__
+#endif // TARGET_NR_madvise
 #ifdef TARGET_NR_fcntl64
     case TARGET_NR_fcntl64:
     {
--
2.34.1