You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 

61 lines
2.6 KiB

From 1ec3de1634195a4d4410cc33fdc66c68057e16a3 Mon Sep 17 00:00:00 2001
From: Chris Fallin <chris@cfallin.org>
Date: Sat, 5 Feb 2022 22:45:58 -0800
Subject: [PATCH] Emulate Linux madvise() properly when possible.
Curently madvise() is not emulated for Linux targets because it is not
trivial to emulate when the guest and host page sizes differ -- in this
case, mmap()s are not passed straight through, so the semantics of
various MADV_* flags are not trivial to replicate.
However, if the guest and host are both Linux, and the page sizes are
the same on both ends (which is often the case, e.g. 4KiB for x86-64,
aarch64, s390x, and possibly others), then the mmap()s are in fact
passed straight through. Furthermore, the MADV_* flags are defined in
target-independent headers, so we can pass the base, length, and
`advice` arugments to `madvise()` straight through.
This patch alters the Linux-userspace syscall emulation to do just that,
passing through the `madvise()` calls when possible and returning
`EINVAL` otherwise so the guest is properly informed that the desired
semantics (e.g., MADV_DONTNEED to clear memory) are not available.
---
linux-user/syscall.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5950222a77..836e39df5f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -11853,12 +11853,22 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
#ifdef TARGET_NR_madvise
case TARGET_NR_madvise:
- /* A straight passthrough may not be safe because qemu sometimes
- turns private file-backed mappings into anonymous mappings.
- This will break MADV_DONTNEED.
- This is a hint, so ignoring and returning success is ok. */
- return 0;
-#endif
+#ifdef __linux__
+ /* If the host is Linux, and the guest and host page sizes are the
+ * same, then mmaps will have been passed through one-to-one, so we can
+ * rely on the madvise semantics of the host. Note that the advice
+ * argument (arg3) is fully architecture-independent. */
+ if (TARGET_PAGE_SIZE == sysconf(_SC_PAGESIZE)) {
+ return get_errno(madvise(g2h_untagged(arg1), (size_t)arg2, (int)arg3));
+ } else {
+ return -TARGET_EINVAL;
+ }
+#else // __linux__
+ /* We will not be able to emulate the Linux-specific semantics, so we
+ * raise an error. */
+ return -TARGET_EINVAL;
+#endif // !__linux__
+#endif // TARGET_NR_madvise
#ifdef TARGET_NR_fcntl64
case TARGET_NR_fcntl64:
{
--
2.34.1