Explicitly yield each time a thread mutex is unlocked.

Key to understanding this bug is that Python threads run at equal RTOS priority, and although ESP-IDF FreeRTOS (and I think vanilla FreeRTOS) scheduler will round-robin equal priority tasks in the ready state it does not make a similar guarantee for tasks moving between ready and waiting.

The pathological case of this bug is when one Python thread task is busy (i.e. never blocks) it will hog the CPU more than expected, sometimes for an unbounded amount of time. This happens even though it periodically unlocks the GIL to allow another task to run.

Assume T1 is busy and T2 is blocked waiting for the GIL. T1 is executing and hits a condition to yield execution:

1. T1 calls MP_THREAD_GIL_EXIT
2. FreeRTOS sees T2 is waiting for the GIL and moves it to the Ready list (but does not preempt, as T2 is same priority, so T1 keeps running).
3. T1 immediately calls MP_THREAD_GIL_ENTER and re-takes the GIL.
4. Pre-emptive context switch happens, T2 wakes up, sees GIL is not available, and goes on the waiting list for the GIL again.

To break this cycle step 4 must happen before step 3, but this may be a very narrow window of time so it may not happen regularly - and quantisation of the timing of the tick interrupt to trigger a context switch may mean it never happens.

Yielding at the end of step 2 maximises the chance for another task to run.

Adds a test that fails on esp32 before this fix and passes afterwards.

Fixes issue #15423.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>

pull/15476/head
Angus Gratton, 4 months ago, committed by Damien George
3 changed files with 60 additions and 0 deletions
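The four-step cycle in the commit message can be illustrated with a toy model. The sketch below is not the ESP-IDF or MicroPython implementation: `simulate`, its parameters, and the time units are all hypothetical, and it only models the claim that T2 gets the GIL when a tick interrupt lands inside the short unlock/relock window, or when T1 yields right after unlocking.

```python
def simulate(yield_on_unlock, bounces=100, work=9, window=1, tick_period=10):
    """Toy model of T1 (busy) and T2 (blocked on the GIL).

    All names and numbers here are illustrative assumptions.
    Returns how many times T2 manages to take the GIL.
    """
    now = 0
    t2_acquired = 0
    for _ in range(bounces):
        # T1 runs Python code while holding the GIL.
        now += work
        # Steps 1 and 2: T1 unlocks; T2 becomes Ready but is not scheduled.
        unlock_time = now
        if yield_on_unlock:
            # The fix: T1 yields here, so T2 runs while the GIL is free.
            t2_acquired += 1
        else:
            # Without a yield, T2 only runs if a tick interrupt falls
            # inside [unlock_time, unlock_time + window).
            next_tick = -(-unlock_time // tick_period) * tick_period
            if next_tick < unlock_time + window:
                t2_acquired += 1
        # Step 3: T1 re-takes the GIL after the short window.
        now += window
        # Step 4: a later tick wakes T2, which finds the GIL held and
        # blocks again (nothing to count).
    return t2_acquired


if __name__ == "__main__":
    print("no yield, phase-locked ticks:", simulate(False))
    print("no yield, tick_period=7:     ", simulate(False, tick_period=7))
    print("yield after unlock:          ", simulate(True))
```

With the default numbers the unlock window never lines up with a tick, which mirrors the "quantisation" case where T2 never runs. Changing `tick_period` so the phases drift lets T2 in occasionally, and yielding after the unlock lets it in every time.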
@@ -0,0 +1,53 @@
# Threads should be semi-cooperative, to the point where one busy
# thread can't starve out another.
#
# (Note on ports without the GIL this one should always be true, on ports with GIL it's
# a test of the GIL behaviour.)

import _thread
import sys
from time import ticks_ms, ticks_diff, sleep_ms


done = False

ITERATIONS = 5
SLEEP_MS = 250
MAX_DELTA = 30

if sys.platform in ("win32", "linux", "darwin"):
    # Conventional operating systems get looser timing restrictions
    SLEEP_MS = 300
    MAX_DELTA = 100


def busy_thread():
    while not done:
        pass


def test_sleeps():
    global done
    ok = True
    for _ in range(ITERATIONS):
        t0 = ticks_ms()
        sleep_ms(SLEEP_MS)
        t1 = ticks_ms()
        d = ticks_diff(t1, t0)
        if d < SLEEP_MS - MAX_DELTA or d > SLEEP_MS + MAX_DELTA:
            print("Slept too long ", d)
            ok = False
    print("OK" if ok else "Not OK")
    done = True


# make the thread the busy one, and check sleep time on main task
_thread.start_new_thread(busy_thread, ())
test_sleeps()

sleep_ms(100)
done = False

# now swap them
_thread.start_new_thread(test_sleeps, ())
busy_thread()
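The pass/fail condition in the test above amounts to a symmetric tolerance window around the requested sleep time. As a rough illustration only, it can be written as a standalone predicate; `sleep_within_tolerance` is a hypothetical helper that is not part of this commit, and its defaults simply copy the embedded-port values of `SLEEP_MS` and `MAX_DELTA`.

```python
def sleep_within_tolerance(measured_ms, sleep_ms=250, max_delta=30):
    """Return True if a measured sleep is within +/- max_delta of sleep_ms.

    Hypothetical helper: the logical negation of the failure condition
    used in the test (d < SLEEP_MS - MAX_DELTA or d > SLEEP_MS + MAX_DELTA).
    """
    return sleep_ms - max_delta <= measured_ms <= sleep_ms + max_delta


if __name__ == "__main__":
    for d in (219, 220, 250, 280, 281):
        print(d, sleep_within_tolerance(d))
```

Written this way it is easier to see that the check rejects sleeps that are too short as well as too long, even though the test's failure message only mentions the latter.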
@@ -0,0 +1,2 @@
OK
OK