This uses MP_REGISTER_ROOT_POINTER() to register repl_line
instead of using a conditional inside of mp_state_vm_t.
Signed-off-by: David Lechner <david@pybricks.com>
This contains a string useful for identifying the underlying machine. This
string is kept consistent with the second part of the REPL banner via the
new config option MICROPY_BANNER_MACHINE.
This makes os.uname() more or less redundant, as all the information in
os.uname() is now available in the sys module.
Signed-off-by: Damien George <damien@micropython.org>
This commit adds the git hash and build date to sys.version. This is
allowed according to CPython docs, and is what PyPy does. The docs state:
A string containing the version number of the Python interpreter plus
additional information on the build number and compiler used.
Eg on CPython:
Python 3.10.4 (main, Mar 23 2022, 23:05:40) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.version
'3.10.4 (main, Mar 23 2022, 23:05:40) [GCC 11.2.0]'
and PyPy:
Python 2.7.12 (5.6.0+dfsg-4, Nov 20 2016, 10:43:30)
[PyPy 5.6.0 with GCC 6.2.0 20161109] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>> import sys
>>>> sys.version
'2.7.12 (5.6.0+dfsg-4, Nov 20 2016, 10:43:30)\n[PyPy 5.6.0 with GCC ...
With this commit on MicroPython we now have:
MicroPython v1.18-371-g9d08eb024 on 2022-04-28; linux [GCC 11.2.0] v...
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> import sys
>>> sys.version
'3.4.0; MicroPython v1.18-371-g9d08eb024 on 2022-04-28'
Note that the start of the banner is the same as the end of sys.version.
This helps to keep code size under control because the string can be reused
by the compiler.
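As a rough illustration of the layout shown above, the following sketch splits such a sys.version string into its Python-version part and its build part, and checks that the build part is a prefix of the banner. The helper name split_version and the "; " separator are assumptions taken from the example output, not a documented format.

```python
def split_version(version):
    """Split a MicroPython-style sys.version string into two parts.

    Assumed layout (from the example above): "<python version>; <build info>".
    """
    py_version, sep, build_info = version.partition("; ")
    if not sep:
        # No separator found: treat the whole string as the Python version.
        return version, ""
    return py_version, build_info


# Strings copied from the example session above (the banner is truncated there).
version = "3.4.0; MicroPython v1.18-371-g9d08eb024 on 2022-04-28"
banner = "MicroPython v1.18-371-g9d08eb024 on 2022-04-28; linux [GCC 11.2.0] v..."

py_version, build_info = split_version(version)
print(py_version)
print(build_info)
# The end of sys.version matches the start of the banner.
print(banner.startswith(build_info))
```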
Signed-off-by: Damien George <damien@micropython.org>
Background: .mpy files are precompiled .py files, built using mpy-cross,
that contain compiled bytecode functions (and can also contain machine
code). The benefit of using an .mpy file over a .py file is that it is
faster to import and takes less memory when importing. It is also
smaller on disk.
But the real benefit of .mpy files comes when they are frozen into the
firmware. This is done by loading the .mpy file during compilation of the
firmware and turning it into a set of big C data structures (the job of
mpy-tool.py), which are then compiled and downloaded into the ROM of a
device. These C data structures can be executed in-place, ie directly from
ROM. This makes importing even faster because there is very little to do,
and also means such frozen modules take up much less RAM (because their
bytecode stays in ROM).
The downside of frozen code is that it requires recompiling and reflashing
the entire firmware. This can be a big barrier to entry, slows down
development time, and makes it harder to do OTA updates of frozen code
(because the whole firmware must be updated).
This commit attempts to solve this problem by providing a solution that
sits between loading .mpy files into RAM and freezing them into the
firmware. The .mpy file format has been reworked so that it consists of
data and bytecode which is mostly static and ready to run in-place. If
these new .mpy files are located in flash/ROM which is memory addressable,
the .mpy file can be executed (mostly) in-place.
With this approach there is still a small amount of unpacking and linking
of the .mpy file that needs to be done when it's imported, but it's still
much better than loading an .mpy from disk into RAM (although not as good
as freezing .mpy files into the firmware).
The main trick to make static .mpy files is to adjust the bytecode so any
qstrs that it references now go through a lookup table to convert from
local qstr number in the module to global qstr number in the firmware.
That means the bytecode does not need linking/rewriting of qstrs when it's
loaded. Instead only a small qstr table needs to be built (and put in RAM)
at import time. This means the bytecode itself is static/constant and can
be used directly if it's in addressable memory. Also the qstr string data
in the .mpy file, and some constant object data, can be used directly.
Note that the qstr table is global to the module (ie not per function).
In more detail, in the VM what used to be (schematically):
qst = DECODE_QSTR_VALUE;
is now (schematically):
idx = DECODE_QSTR_INDEX;
qst = qstr_table[idx];
That allows the bytecode to be fixed at compile time and not need
relinking/rewriting of the qstr values. Only qstr_table needs to be linked
when the .mpy is loaded.
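The indirection can be modelled in a few lines of Python. This is only a toy sketch of the idea: the names GLOBAL_QSTRS, intern_qstr, link_module and decode are hypothetical, and the real VM works on packed bytes rather than tuples.

```python
# Hypothetical firmware-wide qstr pool: string -> global qstr number.
GLOBAL_QSTRS = {}


def intern_qstr(s):
    """Return the global qstr number for s, allocating a new one if needed."""
    return GLOBAL_QSTRS.setdefault(s, len(GLOBAL_QSTRS))


# A toy "module": static bytecode that refers to names by local index only,
# plus the strings those local indices stand for.
MODULE_QSTR_STRINGS = ("print", "x", "len")
MODULE_BYTECODE = (
    ("LOAD_NAME", 0),
    ("LOAD_NAME", 1),
    ("LOAD_NAME", 2),
    ("LOAD_NAME", 1),
)


def link_module(qstr_strings):
    """Build the per-module table (in RAM): local index -> global qstr."""
    return [intern_qstr(s) for s in qstr_strings]


def decode(bytecode, qstr_table):
    """Resolve each local index through the table, as the VM would."""
    return [(op, qstr_table[idx]) for op, idx in bytecode]


# Pre-populate the global pool so that local and global numbers differ.
for name in ("len", "sys", "x"):
    intern_qstr(name)

qstr_table = link_module(MODULE_QSTR_STRINGS)  # the only import-time work
resolved = decode(MODULE_BYTECODE, qstr_table)
print(qstr_table)
print(resolved)
```

Note that MODULE_BYTECODE is never modified: only qstr_table depends on the firmware the module is loaded into.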
Incidentally, this helps to reduce the size of bytecode because what used
to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices.
If the module uses the same qstr more than two times then the bytecode is
smaller than before.
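The break-even point can be checked with a small calculation. This is a simplified model, not a measurement: it assumes every index fits in 1 byte and that each distinct qstr costs one extra 2-byte table entry, an assumption chosen because it reproduces the "more than two times" threshold stated above.

```python
def old_cost(uses, qstr_bytes=2):
    """Bytes spent when every reference embeds a 2-byte qstr value."""
    return uses * qstr_bytes


def new_cost(uses, index_bytes=1, table_entry_bytes=2):
    """Bytes spent with 1-byte indices plus one table entry (assumed 2 bytes)."""
    return uses * index_bytes + table_entry_bytes


# Bytes saved for a qstr that is referenced 1..5 times in a module.
savings = {uses: old_cost(uses) - new_cost(uses) for uses in range(1, 6)}
for uses, saved in savings.items():
    print(uses, saved)
```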
The following changes are measured for this commit compared to the
previous (the baseline):
- average 7%-9% reduction in size of .mpy files
- frozen code size is reduced by about 5%-7%
- importing .py files uses about 5% less RAM in total
- importing .mpy files uses about 4% less RAM in total
- importing .py and .mpy files takes about the same time as before
The qstr indirection in the bytecode has only a small impact on VM
performance. For stm32 on PYBv1.0 the performance change of this commit
is:
diff of scores (higher is better)
N=100 M=100                baseline -> this-commit       diff      diff% (error%)
bm_chaos.py                  371.07 ->      357.39 :   -13.68 =  -3.687% (+/-0.02%)
bm_fannkuch.py                78.72 ->       77.49 :    -1.23 =  -1.563% (+/-0.01%)
bm_fft.py                   2591.73 ->     2539.28 :   -52.45 =  -2.024% (+/-0.00%)
bm_float.py                 6034.93 ->     5908.30 :  -126.63 =  -2.098% (+/-0.01%)
bm_hexiom.py                  48.96 ->       47.93 :    -1.03 =  -2.104% (+/-0.00%)
bm_nqueens.py               4510.63 ->     4459.94 :   -50.69 =  -1.124% (+/-0.00%)
bm_pidigits.py               650.28 ->      644.96 :    -5.32 =  -0.818% (+/-0.23%)
core_import_mpy_multi.py     564.77 ->      581.49 :   +16.72 =  +2.960% (+/-0.01%)
core_import_mpy_single.py     68.67 ->       67.16 :    -1.51 =  -2.199% (+/-0.01%)
core_qstr.py                  64.16 ->       64.12 :    -0.04 =  -0.062% (+/-0.00%)
core_yield_from.py           362.58 ->      354.50 :    -8.08 =  -2.228% (+/-0.00%)
misc_aes.py                  429.69 ->      405.59 :   -24.10 =  -5.609% (+/-0.01%)
misc_mandel.py              3485.13 ->     3416.51 :   -68.62 =  -1.969% (+/-0.00%)
misc_pystone.py             2496.53 ->     2405.56 :   -90.97 =  -3.644% (+/-0.01%)
misc_raytrace.py             381.47 ->      374.01 :    -7.46 =  -1.956% (+/-0.01%)
viper_call0.py               576.73 ->      572.49 :    -4.24 =  -0.735% (+/-0.04%)
viper_call1a.py              550.37 ->      546.21 :    -4.16 =  -0.756% (+/-0.09%)
viper_call1b.py              438.23 ->      435.68 :    -2.55 =  -0.582% (+/-0.06%)
viper_call1c.py              442.84 ->      440.04 :    -2.80 =  -0.632% (+/-0.08%)
viper_call2a.py              536.31 ->      532.35 :    -3.96 =  -0.738% (+/-0.06%)
viper_call2b.py              382.34 ->      377.07 :    -5.27 =  -1.378% (+/-0.03%)
And for unix on x64:
diff of scores (higher is better)
N=2000 M=2000       baseline -> this-commit        diff      diff% (error%)
bm_chaos.py         13594.20 ->    13073.84 :   -520.36 =  -3.828% (+/-5.44%)
bm_fannkuch.py         60.63 ->       59.58 :     -1.05 =  -1.732% (+/-3.01%)
bm_fft.py          112009.15 ->   111603.32 :   -405.83 =  -0.362% (+/-4.03%)
bm_float.py        246202.55 ->   247923.81 :  +1721.26 =  +0.699% (+/-2.79%)
bm_hexiom.py          615.65 ->      617.21 :     +1.56 =  +0.253% (+/-1.64%)
bm_nqueens.py      215807.95 ->   215600.96 :   -206.99 =  -0.096% (+/-3.52%)
bm_pidigits.py       8246.74 ->     8422.82 :   +176.08 =  +2.135% (+/-3.64%)
misc_aes.py         16133.00 ->    16452.74 :   +319.74 =  +1.982% (+/-1.50%)
misc_mandel.py     128146.69 ->   130796.43 :  +2649.74 =  +2.068% (+/-3.18%)
misc_pystone.py     83811.49 ->    83124.85 :   -686.64 =  -0.819% (+/-1.03%)
misc_raytrace.py    21688.02 ->    21385.10 :   -302.92 =  -1.397% (+/-3.20%)
The code size change is (firmware with a lot of frozen code benefits the
most):
bare-arm:     +396 +0.697%
minimal x86: +1595 +0.979% [incl +32(data)]
unix x64:    +2408 +0.470% [incl +800(data)]
unix nanbox: +1396 +0.309% [incl -96(data)]
stm32:       -1256 -0.318% PYBV10
cc3200:       +288 +0.157%
esp8266:      -260 -0.037% GENERIC
esp32:        -216 -0.014% GENERIC [incl -1072(data)]
nrf:          +116 +0.067% pca10040
rp2:          -664 -0.135% PICO
samd:         +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS
As part of this change the .mpy file format version is bumped to version 6.
And mpy-tool.py has been improved to provide a good visualisation of the
contents of .mpy files.
In summary: this commit changes the bytecode to use qstr indirection, and
reworks the .mpy file format to be simpler and allow .mpy files to be
executed in-place. Performance is not impacted too much. Eventually it
will be possible to store such .mpy files in a linear, read-only, memory-
mappable filesystem so they can be executed from flash/ROM. This will
essentially be able to replace frozen code for most applications.
Signed-off-by: Damien George <damien@micropython.org>
This changes makemanifest.py & mpy-tool.py to merge string and mpy names
into the same list (now mp_frozen_names).
The various paths for loading a frozen module (mp_find_frozen_module) and
checking existence of a frozen module (mp_frozen_stat) use a common
function that searches this list.
In addition, the frozen lookup will now only take place if the path starts
with ".frozen", which needs to be added to sys.path.
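The lookup described above can be sketched as follows. This is a Python model only: the list contents, the "/" separator after ".frozen", and the helper names find_frozen, frozen_stat and import_lookup are assumptions for illustration, not the C implementation.

```python
FROZEN_PREFIX = ".frozen/"

# Hypothetical contents of the merged name list (string and mpy names together).
frozen_names = ("boot.py", "main.py", "pkg/__init__.py", "pkg/util.py")


def find_frozen(path):
    """Common search used by both lookups: index into the list, or None."""
    if not path.startswith(FROZEN_PREFIX):
        # Paths outside ".frozen" never reach the frozen list.
        return None
    name = path[len(FROZEN_PREFIX):]
    for index, candidate in enumerate(frozen_names):
        if candidate == name:
            return index
    return None


def frozen_stat(path):
    """Existence check, built on the common search."""
    return find_frozen(path) is not None


def import_lookup(module_file, sys_path):
    """Walk sys_path entries and return the first frozen match, if any."""
    for entry in sys_path:
        full = entry + "/" + module_file
        if frozen_stat(full):
            return full
    return None


print(import_lookup("main.py", ["", "/lib"]))
print(import_lookup("main.py", ["", ".frozen", "/lib"]))
```

With this model, a frozen module is only found when ".frozen" is present in the search path, which mirrors the requirement stated above.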
This fixes issues #1804, #2322, #3509, #6419.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This commit moves all first-party code developed for this project from lib/
to shared/, so that lib/ now only contains third-party code.
The following directories are moved as-is from lib to shared:
lib/libc -> shared/libc
lib/memzip -> shared/memzip
lib/netutils -> shared/netutils
lib/timeutils -> shared/timeutils
lib/upytesthelper -> shared/upytesthelper
All files in lib/embed/ have been moved to shared/libc/.
lib/mp-readline has been moved to shared/readline.
lib/utils has been moved to shared/runtime, with the exception of
lib/utils/printf.c which has been moved to shared/libc/printf.c.
Signed-off-by: Damien George <damien@micropython.org>
This commit makes gc_lock_depth have one counter per thread, instead of one
global counter. This makes threads properly independent with respect to
the GC, in particular threads can now independently lock the GC for
themselves without locking it for other threads. It also means a given
thread can run a hard IRQ without temporarily locking the GC for all other
threads and potentially making them have MemoryError exceptions at random
locations (this really only occurs on MCUs with multiple cores and no GIL,
eg on the rp2 port).
The commit also removes mutex protection of the GC lock/unlock functions,
which is no longer needed when the counter is per thread (and this also fixes
the case where a hard IRQ calling gc_lock() may stall waiting for the mutex).
It also puts the check for `gc_lock_depth > 0` outside the GC mutex in
gc_alloc, gc_realloc and gc_free, to potentially prevent a hard IRQ from
waiting on a mutex if it does attempt to allocate heap memory (putting
the check outside the GC mutex is safe now that there is a
gc_lock_depth per thread).
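A minimal Python model of the per-thread counter is given below. It uses threading.local as a stand-in for the per-thread state, and the heap is just a list; the function bodies are illustrative assumptions, not the C code.

```python
import threading

_state = threading.local()    # stand-in for per-thread state
_gc_mutex = threading.Lock()  # protects the shared heap


def _depth():
    """Lock depth of the calling thread (0 if it never locked the GC)."""
    return getattr(_state, "gc_lock_depth", 0)


def gc_lock():
    # No mutex needed: only the calling thread touches its own counter.
    _state.gc_lock_depth = _depth() + 1


def gc_unlock():
    _state.gc_lock_depth = _depth() - 1


def gc_alloc(heap, n_bytes):
    # The depth check happens before taking the mutex, so a locked
    # caller returns immediately instead of waiting on the mutex.
    if _depth() > 0:
        return None
    with _gc_mutex:
        heap.append(n_bytes)
        return len(heap) - 1


heap = []
results = {}


def worker():
    # Runs while the main thread holds its own GC lock.
    results["worker"] = gc_alloc(heap, 16)


gc_lock()
results["main_locked"] = gc_alloc(heap, 8)
thread = threading.Thread(target=worker)
thread.start()
thread.join()
gc_unlock()
results["main_unlocked"] = gc_alloc(heap, 8)
print(results)
```

The worker's allocation succeeds even though the main thread has locked the GC for itself, which is the independence described above.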
Signed-off-by: Damien George <damien@micropython.org>
This was added a long time ago in 75abee206d
when USB host support was added to the stm (now stm32) port, and when this
pyexec code was actually part of the stm port. It's unlikely to work as
intended anymore. If it is needed in the future then generic hook macros
can be added in pyexec.
Background: the friendly/normal REPL is intended for human use whereas the
raw REPL is for computer use/automation. Raw REPL is used for things like
pyboard.py script_to_run.py. The normal REPL has built-in flow control
because it echoes back the characters. That's not so with raw REPL and flow
control is just implemented by rate limiting the amount of data that goes
in. Currently it's fixed at 256 byte chunks every 10ms. This is sometimes
too fast for slow MCUs or systems with small stdin buffers. It's also too
slow for a lot of higher-end MCUs, ie it could be a lot faster.
This commit adds a new raw REPL mode which includes flow control: the
device will echo back a character after it has received a certain number of
bytes from the host, and the host can use this to regulate the data going out to
the device. The number of characters is controlled by the device and sent
to the host before communication starts. This flow control allows getting
the maximum speed out of a serial link, regardless of the link or the
device at the other end.
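The host side of such a scheme can be sketched with a simulated device. This is a simplified model, not the wire protocol: the FakeDevice class, the ACK byte value and the window size of 256 are all assumptions made for the example.

```python
class FakeDevice:
    """Toy device that acknowledges every `window` bytes it receives."""

    ACK = b"\x01"  # arbitrary marker chosen for this sketch

    def __init__(self, window):
        self.window = window  # announced to the host before the transfer
        self.received = bytearray()
        self._since_ack = 0

    def write(self, data):
        """Accept data from the host; return any ack bytes echoed back."""
        acks = b""
        for byte in data:
            self.received.append(byte)
            self._since_ack += 1
            if self._since_ack == self.window:
                acks += self.ACK
                self._since_ack = 0
        return acks


def send_with_flow_control(device, payload):
    """Send payload one window at a time, waiting for an ack between windows."""
    window = device.window
    acks_seen = 0
    for start in range(0, len(payload), window):
        chunk = payload[start:start + window]
        reply = device.write(chunk)
        if len(chunk) == window:
            # A full window must be acknowledged before more data is sent.
            if reply != device.ACK:
                raise RuntimeError("device did not acknowledge window")
            acks_seen += 1
    return acks_seen


device = FakeDevice(window=256)
payload = b"x" * 1000
acks = send_with_flow_control(device, payload)
print(acks, len(device.received))
```

Because the host never has more than one window outstanding, the pace of the transfer is set by the device rather than by a fixed delay.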
Also, this new raw REPL mode parses and compiles the incoming data as it
comes in. It does this by creating a "stdin reader" object which is then
passed to the lexer. The lexer requests bytes from this "stdin reader"
which retrieves bytes from the host, and does flow control. What this
means is that no memory is used to store the script (in the existing raw
REPL mode the device needs a big buffer to read in the script before it can
pass it on to the lexer/parser/compiler). The only memory needed on the
device is enough to parse and compile.
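The streaming idea can be illustrated with a reader that only ever holds one chunk. This is a sketch under stated assumptions: the StdinReader class, the host_chunks generator and the line-counting consumer (a stand-in for the lexer) are all hypothetical.

```python
class StdinReader:
    """Hands out one byte at a time, pulling a new chunk only on demand."""

    def __init__(self, source):
        self._source = source  # iterator yielding chunks of bytes
        self._buf = b""
        self._pos = 0
        self.max_buffered = 0  # largest chunk ever held at once

    def readbyte(self):
        if self._pos == len(self._buf):
            # Current chunk is used up: fetch the next one from the host.
            self._buf = next(self._source, b"")
            self._pos = 0
            self.max_buffered = max(self.max_buffered, len(self._buf))
            if not self._buf:
                return None  # end of input
        byte = self._buf[self._pos]
        self._pos += 1
        return byte


def host_chunks(script, window):
    """Stand-in for the host, sending the script one window at a time."""
    for start in range(0, len(script), window):
        yield script[start:start + window]


def count_lines(reader):
    """Stand-in for the lexer: consumes bytes as they arrive, keeping none."""
    lines = 0
    while True:
        byte = reader.readbyte()
        if byte is None:
            return lines
        if byte == ord("\n"):
            lines += 1


script = b"# comment\n" * 3000 + b"print('done')\n"
reader = StdinReader(host_chunks(script, window=128))
lines = count_lines(reader)
print(len(script), lines, reader.max_buffered)
```

The consumer processes the whole script while the reader never holds more than one window of it.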
Finally, it would be possible to extend this new raw REPL to allow bytecode
(.mpy files) to be sent as well as text mode scripts (but that's not done
in this commit).
Some results follow. The test was to send a large 33k script that contains
mostly comments and then prints out the heap, run via pyboard.py large.py.
On PYBD-SF6, prior to this PR:
$ ./pyboard.py large.py
stack: 524 out of 23552
GC: total: 392192, used: 34464, free: 357728
No. of 1-blocks: 12, 2-blocks: 2, max blk sz: 2075, max free sz: 22345
GC memory layout; from 2001a3f0:
00000: h=hhhh=======================================hhBShShh==h=======h
00400: =====hh=B........h==h===========================================
00800: ================================================================
00c00: ================================================================
01000: ================================================================
01400: ================================================================
01800: ================================================================
01c00: ================================================================
02000: ================================================================
02400: ================================================================
02800: ================================================================
02c00: ================================================================
03000: ================================================================
03400: ================================================================
03800: ================================================================
03c00: ================================================================
04000: ================================================================
04400: ================================================================
04800: ================================================================
04c00: ================================================================
05000: ================================================================
05400: ================================================================
05800: ================================================================
05c00: ================================================================
06000: ================================================================
06400: ================================================================
06800: ================================================================
06c00: ================================================================
07000: ================================================================
07400: ================================================================
07800: ================================================================
07c00: ================================================================
08000: ================================================================
08400: ===============================================.....h==.........
(349 lines all free)
(the big blob of used memory is the large script).
Same but with this PR:
$ ./pyboard.py large.py
stack: 524 out of 23552
GC: total: 392192, used: 1296, free: 390896
No. of 1-blocks: 12, 2-blocks: 3, max blk sz: 40, max free sz: 24420
GC memory layout; from 2001a3f0:
00000: h=hhhh=======================================hhBShShh==h=======h
00400: =====hh=h=B......h==.....h==....................................
(381 lines all free)
The only thing in RAM is the compiled script (and some other unrelated
items).
Time to download before this PR: 1438ms, data rate: 230,799 bits/sec.
Time to download with this PR: 119ms, data rate: 2,788,991 bits/sec.
So it's more than 10 times faster, and uses significantly less RAM.
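The figures quoted above can be cross-checked with a short calculation. All inputs are copied from the results above; the only derived quantities are the ratio, the implied number of bits transferred, and the heap difference. (Reading the implied bit count as roughly 33.2 kB at 10 bits per byte is an inference, not something stated in the results.)

```python
# Figures copied from the results above.
before_ms, after_ms = 1438, 119
before_bps, after_bps = 230_799, 2_788_991
used_before, used_after = 34_464, 1_296

# Ratio of download times.
speedup = before_ms / after_ms

# Bits transferred implied by each (time, rate) pair; they should agree.
bits_before = before_bps * before_ms / 1000
bits_after = after_bps * after_ms / 1000

# Heap no longer needed to hold the script text.
heap_saved = used_before - used_after

print(round(speedup, 1))
print(round(bits_before), round(bits_after))
print(heap_saved)
```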
Results are similar on other boards. On an stm32 board that connects via
UART only at 115200 baud, the data rate goes from 80kbit/sec to
113kbit/sec, so gets close to saturating the UART link without loss of
data.
The new raw REPL mode also supports a single ctrl-C to break out of this
flow-control mode, so that a ctrl-C can always get back to a known state.
It's also backwards compatible with the original raw REPL mode, which is
still supported with the same sequence of commands. The new raw REPL
mode is activated by ctrl-E, which gives an error on devices that do not
support the new mode.
Signed-off-by: Damien George <damien@micropython.org>
Note: the uncrustify configuration is explicitly set to 'add' instead of
'force' in order not to alter the comments which use extra spaces after //
as a means of indenting text for clarity.
Pending exceptions would otherwise be handled later on where there may not
be an NLR handler in place.
A similar fix is also made to the unix port's REPL handler.
Fixes issues #4921 and #5488.
For the 3 ports that already make use of this feature (stm32, nrf and
teensy) this doesn't make any difference, it just allows it to be disabled
from now on.
For other ports that use pyexec, this decreases code size because the debug
printing code is dead (it can't be enabled) but the compiler can't deduce
that, so code is still emitted.
mp_compile no longer takes an emit_opt argument, rather this setting is now
provided by the global default_emit_opt variable.
Now, when -X emit=native is passed as a command-line option, the emitter
will be set for all compiled modules (including imports), not just the
top-level script.
In the future there could be a way to also set this variable from a script.
Fixes issue #4267.
So that boot.py and/or main.py can be frozen (either as STR or MPY) in the
same way that other scripts are frozen. Frozen scripts take precedence over
scripts in the VFS.
Replaces "PYB: soft reboot" with "MPY: soft reboot", etc.
Having a consistent prefix across ports reduces the difference between
ports, which is a general goal. And this change won't break pyboard.py
because that tool only looks for "soft reboot".
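To illustrate why the prefix change is harmless for a tool that matches only on "soft reboot", here is a small check. The helper saw_soft_reboot is hypothetical and is not pyboard.py's actual code.

```python
def saw_soft_reboot(output):
    """Return True if the device output contains the soft-reboot marker."""
    # Matching on the common substring ignores the port-specific prefix.
    return b"soft reboot" in output


old_message = b"PYB: soft reboot\r\n"
new_message = b"MPY: soft reboot\r\n"

print(saw_soft_reboot(old_message))
print(saw_soft_reboot(new_message))
```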
Otherwise there is really nothing that can be done: it can't be unlocked by
the user because there is no way to allocate memory to execute the unlock.
See issue #4205 and #4209.
Header files that are considered internal to the py core and should not
normally be included directly are:
py/nlr.h - internal nlr configuration and declarations
py/bc0.h - contains bytecode macro definitions
py/runtime0.h - contains basic runtime enums
Instead, the top-level header files to include are one of:
py/obj.h - includes runtime0.h and defines everything to use the
mp_obj_t type
py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h,
and defines everything to use the general runtime support functions
Additional, specific headers (eg py/objlist.h) can be included if needed.
This happens with some compilers on some architectures, which don't define
size_t as unsigned int. MicroPython's printf() doesn't support obscure
format specifiers for size_t, so the obvious choice is to explicitly cast
to unsigned, to match %u used in printf().
Now there is just one function to allocate a new vstr, namely vstr_new
(in addition to vstr_init etc). The caller of this function should know
what initial size to allocate for the buffer, or at least have some policy
or config option, instead of leaving it to a default (as it was before).
"Forced exit" is treated as a soft reboot (Ctrl+D). But the expected effect of
calling sys.exit() is termination of the current script, not any further
and more serious action such as the soft reboot mentioned above.
A port which uses lib/utils/pyexec.c but which does not enable garbage
collection should not need to implement the gc_collect function.
This patch also moves the gc_collect call to after printing the qstr
info. Since qstrs cannot be collected it should not make any difference
to the printed statistics.
The config variable MICROPY_MODULE_FROZEN is now made of two separate
parts: MICROPY_MODULE_FROZEN_STR and MICROPY_MODULE_FROZEN_MPY. This
allows having none, either or both of frozen strings and frozen mpy
files (aka frozen bytecode).
Before this change, if REPL blocked executing some code, it was possible
to still input new statements and execute them, all leading to weird,
and potentially dangerous interaction.
TODO: The current implementation may have issues processing input accumulated
while REPL was blocked.
This is a convenience function similar to pyexec_file. It should be used
instead of raw mp_parse_compile_execute because the latter does not catch
and report exceptions.
py/mphal.h contains declarations for generic mp_hal_XXX functions, such
as stdio and delay/ticks, which ports should provide definitions for. A
port will also provide mphalport.h with further HAL declarations.