This commit removes all parts of code associated with the existing
MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE optimisation option, including the
-mcache-lookup-bc option to mpy-cross.
This feature originally provided a significant performance boost for Unix,
but wasn't able to be enabled for MCU targets (due to frozen bytecode), and
added significant extra complexity to generating and distributing .mpy
files.
The equivalent performance gain is now provided by the combination of
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE (which has
been enabled on the unix port in the previous commit).
It's hard to provide precise performance numbers, but tests have been run
on a wide variety of architectures (x86-64, ARM Cortex, Aarch64, RISC-V,
xtensa) and they all generally agree on the qualitative improvements seen
by the combination of MICROPY_OPT_LOAD_ATTR_FAST_PATH and
MICROPY_OPT_MAP_LOOKUP_CACHE.
For example, on a "quiet" Linux x64 environment (i3-5010U @ 2.10GHz) the
change from CACHE_MAP_LOOKUP_IN_BYTECODE, to LOAD_ATTR_FAST_PATH combined
with MAP_LOOKUP_CACHE is:
diff of scores (higher is better)
N=2000 M=2000 bccache -> attrmapcache diff diff% (error%)
bm_chaos.py 13742.56 -> 13905.67 : +163.11 = +1.187% (+/-3.75%)
bm_fannkuch.py 60.13 -> 61.34 : +1.21 = +2.012% (+/-2.11%)
bm_fft.py 113083.20 -> 114793.68 : +1710.48 = +1.513% (+/-1.57%)
bm_float.py 256552.80 -> 243908.29 : -12644.51 = -4.929% (+/-1.90%)
bm_hexiom.py 521.93 -> 625.41 : +103.48 = +19.826% (+/-0.40%)
bm_nqueens.py 197544.25 -> 217713.12 : +20168.87 = +10.210% (+/-3.01%)
bm_pidigits.py 8072.98 -> 8198.75 : +125.77 = +1.558% (+/-3.22%)
misc_aes.py 17283.45 -> 16480.52 : -802.93 = -4.646% (+/-0.82%)
misc_mandel.py 99083.99 -> 128939.84 : +29855.85 = +30.132% (+/-5.88%)
misc_pystone.py 83860.10 -> 82592.56 : -1267.54 = -1.511% (+/-2.27%)
misc_raytrace.py 21490.40 -> 22227.23 : +736.83 = +3.429% (+/-1.88%)
This shows that the new optimisations are at least as good as the existing
inline-bytecode-caching, and are sometimes much better (because the new
ones apply caching to a wider variety of map lookups).
The new optimisations can also benefit code generated by the native
emitter, because they apply to the runtime rather than the generated code.
The improvement for the native emitter when LOAD_ATTR_FAST_PATH and
MAP_LOOKUP_CACHE are enabled is (same Linux environment as above):
diff of scores (higher is better)
N=2000 M=2000 native -> nat-attrmapcache diff diff% (error%)
bm_chaos.py 14130.62 -> 15464.68 : +1334.06 = +9.441% (+/-7.11%)
bm_fannkuch.py 74.96 -> 76.16 : +1.20 = +1.601% (+/-1.80%)
bm_fft.py 166682.99 -> 168221.86 : +1538.87 = +0.923% (+/-4.20%)
bm_float.py 233415.23 -> 265524.90 : +32109.67 = +13.756% (+/-2.57%)
bm_hexiom.py 628.59 -> 734.17 : +105.58 = +16.796% (+/-1.39%)
bm_nqueens.py 225418.44 -> 232926.45 : +7508.01 = +3.331% (+/-3.10%)
bm_pidigits.py 6322.00 -> 6379.52 : +57.52 = +0.910% (+/-5.62%)
misc_aes.py 20670.10 -> 27223.18 : +6553.08 = +31.703% (+/-1.56%)
misc_mandel.py 138221.11 -> 152014.01 : +13792.90 = +9.979% (+/-2.46%)
misc_pystone.py 85032.14 -> 105681.44 : +20649.30 = +24.284% (+/-2.25%)
misc_raytrace.py 19800.01 -> 23350.73 : +3550.72 = +17.933% (+/-2.79%)
In summary, compared to MICROPY_OPT_CACHE_MAP_LOOKUP_IN_BYTECODE, the new
MICROPY_OPT_LOAD_ATTR_FAST_PATH and MICROPY_OPT_MAP_LOOKUP_CACHE options:
- are simpler;
- take less code size;
- are faster (generally);
- work with code generated by the native emitter;
- can be used on embedded targets with a small and constant RAM overhead;
- allow the same .mpy bytecode to run on all targets.
See #7680 for further discussion. And see also #7653 for a discussion
about simplifying mpy-cross options.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Don't want users to accidentally use boot.py (because recovering requires
knowing how to activate safe mode).
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
To simplify config, there's no need to specify MP_PLAT_PRINT_STRN if it's
the same as the default definition in py/mpconfig.h.
Signed-off-by: Damien George <damien@micropython.org>
* Fix a typo in the Makefile that prevented the debug build to be actually
enabled when BTYPE=debug is used.
* Add a missing header in modmachine.c that is used when a debug build is
created.
This commit improves some FTP implementation details for better
compatibility with FTP clients:
* The PWD command now puts quotes around the directory name before
returning it. This fixes BBEdit’s FTP client, which performs a PWD after
each CWD and gets confused if the returned directory path is not
surrounded by quotes.
* The FEAT command is now allowed before logging in. This fixes the lftp
client, which send FEAT first and gets confused (tries to use TLS) if the
server responds with 332.
To portably get the Epoch. This is simply aliased to localtime() on ports
that are not timezone aware.
Signed-off-by: Damien George <damien@micropython.org>
Updating to Black v20.8b1 there are two changes that affect the code in
this repository:
- If there is a trailing comma in a list (eg [], () or function call) then
that list is now written out with one line per element. So remove such
trailing commas where the list should stay on one line.
- Spaces at the start of """ doc strings are removed.
Signed-off-by: Damien George <damien@micropython.org>
The irq.h file now just provides low-level IRQ definitions and priorities.
All Python binding definitions are moved to modmachine.h, with some
renaming of pyb -> machine, and also the machine_idle definition (was
pyb_wfi) is moved to modmachine.c.
The cc3200 and teensy ports are updated to build with these changes.
Signed-off-by: Damien George <damien@micropython.org>
With only `sp_func_proto_paren = remove` set there are some cases where
uncrustify misses removing a space between the function name and the
opening '('. This sets all of the related options to `force` as well.
No functionality change is intended with this commit, it just consolidates
the separate implementations of GC helper code to the lib/utils/ directory
as a general set of helper functions useful for any port. This reduces
duplication of code, and makes it easier for future ports or embedders to
get the GC implementation correct.
Ports should now link against gchelper_native.c and either gchelper_m0.s or
gchelper_m3.s (currently only Cortex-M is supported but other architectures
can follow), or use the fallback gchelper_generic.c which will work on
x86/x64/ARM.
The gc_helper_get_sp function from gchelper_m3.s is not really GC related
and was only used by cc3200, so it has been moved to that port and renamed
to cortex_m3_get_sp.
Note: the uncrustify configuration is explicitly set to 'add' instead of
'force' in order not to alter the comments which use extra spaces after //
as a means of indenting text for clarity.
This commit makes all functions and function wrappers in modubinascii.c
STATIC and conditional on the MICROPY_PY_UBINASCII setting, which will
exclude the file from qstr/ compressed-string searching when ubinascii is
not enabled. The now-unused modubinascii.h header file is also removed.
The cc3200 port is updated accordingly to use this module in its entirety
instead of providing its own top-level definition of ubinascii.
This was originally like this because the cc3200 port has its own ubinascii
module which referenced these methods. The plan appeared to be that the
API might diverge (e.g. hardware crc), but this should be done similar to
I2C/SPI via a port-specific handler, rather than the port having its own
definition of the module. Having a centralised module definition also
enforces consistency of the API among ports.
This string is recognised by uncrustify, to disable formatting in the
region marked by these comments. This is necessary in the qstrdef*.h files
to prevent modification of the strings within the Q(...). In other places
it is used to prevent excessive reformatting that would make the code less
readable.
This provides a more consistent C-level API to raise exceptions, ie moving
away from nlr_raise towards mp_raise_XXX. It also reduces code size by a
small amount on some ports.
This function is tightly coupled to the state and behaviour of the
scheduler, and is a core part of the runtime: to schedule a pending
exception. So move it there.
This modifies the signature of mp_thread_set_state() to use
mp_state_thread_t* instead of void*. This matches the return type of
mp_thread_get_state(), which returns the same value.
`struct _mp_state_thread_t;` had to be moved before
`#include <mpthreadport.h>` since the stm32 port uses it in its
mpthreadport.h file.
This commit implements automatic module weak links for all built-in
modules, by searching for "ufoo" in the built-in module list if "foo"
cannot be found. This means that all modules named "ufoo" are always
available as "foo". Also, a port can no longer add any other weak links,
which makes strict the definition of a weak link.
It saves some code size (about 100-200 bytes) on ports that previously had
lots of weak links.
Some changes from the previous behaviour:
- It doesn't intern the non-u module names (eg "foo" is not interned),
which saves code size, but will mean that "import foo" creates a new qstr
(namely "foo") in RAM (unless the importing module is frozen).
- help('modules') no longer lists non-u module names, only the u-variants;
this reduces duplication in the help listing.
Weak links are effectively the same as having a set of symbolic links on
the filesystem that is searched last. So an "import foo" will search
built-in modules first, then all paths in sys.path, then weak links last,
importing "ufoo" if it exists. Thus a file called "foo.py" somewhere in
sys.path will still have precedence over the weak link of "foo" to "ufoo".
See issues: #1740, #4449, #5229, #5241.
The stm32 and nrf ports already had the behaviour that they would first
check if the script exists before executing it, and this patch makes all
other ports work the same way. This helps when developing apps because
it's hard to tell (when unconditionally trying to execute the scripts) if
the resulting OSError at boot up comes from missing boot.py or main.py, or
from some other error. And it's not really an error if these scripts don't
exist.
Replaces "PYB: soft reboot" with "MPY: soft reboot", etc.
Having a consistent prefix across ports reduces the difference between
ports, which is a general goal. And this change won't break pyboard.py
because that tool only looks for "soft reboot".
This patch moves the implementation of stream closure from a dedicated
method to the ioctl of the stream protocol, for each type that implements
closing. The benefits of this are:
1. Rounds out the stream ioctl function, which already includes flush,
seek and poll (among other things).
2. Makes calling mp_stream_close() on an object slightly more efficient
because it now no longer needs to lookup the close method and call it,
rather it just delegates straight to the ioctl function (if it exists).
3. Reduces code size and allows future types that implement the stream
protocol to be smaller because they don't need a dedicated close method.
Code size reduction is around 200 bytes smaller for x86 archs and around
30 bytes smaller for the bare-metal archs.
This patch simplifies the str creation API to favour the common case of
creating a str object that is not forced to be interned. To force
interning of a new str the new mp_obj_new_str_via_qstr function is added,
and should only be used if warranted.
Apart from simplifying the mp_obj_new_str function (and making it have the
same signature as mp_obj_new_bytes), this patch also reduces code size by a
bit (-16 bytes for bare-arm and roughly -40 bytes on the bare-metal archs).
Header files that are considered internal to the py core and should not
normally be included directly are:
py/nlr.h - internal nlr configuration and declarations
py/bc0.h - contains bytecode macro definitions
py/runtime0.h - contains basic runtime enums
Instead, the top-level header files to include are one of:
py/obj.h - includes runtime0.h and defines everything to use the
mp_obj_t type
py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h,
and defines everything to use the general runtime support functions
Additional, specific headers (eg py/objlist.h) can be included if needed.
This is to keep the top-level directory clean, to make it clear what is
core and what is a port, and to allow the repository to grow with new ports
in a sustainable way.