mpy-cross uses MICROPY_DYNAMIC_COMPILER and MICROPY_EMIT_NATIVE but does
not actually need to execute native functions, and does not need
mp_fun_table. This commit makes it so mp_fun_table and all its entries are
not built when MICROPY_DYNAMIC_COMPILER is enabled, significantly reducing
the size of the mpy-cross executable and allowing it to be built on more
machines/OS's.
So these constant objects can be loaded by dereferencing the REG_FUN_TABLE
pointer instead of loading immediate values. This reduces the size of
generated native code (when such constants are used), and means that
pointers to these constants are no longer stored in the assembly code.
This commit adds first class support for yield and yield-from in the native
emitter, including send and throw support, and yields enclosed in exception
handlers (which requires pulling down the NLR stack before yielding, then
rebuilding it when resuming).
This has been fully tested and is working on unix x86 and x86-64, and
stm32. Also basic tests have been done with the esp8266 port. Performance
of existing native code is unchanged.
This commit makes viper functions have the same signature as native
functions, at the level of the emitter/assembler. This means that viper
functions can now be wrapped in the same uPy object as native functions.
Viper functions are now responsible for parsing their arguments (before it
was done by the runtime), and this makes calling them more efficient (in
most cases) because the viper entry code can be custom generated to suit
the signature of the function.
This change also opens the way forward for viper functions to take
arbitrary numbers of arguments, and for them to handle globals correctly,
among other things.
Prior to this commit a function compiled with the native decorator
@micropython.native would not work correctly when accessing global
variables, because the globals dict was not being set upon function entry.
This commit fixes this problem by, upon function entry, setting as the
current globals dict the globals dict context the function was defined
within, as per normal Python semantics, and as bytecode does. Upon
function exit the original globals dict is restored.
In order to restore the globals dict when an exception is raised the native
function must guard its internals with an nlr_push/nlr_pop pair. Because
this push/pop is relatively expensive, in both C stack usage for the
nlr_buf_t and CPU execution time, the implementation here optimises things
as much as possible. First, the compiler keeps track of whether a function
even needs to access global variables. Using this information the native
emitter then generates three different kinds of code:
1. no globals used, no exception handlers: no nlr handling code and no
setting of the globals dict.
2. globals used, no exception handlers: an nlr_buf_t is allocated on the
C stack but it is not used if the globals dict is unchanged, saving
execution time because nlr_push/nlr_pop don't need to run.
3. function has exception handlers, may use globals: an nlr_buf_t is
allocated and nlr_push/nlr_pop are always called.
In the end, native functions that don't access globals and don't have
exception handlers will run more efficiently than those that do.
Fixes issue #1573.
Header files that are considered internal to the py core and should not
normally be included directly are:
py/nlr.h - internal nlr configuration and declarations
py/bc0.h - contains bytecode macro definitions
py/runtime0.h - contains basic runtime enums
Instead, the top-level header files to include are one of:
py/obj.h - includes runtime0.h and defines everything to use the
mp_obj_t type
py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h,
and defines everything to use the general runtime support functions
Additional, specific headers (eg py/objlist.h) can be included if needed.
This patch allows the following code to run without allocating on the heap:
super().foo(...)
Before this patch such a call would allocate a super object on the heap and
then load the foo method and call it right away. The super object is only
needed to perform the lookup of the method and not needed after that. This
patch makes an optimisation to allocate the super object on the C stack and
discard it right after use.
Changes in code size due to this patch are:
bare-arm: +128
minimal: +232
unix x64: +416
unix nanbox: +364
stmhal: +184
esp8266: +340
cc3200: +128
This allows you to pass a number (being an address) to a viper function
that expects a pointer, and also allows casting of integers to pointers
within viper functions.
This was actually the original behaviour, but it regressed due to native
type identifiers being promoted to 4 bits in width.
Before this patch, the native types for uint and ptr/ptr8/ptr16/ptr32
all overlapped and it was possible to make a mistake in casting. Now,
these types are all separate and any coding mistakes will be raised
as runtime errors.
Previous to this patch each time a bytes object was referenced a new
instance (with the same data) was created. With this patch a single
bytes object is created in the compiler and is loaded directly at execute
time as a true constant (similar to loading bignum and float objects).
This saves on allocating RAM and means that bytes objects can now be
used when the memory manager is locked (eg in interrupts).
The MP_BC_LOAD_CONST_BYTES bytecode was removed as part of this.
Generated bytecode is slightly larger due to storing a pointer to the
bytes object instead of the qstr identifier.
Code size is reduced by about 60 bytes on Thumb2 architectures.
This patch gets full function argument passing working with native
emitter. Includes named args, keyword args, default args, var args
and var keyword args. Fully Python compliant.
It reuses the bytecode mp_setup_code_state function to do all the hard
work. This function is slightly adjusted to accommodate native calls,
and the native emitter is forced a bit to emit similar prelude and
code-info as bytecode.
Previous to this patch, a big-int, float or imag constant was interned
(made into a qstr) and then parsed at runtime to create an object each
time it was needed. This is wasteful in RAM and not efficient. Now,
these constants are parsed straight away in the parser and turned into
objects. This allows constants with large numbers of digits (so
addresses issue #1103) and takes us a step closer to #722.
Viper can now do the following:
def store(p:ptr8, c:int):
p[0] = c
This does a store of c to the memory pointed to by p using a machine
instructions inline in the code.
This way, the native glue code is only compiled if native code is
enabled (which makes complete sense; thanks to Paul Sokolovsky for
the idea).
Should fix issue #834.