micropython

Commit Graph

Author	SHA1	Message	Date
Damien George	1762990579	py/bc: Provide separate code-state setup funcs for bytecode and native. mpy-cross will now generate native code based on the size of mp_code_state_native_t, and the runtime will use this struct to calculate the offset of the .state field. This makes native code generation and execution (which rely on this struct) independent to the settings MICROPY_STACKLESS and MICROPY_PY_SYS_SETTRACE, both of which change the size of the mp_code_state_t struct. Fixes issue #5059. Signed-off-by: Damien George <damien@micropython.org>	3 years ago
Jim Mussared	0e7bfc88c6	all: Use mp_obj_malloc everywhere it's applicable. This replaces occurences of foo_t foo = m_new_obj(foo_t); foo->base.type = &foo_type; with foo_t foo = mp_obj_malloc(foo_t, &foo_type); Excludes any places where base is a sub-field or when new0/memset is used. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>	3 years ago
Damien George	f2040bfc7e	py: Rework bytecode and .mpy file format to be mostly static data. Background: .mpy files are precompiled .py files, built using mpy-cross, that contain compiled bytecode functions (and can also contain machine code). The benefit of using an .mpy file over a .py file is that they are faster to import and take less memory when importing. They are also smaller on disk. But the real benefit of .mpy files comes when they are frozen into the firmware. This is done by loading the .mpy file during compilation of the firmware and turning it into a set of big C data structures (the job of mpy-tool.py), which are then compiled and downloaded into the ROM of a device. These C data structures can be executed in-place, ie directly from ROM. This makes importing even faster because there is very little to do, and also means such frozen modules take up much less RAM (because their bytecode stays in ROM). The downside of frozen code is that it requires recompiling and reflashing the entire firmware. This can be a big barrier to entry, slows down development time, and makes it harder to do OTA updates of frozen code (because the whole firmware must be updated). This commit attempts to solve this problem by providing a solution that sits between loading .mpy files into RAM and freezing them into the firmware. The .mpy file format has been reworked so that it consists of data and bytecode which is mostly static and ready to run in-place. If these new .mpy files are located in flash/ROM which is memory addressable, the .mpy file can be executed (mostly) in-place. With this approach there is still a small amount of unpacking and linking of the .mpy file that needs to be done when it's imported, but it's still much better than loading an .mpy from disk into RAM (although not as good as freezing .mpy files into the firmware). The main trick to make static .mpy files is to adjust the bytecode so any qstrs that it references now go through a lookup table to convert from local qstr number in the module to global qstr number in the firmware. That means the bytecode does not need linking/rewriting of qstrs when it's loaded. Instead only a small qstr table needs to be built (and put in RAM) at import time. This means the bytecode itself is static/constant and can be used directly if it's in addressable memory. Also the qstr string data in the .mpy file, and some constant object data, can be used directly. Note that the qstr table is global to the module (ie not per function). In more detail, in the VM what used to be (schematically): qst = DECODE_QSTR_VALUE; is now (schematically): idx = DECODE_QSTR_INDEX; qst = qstr_table[idx]; That allows the bytecode to be fixed at compile time and not need relinking/rewriting of the qstr values. Only qstr_table needs to be linked when the .mpy is loaded. Incidentally, this helps to reduce the size of bytecode because what used to be 2-byte qstr values in the bytecode are now (mostly) 1-byte indices. If the module uses the same qstr more than two times then the bytecode is smaller than before. The following changes are measured for this commit compared to the previous (the baseline): - average 7%-9% reduction in size of .mpy files - frozen code size is reduced by about 5%-7% - importing .py files uses about 5% less RAM in total - importing .mpy files uses about 4% less RAM in total - importing .py and .mpy files takes about the same time as before The qstr indirection in the bytecode has only a small impact on VM performance. For stm32 on PYBv1.0 the performance change of this commit is: diff of scores (higher is better) N=100 M=100 baseline -> this-commit diff diff% (error%) bm_chaos.py 371.07 -> 357.39 : -13.68 = -3.687% (+/-0.02%) bm_fannkuch.py 78.72 -> 77.49 : -1.23 = -1.563% (+/-0.01%) bm_fft.py 2591.73 -> 2539.28 : -52.45 = -2.024% (+/-0.00%) bm_float.py 6034.93 -> 5908.30 : -126.63 = -2.098% (+/-0.01%) bm_hexiom.py 48.96 -> 47.93 : -1.03 = -2.104% (+/-0.00%) bm_nqueens.py 4510.63 -> 4459.94 : -50.69 = -1.124% (+/-0.00%) bm_pidigits.py 650.28 -> 644.96 : -5.32 = -0.818% (+/-0.23%) core_import_mpy_multi.py 564.77 -> 581.49 : +16.72 = +2.960% (+/-0.01%) core_import_mpy_single.py 68.67 -> 67.16 : -1.51 = -2.199% (+/-0.01%) core_qstr.py 64.16 -> 64.12 : -0.04 = -0.062% (+/-0.00%) core_yield_from.py 362.58 -> 354.50 : -8.08 = -2.228% (+/-0.00%) misc_aes.py 429.69 -> 405.59 : -24.10 = -5.609% (+/-0.01%) misc_mandel.py 3485.13 -> 3416.51 : -68.62 = -1.969% (+/-0.00%) misc_pystone.py 2496.53 -> 2405.56 : -90.97 = -3.644% (+/-0.01%) misc_raytrace.py 381.47 -> 374.01 : -7.46 = -1.956% (+/-0.01%) viper_call0.py 576.73 -> 572.49 : -4.24 = -0.735% (+/-0.04%) viper_call1a.py 550.37 -> 546.21 : -4.16 = -0.756% (+/-0.09%) viper_call1b.py 438.23 -> 435.68 : -2.55 = -0.582% (+/-0.06%) viper_call1c.py 442.84 -> 440.04 : -2.80 = -0.632% (+/-0.08%) viper_call2a.py 536.31 -> 532.35 : -3.96 = -0.738% (+/-0.06%) viper_call2b.py 382.34 -> 377.07 : -5.27 = -1.378% (+/-0.03%) And for unix on x64: diff of scores (higher is better) N=2000 M=2000 baseline -> this-commit diff diff% (error%) bm_chaos.py 13594.20 -> 13073.84 : -520.36 = -3.828% (+/-5.44%) bm_fannkuch.py 60.63 -> 59.58 : -1.05 = -1.732% (+/-3.01%) bm_fft.py 112009.15 -> 111603.32 : -405.83 = -0.362% (+/-4.03%) bm_float.py 246202.55 -> 247923.81 : +1721.26 = +0.699% (+/-2.79%) bm_hexiom.py 615.65 -> 617.21 : +1.56 = +0.253% (+/-1.64%) bm_nqueens.py 215807.95 -> 215600.96 : -206.99 = -0.096% (+/-3.52%) bm_pidigits.py 8246.74 -> 8422.82 : +176.08 = +2.135% (+/-3.64%) misc_aes.py 16133.00 -> 16452.74 : +319.74 = +1.982% (+/-1.50%) misc_mandel.py 128146.69 -> 130796.43 : +2649.74 = +2.068% (+/-3.18%) misc_pystone.py 83811.49 -> 83124.85 : -686.64 = -0.819% (+/-1.03%) misc_raytrace.py 21688.02 -> 21385.10 : -302.92 = -1.397% (+/-3.20%) The code size change is (firmware with a lot of frozen code benefits the most): bare-arm: +396 +0.697% minimal x86: +1595 +0.979% [incl +32(data)] unix x64: +2408 +0.470% [incl +800(data)] unix nanbox: +1396 +0.309% [incl -96(data)] stm32: -1256 -0.318% PYBV10 cc3200: +288 +0.157% esp8266: -260 -0.037% GENERIC esp32: -216 -0.014% GENERIC[incl -1072(data)] nrf: +116 +0.067% pca10040 rp2: -664 -0.135% PICO samd: +844 +0.607% ADAFRUIT_ITSYBITSY_M4_EXPRESS As part of this change the .mpy file format version is bumped to version 6. And mpy-tool.py has been improved to provide a good visualisation of the contents of .mpy files. In summary: this commit changes the bytecode to use qstr indirection, and reworks the .mpy file format to be simpler and allow .mpy files to be executed in-place. Performance is not impacted too much. Eventually it will be possible to store such .mpy files in a linear, read-only, memory- mappable filesystem so they can be executed from flash/ROM. This will essentially be able to replace frozen code for most applications. Signed-off-by: Damien George <damien@micropython.org>	3 years ago
stijn	a9448c0a86	all: Fix MICROPY_OBJ_REPR_D compilation with msvc.	3 years ago
Damien George	cfd08448a1	py: Mark unused arguments from bytecode decoding macros. Signed-off-by: Damien George <damien@micropython.org>	3 years ago
Damien George	925bd67cfb	py/objfun: Support fun.__globals__ attribute. This returns a reference to the globals dict associated with the function, ie the global scope that the function was defined in. This attribute is read-only but the dict itself is modifiable, per CPython behaviour. Signed-off-by: Damien George <damien@micropython.org>	4 years ago
Damien George	332d83343f	py: Rework mp_convert_member_lookup to properly handle built-ins. This commit fixes lookups of class members to make it so that built-in functions that are used as methods/functions of a class work correctly. The mp_convert_member_lookup() function is pretty much completely changed by this commit, but for the most part it's just reorganised and the indenting changed. The functional changes are: - staticmethod and classmethod checks moved to later in the if-logic, because they are less common and so should be checked after the more common cases. - The explicit mp_obj_is_type(member, &mp_type_type) check is removed because it's now subsumed by other, more general tests in this function. - MP_TYPE_FLAG_BINDS_SELF and MP_TYPE_FLAG_BUILTIN_FUN type flags added to make the checks in this function much simpler (now they just test this bit in type->flags). - An extra check is made for mp_obj_is_instance_type(type) to fix lookup of built-in functions. Fixes #1326 and #6198. Signed-off-by: Damien George <damien@micropython.org>	4 years ago
David Lechner	ecd7826316	tools/codeformat.py: Remove sizeof fixup. Formatting for `* sizeof` was fixed in uncrustify v0.71, so we no longer need the fixups for it. Also, there was one file where the updated uncrustify caught a problem that the regex didn't pick up, which is updated in this commit. Signed-off-by: David Lechner <david@pybricks.com>	4 years ago
David Lechner	8a4ce6b79a	tools/codeformat.py: Eliminate need for sizeof fixup. This eliminates the need for the sizeof regex fixup by rearranging things a bit. All other bitfields already use the parentheses around expressions with sizeof, so one case is fixed by following this convention. VM_MAX_STATE_ON_STACK is the only remaining problem and it can be worked around by changing the order of the operands.	5 years ago
Damien George	69661f3343	all: Reformat C and Python source code with tools/codeformat.py. This is run with uncrustify 0.70.1, and black 19.10b0.	5 years ago
Damien George	a1b18b3ba7	py: Removing dangling "else" to improve code format consistency.	5 years ago
Damien George	bfbd94401d	py: Make mp_obj_get_type() return a const ptr to mp_obj_type_t. Most types are in rodata/ROM, and mp_obj_base_t.type is a constant pointer, so enforce this const-ness throughout the code base. If a type ever needs to be modified (eg a user type) then a simple cast can be used.	5 years ago
Damien George	c8c0fd4ca3	py: Rework and compress second part of bytecode prelude. This patch compresses the second part of the bytecode prelude which contains the source file name, function name, source-line-number mapping and cell closure information. This part of the prelude now begins with a single varible length unsigned integer which encodes 2 numbers, being the byte-size of the following 2 sections in the header: the "source info section" and the "closure section". After decoding this variable unsigned integer it's possible to skip over one or both of these sections very easily. This scheme saves about 2 bytes for most functions compared to the original format: one in the case that there are no closure cells, and one because padding was eliminated.	5 years ago
Damien George	b5ebfadbd6	py: Compress first part of bytecode prelude. The start of the bytecode prelude contains 6 numbers telling the amount of stack needed for the Python values and exceptions, and the signature of the function. Prior to this patch these numbers were all encoded one after the other (2x variable unsigned integers, then 4x bytes), but using so many bytes is unnecessary. An entropy analysis of around 150,000 bytecode functions from the CPython standard library showed that the optimal Shannon coding would need about 7.1 bits on average to encode these 6 numbers, compared to the existing 48 bits. This patch attempts to get close to this optimal value by packing the 6 numbers into a single, varible-length unsigned integer via bit-wise interleaving. The interleaving scheme is chosen to minimise the average number of bytes needed, and at the same time keep the scheme simple enough so it can be implemented without too much overhead in code size or speed. The scheme requires about 10.5 bits on average to store the 6 numbers. As a result most functions which originally took 6 bytes to encode these 6 numbers now need only 1 byte (in 80% of cases).	5 years ago
Damien George	81d04a0200	py: Add n_state to mp_code_state_t struct. This value is used often enough that it is better to cache it instead of decode it each time.	5 years ago
Damien George	cd35dd9d9a	py: Allow to pass in read-only buffers to viper and inline-asm funcs. Fixes #4936.	5 years ago
Jun Wu	089c9b71d1	py: remove "if (0)" and "if (false)" branches. Prior to this commit, building the unix port with `DEBUG=1` and `-finstrument-functions` the compilation would fail with an error like "control reaches end of non-void function". This change fixes this by removing the problematic "if (0)" branches. Not all branches affect compilation, but they are all removed for consistency.	6 years ago
Damien George	0e4c24ec08	py/nativeglue: Rename native convert funs to match other native helpers.	6 years ago
Damien George	7bc71f5446	py/objfun: Make fun_data arg of mp_obj_new_fun_asm() a const pointer.	6 years ago
Damien George	eee1e8841a	py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API. These macros could in principle be (inline) functions so it makes sense to have them lower case, to match the other C API functions. The remaining macros that are upper case are: - MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR - MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE - MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE - MP_OBJ_FUN_MAKE_SIG - MP_DECLARE_CONST_xxx - MP_DEFINE_CONST_xxx These must remain macros because they are used when defining const data (at least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have MP_OBJ_SMALL_INT_VALUE also a macro). For those macros that have been made lower case, compatibility macros are provided for the old names so that users do not need to change their code immediately.	6 years ago
Damien George	afecc124e6	py: Fix location of VM returned exception in invalid opcode and comments The location for a returned exception was changed to state[0] in `d95947b48a`	6 years ago
Damien George	6d19934463	py: Get optional VM stack overflow check compiling and working again. Changes to the layout of the bytecode header meant that this debug code was no longer compiling. This is now fixed and a new compile-time option is introduced, MICROPY_DEBUG_VM_STACK_OVERFLOW, to turn on this feature (which is disabled by default). This option is needed because more than one file needs to cooperate to make this check work.	6 years ago
Damien George	cc2bd63c57	py/emitnative: Implement yield and yield-from in native emitter. This commit adds first class support for yield and yield-from in the native emitter, including send and throw support, and yields enclosed in exception handlers (which requires pulling down the NLR stack before yielding, then rebuilding it when resuming). This has been fully tested and is working on unix x86 and x86-64, and stm32. Also basic tests have been done with the esp8266 port. Performance of existing native code is unchanged.	6 years ago
Damien George	d95947b48a	py/vm: When VM raises exception put exc obj at beginning of func state. Instead of at end of state, n_state - 1. It was originally (way back in v1.0) put at the end of the state because the VM didn't have a pointer to the start. But now that the VM takes a mp_code_state_t pointer it does have a pointer to the start of the state so can put the exception object there. This commit saves about 30 bytes of code on all architectures, and, more importantly, reduces C-stack usage by a couple of words (8 bytes on Thumb2 and 16 bytes on x86-64) for every (non-generator) call of a bytecode function because fun_bc_call no longer needs to remember the n_state variable.	6 years ago
Damien George	43f1848bfa	py: Make viper functions have the same entry signature as native. This commit makes viper functions have the same signature as native functions, at the level of the emitter/assembler. This means that viper functions can now be wrapped in the same uPy object as native functions. Viper functions are now responsible for parsing their arguments (before it was done by the runtime), and this makes calling them more efficient (in most cases) because the viper entry code can be custom generated to suit the signature of the function. This change also opens the way forward for viper functions to take arbitrary numbers of arguments, and for them to handle globals correctly, among other things.	6 years ago
Damien George	9f241ef398	py: Optimise call to mp_arg_check_num by compressing fun signature. With 5 arguments to mp_arg_check_num(), some architectures need to pass values on the stack. So compressing n_args_min, n_args_max, takes_kw into a single word and passing only 3 arguments makes the call more efficient, because almost all calls to this function pass in constant values. Code size is also reduced by a decent amount: bare-arm: -116 minimal x86: -64 unix x64: -256 unix nanbox: -112 stm32: -324 cc3200: -192 esp8266: -192 esp32: -144	6 years ago
Damien George	b630dfcc1d	py: Fix compiling with debug enabled and make more use of DEBUG_printf. DEBUG_printf and MICROPY_DEBUG_PRINTER is now used instead of normal printf, and a fault is fixed in mp_obj_class_lookup with debugging enabled; see issue #3999. Debugging can now be enabled on all ports including when nan-boxing is used.	6 years ago
Damien George	e2e22e3d7e	py/objgenerator: Implement __name__ with normal fun attr accessor code. With the recent change `b488a4a848`, a generating function now has the same layout in memory as a normal bytecode function, and so can reuse the latter's attribute accessor code to implement __name__.	6 years ago
Tom Collins	a883fe12d9	py/objfun: Fix variable name in DECODE_CODESTATE_SIZE() macro. This patch fixes the macro so you can pass any name in, and the macro will make more sense if you're reading it on its own. It worked previously because n_state is always passed in as n_state_out_var.	7 years ago
Damien George	1e5a33df41	py: Convert all uses of alloca() to use new scoped allocation API.	7 years ago
Paul Sokolovsky	2b00181592	py/objfun: Factor out macro for initializing codestate. This is second part of fun_bc_call() vs mp_obj_fun_bc_prepare_codestate() common code refactor. This factors out code to initialize codestate object. After this patch, mp_obj_fun_bc_prepare_codestate() is effectively DECODE_CODESTATE_SIZE() followed by allocation followed by INIT_CODESTATE(), and fun_bc_call() starts with that too.	7 years ago
Paul Sokolovsky	d72370def7	py/objfun, vm: Add comments on codestate allocation in stackless mode.	7 years ago
Paul Sokolovsky	fca1d1aa62	py/objfun: Factor out macro for decoding codestate size. fun_bc_call() starts with almost the same code as mp_obj_fun_bc_prepare_codestate(), the only difference is a way to allocate the codestate object (heap vs stack with heap fallback). Still, would be nice to avoid code duplication to make further refactoring easier. So, this commit factors out the common code before the allocation - decoding and calculating codestate size. It produces two values, so structured as a macro which writes to 2 variables passed as arguments.	7 years ago
Damien George	a3dc1b1957	all: Remove inclusion of internal py header files. Header files that are considered internal to the py core and should not normally be included directly are: py/nlr.h - internal nlr configuration and declarations py/bc0.h - contains bytecode macro definitions py/runtime0.h - contains basic runtime enums Instead, the top-level header files to include are one of: py/obj.h - includes runtime0.h and defines everything to use the mp_obj_t type py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h, and defines everything to use the general runtime support functions Additional, specific headers (eg py/objlist.h) can be included if needed.	7 years ago
Stefan Naumann	ace9fb5405	py: Add verbose debug compile-time flag MICROPY_DEBUG_VERBOSE. It enables all the DEBUG_printf outputs in the py/ source code.	7 years ago
Alexander Steffen	55f33240f3	all: Use the name MicroPython consistently in comments There were several different spellings of MicroPython present in comments, when there should be only one.	7 years ago
Damien George	a8a5d1e8c8	py: Provide mp_decode_uint_skip() to help reduce stack usage. Taking the address of a local variable leads to increased stack usage, so the mp_decode_uint_skip() function is added to reduce the need for taking addresses. The changes in this patch reduce stack usage of a Python call by 8 bytes on ARM Thumb, by 16 bytes on non-windowing Xtensa archs, and by 16 bytes on x86-64. Code size is also slightly reduced on most archs by around 32 bytes.	8 years ago
Damien George	6b34107537	py: Change mp_uint_t to size_t for mp_obj_str_get_data len arg.	8 years ago
Damien George	6213ad7f46	py: Convert mp_uint_t to size_t for tuple/list accessors. This patch changes mp_uint_t to size_t for the len argument of the following public facing C functions: mp_obj_tuple_get mp_obj_list_get mp_obj_get_array These functions take a pointer to the len argument (to be filled in by the function) and callers of these functions should update their code so the type of len is changed to size_t. For ports that don't use nan-boxing there should be no change in generate code because the size of the type remains the same (word sized), and in a lot of cases there won't even be a compiler warning if the type remains as mp_uint_t. The reason for this change is to standardise on the use of size_t for variables that count memory (or memory related) sizes/lengths. It helps builds that use nan-boxing.	8 years ago
Damien George	71a3d6ec3b	py: Reduce size of mp_code_state_t structure. Instead of caching data that is constant (code_info, const_table and n_state), store just a pointer to the underlying function object from which this data can be derived. This helps reduce stack usage for the case when the mp_code_state_t structure is stored on the stack, as well as heap usage when it's stored on the heap. The downside is that the VM becomes a little more complex because it now needs to derive the data from the underlying function object. But this doesn't impact the performance by much (if at all) because most of the decoding of data is done outside the main opcode loop. Measurements using pystone show that little to no performance is lost. This patch also fixes a nasty bug whereby the bytecode can be reclaimed by the GC during execution. With this patch there is always a pointer to the function object held by the VM during execution, since it's stored in the mp_code_state_t structure.	8 years ago
Krzysztof Blazewicz	7e480e8a30	py: Use mp_obj_get_array where sequence may be a tuple or a list.	8 years ago
Damien George	dbcdb9f8d8	py/objfun: Convert mp_uint_t to size_t where appropriate.	8 years ago
Damien George	ad297a1950	py: Allow inline-assembler emitter to be generic. This patch refactors some code so that it is easier to integrate new inline assemblers for different architectures other than ARM Thumb.	8 years ago
Damien George	571e6f26db	py: Specialise builtin funcs to use separate type for fixed arg count. Builtin functions with a fixed number of arguments (0, 1, 2 or 3) are quite common. Before this patch the wrapper for such a function cost 3 machine words. After this patch it only takes 2, which can reduce the code size by quite a bit (and pays off even more, the more functions are added). It also makes function dispatch slightly more efficient in CPU usage, and furthermore reduces stack usage for these cases. On x86 and Thumb archs the dispatch functions are now tail-call optimised by the compiler. The bare-arm port has its code size increase by 76 bytes, but stmhal drops by 904 bytes. Stack usage by these builtin functions is decreased by 48 bytes on Thumb2 archs.	8 years ago
Damien George	0c595fa094	py/objfun: Use if instead of switch to check return value of VM execute. It's simpler and improves code coverage.	8 years ago
Damien George	c71edaed73	py/objfun: Remove unnecessary check for viper fun with 5 or more args. The native emitter/compiler restricts viper functions to 4 args, so there is no need for an extra check in the dynamic dispatch.	8 years ago
Damien George	581a59a456	py: Rename struct mp_code_state to mp_code_state_t. Also at _t to mp_exc_stack pre-declaration in struct typedef.	8 years ago
Damien George	9a58316de2	py/objfun: Allow inline-asm functions to be called with 4 arguments.	9 years ago
Damien George	5f3e005b67	py: Extend native type-sig to use 4 bits, so uint is separate to ptr. Before this patch, the native types for uint and ptr/ptr8/ptr16/ptr32 all overlapped and it was possible to make a mistake in casting. Now, these types are all separate and any coding mistakes will be raised as runtime errors.	9 years ago
Damien George	8f54c08691	py/inlineasm: Add ability to specify return type of asm_thumb funcs. Supported return types are: object, bool, int, uint. For example: @micropython.asm_thumb def foo(r0, r1) -> uint: add(r0, r0, r1)	9 years ago

1 2 3 4

183 Commits (6841fecbb25ca547947e70c5441c2dbb8ec08dd1)