This commit provides a typedef for mp_rom_error_text_t, and a macro define
for MP_COMPRESSED_ROM_TEXT, when MICROPY_ROM_TEXT_COMPRESSION is disabled.
This simplifies the configuration (it no longer has a special case for
MICROPY_ENABLE_DYNRUNTIME) and makes it work for other cases that don't use
compression (eg examples/embedding). This commit also ensures
MICROPY_ROM_TEXT_COMPRESSION is defined during qstr processing.
The idea here is that there's a moderate amount of ROM used up by exception
text. Obviously we try to keep the messages short, and the code can enable
terse errors, but it still adds up. Listed below is the total string data
size for various ports:
bare-arm 2860
minimal 2876
stm32 8926 (PYBV11)
cc3200 3751
esp32 5721
This commit implements compression of these strings. It takes advantage of
the fact that these strings are all 7-bit ascii and extracts the top 128
frequently used words from the messages and stores them packed (dropping
their null-terminator), then uses (0x80 | index) inside strings to refer to
these common words. Spaces are automatically added around words, saving
more bytes. This happens transparently in the build process, mirroring the
steps that are used to generate the QSTR data. The MP_COMPRESSED_ROM_TEXT
macro wraps any literal string that should compressed, and it's
automatically decompressed in mp_decompress_rom_string.
There are many schemes that could be used for the compression, and some are
included in py/makecompresseddata.py for reference (space, Huffman, ngram,
common word). Results showed that the common-word compression gets better
results. This is before counting the increased cost of the Huffman
decoder. This might be slightly counter-intuitive, but this data is
extremely repetitive at a word-level, and the byte-level entropy coder
can't quite exploit that as efficiently. Ideally one would combine both
approaches, but for now the common-word approach is the one that is used.
For additional comparison, the size of the raw data compressed with gzip
and zlib is calculated, as a sort of proxy for a lower entropy bound. With
this scheme we come within 15% on stm32, and 30% on bare-arm (i.e. we use
x% more bytes than the data compressed with gzip -- not counting the code
overhead of a decoder, and how this would be hypothetically implemented).
The feature is disabled by default and can be enabled by setting
MICROPY_ROM_TEXT_COMPRESSION at the Makefile-level.
When loading a manifest file, e.g. by include(), it will chdir first to the
directory of that manifest. This means that all file operations within a
manifest are relative to that manifest's location.
As a consequence of this, additional environment variables are needed to
find absolute paths, so the following are added: $(MPY_LIB_DIR),
$(PORT_DIR), $(BOARD_DIR). And rename $(MPY) to $(MPY_DIR) to be
consistent.
Existing manifests are updated to match.
This introduces a new build variable FROZEN_MANIFEST which can be set to a
manifest listing (written in Python) that describes the set of files to be
frozen in to the firmware.
This ; make Windows compilation fail with GNU makefile 4.2.1. It was added
in 0dc85c9f86 as part of a shell if-
statement, but this if-statement was subsequently removed in
23a693ec2d so the semicolon is not needed.
The variable $(TOUCH) is initialized with the "touch" value in mkenv.mk
like for the other command line tools (rm, echo, cp, mkdir etc). With
this, for example, Windows users can specify the path of touch.exe.
This system makes it a lot easier to include external libraries as static,
native modules in MicroPython. Simply pass USER_C_MODULES (like
FROZEN_MPY_DIR) as a make parameter.
Instead of emitnative.c having configuration code for each supported
architecture, and then compiling this file multiple times with different
macros defined, this patch adds a file per architecture with the necessary
code to configure the native emitter. These files then #include the
emitnative.c file.
This simplifies emitnative.c (which is already very large), and simplifies
the build system because emitnative.c no longer needs special handling for
compilation and qstr extraction.
This target removes any stray files (i.e. something not committed to git)
from scripts/ and modules/ dirs (or whatever FROZEN_DIR and FROZEN_MPY_DIR
is set to).
The expected workflow is:
1. make clean-frozen
2. micropython -m upip -p modules <packages_to_freeze>
3. make
As it can be expected that people may drop random thing in those dirs which
they can miss later, the content is actually backed up before cleaning.
Rationale:
* Calling Python build tool scripts from makefiles should be done
consistently using `python </path/to/script>`, instead of relying on the
correct she-bang line in the script [1] and the executable bit on the
script being set. This is more platform-independent.
* The name/path of the Python executable should always be used via the
makefile variable `PYTHON` set in `py/mkenv.mk`. This way it can be
easily overwritten by the user with `make PYTHON=/path/to/my/python`.
* The Python executable name should be part of the value of the makefile
variable, which stands for the build tool command (e.g. `MAKE_FROZEN` and
`MPY_TOOL`), not part of the command line where it is used. If a Python
tool is substituted by another (non-python) program, no change to the
Makefiles is necessary, except in `py/mkenv.mk`.
* This also solves #3369 and #1616.
[1] There are systems, where even the assumption that `/usr/bin/env` always
exists, doesn't hold true, for example on Android (where otherwise the unix
port compiles perfectly well).
This reverts commit 3289b9b7a7.
The commit broke building on MINGW because the filename became
micropython.exe.exe. A proper solution to support more Windows build
environments requires more thought and testing.
Building mpy-cross: this patch adds .exe to the PROG name when building
executables for host (eg mpy-cross) on Windows. make clean now removes
mpy-cross.exe under Windows.
Building MicroPython: this patch sets MPY_CROSS to mpy-cross.exe or
mpy-cross so they can coexist and use cygwin or WSL without rebuilding
mpy-cross. The dependency in the mpy rule now uses mpy-cross.exe for
Windows and mpy-cross for Linux.
For make v3.81, using "make -B" can set $? to empty and in this case the
auto-qstr generation needs to pass all args (ie $^) to cpp. The previous
fix for this (which was removed in 23a693ec2d)
used if statements in the shell command, which gave very long lines that
didn't work on certain systems (eg cygwin).
The fix in this patch is to use an $if(...) expression, which will evaluate
to $? (only newer prerequisites) if it's non empty, otherwise it will use
$^ (all prerequisites).
This ensures that mpy-cross is automatically built (and is up-to-date) for
ports that use frozen bytecode. It also makes sure that .mpy files are
re-built if mpy-cross is changed.
Build happens in 3 stages:
1. Zephyr config header and make vars are generated from prj.conf.
2. libmicropython is built using them.
3. Zephyr is built and final link happens.
When make is passed "-B" it seems that everything is considered out-of-date
and so $? expands to all prerequisites. Thus there is no need for a
special check to see if $? is emtpy.
With this patch one can now do "make FROZEN_MPY_DIR=../../frozen" to
specify a directory containing scripts to be frozen (as well as absolute
paths).
The compiled .mpy files are now stored in $(BUILD)/frozen_mpy/.
Now, to use frozen bytecode all a port needs to do is define
FROZEN_MPY_DIR to the directory containing the .py files to freeze, and
define MICROPY_MODULE_FROZEN_MPY and MICROPY_QSTR_EXTRA_POOL.
E.g. for stmhal, accumulated preprocessed output may grow large due to
bloated vendor headers, and then reprocessing tens of megabytes on each
build make take couple of seconds on fast hardware (=> potentially dozens
of seconds on slow hardware). So instead, split once after each change,
and only cat repetitively (guaranteed to be fast, as there're thousands
of lines involved at most).
If make -B is run, the rule is run with $? empty. Extract fron all file in
this case. But this gets fragile, really "make clean" should be used instead
with such build complexity.
When there're C files to be (re)compiled, they're all passed first to
preprocessor. QSTR references are extracted from preprocessed output and
split per original C file. Then all available qstr files (including those
generated previously) are catenated together. Only if the resulting content
has changed, the output file is written (causing almost global rebuild
to pick up potentially renumbered qstr's). Otherwise, it's not updated
to not cause spurious rebuilds. Related make rules are split to minimize
amount of commands executed in the interim case (when some C files were
updated, but no qstrs were changed).