Effect measured on esp8266 port:
Before:
>>> pystone_lowmem.main(10000)
Pystone(1.2) time for 10000 passes = 44214 ms
This machine benchmarks at 226 pystones/second
>>> pystone_lowmem.main(10000)
Pystone(1.2) time for 10000 passes = 44246 ms
This machine benchmarks at 226 pystones/second
After:
>>> pystone_lowmem.main(10000)
Pystone(1.2) time for 10000 passes = 44343ms
This machine benchmarks at 225 pystones/second
>>> pystone_lowmem.main(10000)
Pystone(1.2) time for 10000 passes = 44376ms
This machine benchmarks at 225 pystones/second
vstr_null_terminated_str is almost certainly a vstr finalization operation,
so it should add the requested NUL byte, and not try to pre-allocate more.
The previous implementation could actually allocate double of the buffer
size.
I checked the entire codebase, and every place that vstr_init_len
was called, there was a call to mp_obj_new_str_from_vstr after it.
mp_obj_new_str_from_vstr always tries to reallocate a new buffer
1 byte larger than the original to store the terminating null
character.
In many cases, if we allocated the initial buffer to be 1 byte
longer, we can prevent this extra allocation, and just reuse
the originally allocated buffer.
Asking to read 256 bytes and only getting 100 will still cause
the extra allocation, but if you ask to read 256 and get 256
then the extra allocation will be optimized away.
Yes - the reallocation is optimized in the heap to try and reuse
the buffer if it can, but it takes quite a few cycles to figure
this out.
Note by Damien: vstr_init_len should now be considered as a
string-init convenience function and used only when creating
null-terminated objects.
Previous to this patch the printing mechanism was a bit of a tangled
mess. This patch attempts to consolidate printing into one interface.
All (non-debug) printing now uses the mp_print* family of functions,
mainly mp_printf. All these functions take an mp_print_t structure as
their first argument, and this structure defines the printing backend
through the "print_strn" function of said structure.
Printing from the uPy core can reach the platform-defined print code via
two paths: either through mp_sys_stdout_obj (defined pert port) in
conjunction with mp_stream_write; or through the mp_plat_print structure
which uses the MP_PLAT_PRINT_STRN macro to define how string are printed
on the platform. The former is only used when MICROPY_PY_IO is defined.
With this new scheme printing is generally more efficient (less layers
to go through, less arguments to pass), and, given an mp_print_t*
structure, one can call mp_print_str for efficiency instead of
mp_printf("%s", ...). Code size is also reduced by around 200 bytes on
Thumb2 archs.
This cleans up vstr so that it's a pure "variable buffer", and the user
can decide whether they need to add a terminating null byte. In most
places where vstr is used, the vstr did not need to be null terminated
and so this patch saves code size, a tiny bit of RAM, and makes vstr
usage more efficient. When null termination is needed it must be
done explicitly using vstr_null_terminate.
With this patch str/bytes construction is streamlined. Always use a
vstr to build a str/bytes object. If the size is known beforehand then
use vstr_init_len to allocate only required memory. Otherwise use
vstr_init and the vstr will grow as needed. Then use
mp_obj_new_str_from_vstr to create a str/bytes object using the vstr
memory.
Saves code ROM: 68 bytes on stmhal, 108 bytes on bare-arm, and 336 bytes
on unix x64.
This patch allows to reuse vstr memory when creating str/bytes object.
This improves memory usage.
Also saves code ROM: 128 bytes on stmhal, 92 bytes on bare-arm, and 88
bytes on unix x64.
It seems most sensible to use size_t for measuring "number of bytes" in
malloc and vstr functions (since that's what size_t is for). We don't
use mp_uint_t because malloc and vstr are not Micro Python specific.
Blanket wide to all .c and .h files. Some files originating from ST are
difficult to deal with (license wise) so it was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
vstr is initially intended to deal with arbitrary-length strings. By
providing a bit lower-level API calls, it will be also useful to deal
with arbitrary-length I/O buffers (the difference from strings is that
buffers are filled from "outside", via I/O).
Another issue, especially aggravated by I/O buffer use, is alloc size
vs actual size length. If allocated 1Mb for buffer, but actually
read 1 byte, we don't want to keep rest of 1Mb be locked by this I/O
result, but rather return it to heap ASAP ("shrink" buffer before passing
it to qstr_from_str_take()).