Before this change, long/mpz ints propagated into all future calculations,
even if their value could fit in a small-int object. With this change, the
result of a big-int binary op will now be converted to a small-int object
if the value fits in a small-int.
For example, a relatively common operation like `x = a * b // c` where
a,b,c all small ints would always result in a long/mpz int, even if it
didn't need to, and then this would impact all future calculations with
x.
This adds +24 bytes on PYBV11 but avoids heap allocations and potential
surprises (e.g. `big-big` is now a small `0`, and can safely be accessed
with MP_OBJ_SMALL_INT_VALUE).
Performance tests are unchanged on PYBV10, except for `bm_pidigits.py`
which makes heavy use of big-ints and gains about 8% in speed.
Unix coverage tests have been updated to cover mpz code that is now
unreachable by normal Python code (removing the unreachable code would lead
to some surprising gaps in the internal C functions and the functionality
may be needed in the future, so it is kept because it has minimal
overhead).
This work was funded through GitHub Sponsors.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Fixes and improvements to `int.to_bytes()` are:
- No longer overflows if byte size is 0 (closes#13041).
- Raises OverflowError in any case where number won't fit into byte length
(now matches CPython, previously MicroPython would return a truncated
bytes object).
- Document that `micropython int.to_bytes()` doesn't implement the optional
signed kwarg, but will behave as if `signed=True` when the integer is
negative (this is the current behaviour). Add tests for this also.
Requires changes for small ints, MPZ large ints, and "long long" large
ints.
Adds a new set of unit tests for ints between 32 and 64 bits to increase
coverage of "long long" large ints, which are otherwise untested.
Tested on unix port (64 bit small ints, MPZ long ints) and Zephyr STM32WB
board (32 bit small ints, long long large ints).
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
The STATIC macro was introduced a very long time ago in commit
d5df6cd44a. The original reason for this was
to have the option to define it to nothing so that all static functions
become global functions and therefore visible to certain debug tools, so
one could do function size comparison and other things.
This STATIC feature is rarely (if ever) used. And with the use of LTO and
heavy inline optimisation, analysing the size of individual functions when
they are not static is not a good representation of the size of code when
fully optimised.
So the macro does not have much use and it's simpler to just remove it.
Then you know exactly what it's doing. For example, newcomers don't have
to learn what the STATIC macro is and why it exists. Reading the code is
also less "loud" with a lowercase static.
One other minor point in favour of removing it, is that it stops bugs with
`STATIC inline`, which should always be `static inline`.
Methodology for this commit was:
1) git ls-files | egrep '\.[ch]$' | \
xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/"
2) Do some manual cleanup in the diff by searching for the word STATIC in
comments and changing those back.
3) "git-grep STATIC docs/", manually fixed those cases.
4) "rg -t python STATIC", manually fixed codegen lines that used STATIC.
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
To be consistent with MP_UNARY_OP_INT_FLOAT and MP_UNARY_OP_INT_COMPLEX,
and allow int() to first check if a type supports __int__ before trying
other things (as per CPython).
Signed-off-by: Damien George <damien@micropython.org>
Commit d96cfd13e3 introduced a regression by breaking existing
users of mp_obj_is_type(.., &mp_obj_bool). This function (and associated
helpers like mp_obj_is_int()) have some specific nuances, and mistakes like
this one can happen again.
This commit adds mp_obj_is_exact_type() which behaves like the the old
mp_obj_is_type(). The new mp_obj_is_type() has the same prototype but it
attempts to statically assert that it's not called with types which should
be checked using mp_obj_is_type(). If called with any of these types: int,
str, bool, NoneType - it will cause a compilation error. Additional
checked types (e.g function types) can be added in the future.
Existing users of mp_obj_is_type() with the now "invalid" types, were
translated to use mp_obj_is_exact_type().
The use of MP_STATIC_ASSERT() is not bulletproof - usually GCC (and other
compilers) can't statically check conditions that are only known during
link-time (like variables' addresses comparison). However, in this case,
GCC is able to statically detect these conditions, probably because it's
the exact same object - `&mp_type_int == &mp_type_int` is detected.
Misuses of this function with runtime-chosen types (e.g:
`mp_obj_type_t *x = ...; mp_obj_is_type(..., x);` won't be detected. MSC
is unable to detect this, so we use MP_STATIC_ASSERT_NOT_MSC().
Compiling with this commit and without the fix for d96cfd13e3 shows
that it detects the problem.
Signed-off-by: Yonatan Goldschmidt <yon.goldschmidt@gmail.com>
This replaces occurences of
foo_t *foo = m_new_obj(foo_t);
foo->base.type = &foo_type;
with
foo_t *foo = mp_obj_malloc(foo_t, &foo_type);
Excludes any places where base is a sub-field or when new0/memset is used.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This string is recognised by uncrustify, to disable formatting in the
region marked by these comments. This is necessary in the qstrdef*.h files
to prevent modification of the strings within the Q(...). In other places
it is used to prevent excessive reformatting that would make the code less
readable.
Can be used where mp_obj_int_get_checked() will overflow due to the
sign-bit solely. This returns an mp_uint_t, so it also verifies the given
integer is not negative.
Currently implemented only for mpz configurations.
Prior to this commit, building the unix port with `DEBUG=1` and
`-finstrument-functions` the compilation would fail with an error like
"control reaches end of non-void function". This change fixes this by
removing the problematic "if (0)" branches. Not all branches affect
compilation, but they are all removed for consistency.
These macros could in principle be (inline) functions so it makes sense to
have them lower case, to match the other C API functions.
The remaining macros that are upper case are:
- MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR
- MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE
- MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE
- MP_OBJ_FUN_MAKE_SIG
- MP_DECLARE_CONST_xxx
- MP_DEFINE_CONST_xxx
These must remain macros because they are used when defining const data (at
least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have
MP_OBJ_SMALL_INT_VALUE also a macro).
For those macros that have been made lower case, compatibility macros are
provided for the old names so that users do not need to change their code
immediately.
The LHS passed to mp_obj_int_binary_op() will always be an integer, either
a small int or a big int, so the test for this type doesn't need to include
an "other, unsupported type" case.
Before this patch MP_BINARY_OP_IN had two meanings: coming from bytecode it
meant that the args needed to be swapped, but coming from within the
runtime meant that the args were already in the correct order. This lead
to some confusion in the code and comments stating how args were reversed.
It also lead to 2 bugs: 1) containment for a subclass of a native type
didn't work; 2) the expression "{True} in True" would illegally succeed and
return True. In both of these cases it was because the args to
MP_BINARY_OP_IN ended up being reversed twice.
To fix these things this patch introduces MP_BINARY_OP_CONTAINS which
corresponds exactly to the __contains__ special method, and this is the
operator that built-in types should implement. MP_BINARY_OP_IN is now only
emitted by the compiler and is converted to MP_BINARY_OP_CONTAINS by
swapping the arguments.
Header files that are considered internal to the py core and should not
normally be included directly are:
py/nlr.h - internal nlr configuration and declarations
py/bc0.h - contains bytecode macro definitions
py/runtime0.h - contains basic runtime enums
Instead, the top-level header files to include are one of:
py/obj.h - includes runtime0.h and defines everything to use the
mp_obj_t type
py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h,
and defines everything to use the general runtime support functions
Additional, specific headers (eg py/objlist.h) can be included if needed.
This allows user classes to implement __abs__ special method, and saves
code size (104 bytes for x86_64), even though during refactor, an issue
was fixed and few optimizations were made:
* abs() of minimum (negative) small int value is calculated properly.
* objint_longlong and objint_mpz avoid allocating new object is the
argument is already non-negative.
This is to allow to place reverse ops immediately after normal ops, so
they can be tested as one range (which is optimization for reverse ops
introduction in the next patch).
The unary-op/binary-op enums are already defined, and there are no
arithmetic tricks used with these types, so it makes sense to use the
correct enum type for arguments that take these values. It also reduces
code size quite a bit for nan-boxing builds.
Updated modbuiltin.c to add conditional support for 3-arg calls to
pow() using MICROPY_PY_BUILTINS_POW3 config parameter. Added support in
objint_mpz.c for for optimised implementation.
If result guaranteedly fits in a small int, it is handled in objint.c.
Otherwise, it is delegated to mp_obj_int_from_bytes_impl(), which should
be implemented by individual objint_*.c, similar to
mp_obj_int_to_bytes_impl().
This allows the mp_obj_t type to be configured to something other than a
pointer-sized primitive type.
This patch also includes additional changes to allow the code to compile
when sizeof(mp_uint_t) != sizeof(void*), such as using size_t instead of
mp_uint_t, and various casts.
When creating constant mpz's, the length of the mpz must be exactly how
many digits are used (not allocated) otherwise these numbers are not
compatible with dynamically allocated numbers.
Addresses issue #1448.
Hashing is now done using mp_unary_op function with MP_UNARY_OP_HASH as
the operator argument. Hashing for int, str and bytes still go via
fast-path in mp_unary_op since they are the most common objects which
need to be hashed.
This lead to quite a bit of code cleanup, and should be more efficient
if anything. It saves 176 bytes code space on Thumb2, and 360 bytes on
x86.
The only loss is that the error message "unhashable type" is now the
more generic "unsupported type for __hash__".