Before this change, long/mpz ints propagated into all future calculations,
even if their value could fit in a small-int object. With this change, the
result of a big-int binary op will now be converted to a small-int object
if the value fits in a small-int.
For example, a relatively common operation like `x = a * b // c` where
a,b,c all small ints would always result in a long/mpz int, even if it
didn't need to, and then this would impact all future calculations with
x.
This adds +24 bytes on PYBV11 but avoids heap allocations and potential
surprises (e.g. `big-big` is now a small `0`, and can safely be accessed
with MP_OBJ_SMALL_INT_VALUE).
Performance tests are unchanged on PYBV10, except for `bm_pidigits.py`
which makes heavy use of big-ints and gains about 8% in speed.
Unix coverage tests have been updated to cover mpz code that is now
unreachable by normal Python code (removing the unreachable code would lead
to some surprising gaps in the internal C functions and the functionality
may be needed in the future, so it is kept because it has minimal
overhead).
This work was funded through GitHub Sponsors.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Fixes and improvements to `int.to_bytes()` are:
- No longer overflows if byte size is 0 (closes#13041).
- Raises OverflowError in any case where number won't fit into byte length
(now matches CPython, previously MicroPython would return a truncated
bytes object).
- Document that `micropython int.to_bytes()` doesn't implement the optional
signed kwarg, but will behave as if `signed=True` when the integer is
negative (this is the current behaviour). Add tests for this also.
Requires changes for small ints, MPZ large ints, and "long long" large
ints.
Adds a new set of unit tests for ints between 32 and 64 bits to increase
coverage of "long long" large ints, which are otherwise untested.
Tested on unix port (64 bit small ints, MPZ long ints) and Zephyr STM32WB
board (32 bit small ints, long long large ints).
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
To be consistent with MP_UNARY_OP_INT_FLOAT and MP_UNARY_OP_INT_COMPLEX,
and allow int() to first check if a type supports __int__ before trying
other things (as per CPython).
Signed-off-by: Damien George <damien@micropython.org>
Commit d96cfd13e3 introduced a regression by breaking existing
users of mp_obj_is_type(.., &mp_obj_bool). This function (and associated
helpers like mp_obj_is_int()) have some specific nuances, and mistakes like
this one can happen again.
This commit adds mp_obj_is_exact_type() which behaves like the the old
mp_obj_is_type(). The new mp_obj_is_type() has the same prototype but it
attempts to statically assert that it's not called with types which should
be checked using mp_obj_is_type(). If called with any of these types: int,
str, bool, NoneType - it will cause a compilation error. Additional
checked types (e.g function types) can be added in the future.
Existing users of mp_obj_is_type() with the now "invalid" types, were
translated to use mp_obj_is_exact_type().
The use of MP_STATIC_ASSERT() is not bulletproof - usually GCC (and other
compilers) can't statically check conditions that are only known during
link-time (like variables' addresses comparison). However, in this case,
GCC is able to statically detect these conditions, probably because it's
the exact same object - `&mp_type_int == &mp_type_int` is detected.
Misuses of this function with runtime-chosen types (e.g:
`mp_obj_type_t *x = ...; mp_obj_is_type(..., x);` won't be detected. MSC
is unable to detect this, so we use MP_STATIC_ASSERT_NOT_MSC().
Compiling with this commit and without the fix for d96cfd13e3 shows
that it detects the problem.
Signed-off-by: Yonatan Goldschmidt <yon.goldschmidt@gmail.com>
This replaces occurences of
foo_t *foo = m_new_obj(foo_t);
foo->base.type = &foo_type;
with
foo_t *foo = mp_obj_malloc(foo_t, &foo_type);
Excludes any places where base is a sub-field or when new0/memset is used.
Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
These macros could in principle be (inline) functions so it makes sense to
have them lower case, to match the other C API functions.
The remaining macros that are upper case are:
- MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR
- MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE
- MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE
- MP_OBJ_FUN_MAKE_SIG
- MP_DECLARE_CONST_xxx
- MP_DEFINE_CONST_xxx
These must remain macros because they are used when defining const data (at
least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have
MP_OBJ_SMALL_INT_VALUE also a macro).
For those macros that have been made lower case, compatibility macros are
provided for the old names so that users do not need to change their code
immediately.
The LHS passed to mp_obj_int_binary_op() will always be an integer, either
a small int or a big int, so the test for this type doesn't need to include
an "other, unsupported type" case.
Header files that are considered internal to the py core and should not
normally be included directly are:
py/nlr.h - internal nlr configuration and declarations
py/bc0.h - contains bytecode macro definitions
py/runtime0.h - contains basic runtime enums
Instead, the top-level header files to include are one of:
py/obj.h - includes runtime0.h and defines everything to use the
mp_obj_t type
py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h,
and defines everything to use the general runtime support functions
Additional, specific headers (eg py/objlist.h) can be included if needed.
This allows user classes to implement __abs__ special method, and saves
code size (104 bytes for x86_64), even though during refactor, an issue
was fixed and few optimizations were made:
* abs() of minimum (negative) small int value is calculated properly.
* objint_longlong and objint_mpz avoid allocating new object is the
argument is already non-negative.
The unary-op/binary-op enums are already defined, and there are no
arithmetic tricks used with these types, so it makes sense to use the
correct enum type for arguments that take these values. It also reduces
code size quite a bit for nan-boxing builds.
The standard preprocessor definition to differentiate debug and non-debug
builds is NDEBUG, not DEBUG, so don't rely on the latter:
- just delete the use of it in objint_longlong.c as it has been stale code
for years anyway (since commit [c4029e5]): SUFFIX isn't used anywhere.
- replace DEBUG with MICROPY_DEBUG_NLR in nlr.h: it is rarely used anymore
so can be off by default
Hashing is now done using mp_unary_op function with MP_UNARY_OP_HASH as
the operator argument. Hashing for int, str and bytes still go via
fast-path in mp_unary_op since they are the most common objects which
need to be hashed.
This lead to quite a bit of code cleanup, and should be more efficient
if anything. It saves 176 bytes code space on Thumb2, and 360 bytes on
x86.
The only loss is that the error message "unhashable type" is now the
more generic "unsupported type for __hash__".
This fixes conversion when float type has more mantissa bits than small int,
and float value has small exponent. This is for example the case of 32-bit
platform using doubles, and converting value of time.time(). Conversion of
floats with larg exponnet is still not handled correctly.
mp_obj_int_get_truncated is used as a "fast path" int accessor that
doesn't check for overflow and returns the int truncated to the machine
word size, ie mp_int_t.
Use mp_obj_int_get_truncated to fix struct.pack when packing maximum word
sized values.
Addresses issues #779 and #998.
Type representing signed size doesn't have to be int, so use special value
which defaults to SSIZE_MAX, but as it's not defined by C standard (but rather
by POSIX), allow ports to set it.