Fixes and improvements to `int.to_bytes()` are:
- No longer overflows if byte size is 0 (closes#13041).
- Raises OverflowError in any case where number won't fit into byte length
(now matches CPython, previously MicroPython would return a truncated
bytes object).
- Document that `micropython int.to_bytes()` doesn't implement the optional
signed kwarg, but will behave as if `signed=True` when the integer is
negative (this is the current behaviour). Add tests for this also.
Requires changes for small ints, MPZ large ints, and "long long" large
ints.
Adds a new set of unit tests for ints between 32 and 64 bits to increase
coverage of "long long" large ints, which are otherwise untested.
Tested on unix port (64 bit small ints, MPZ long ints) and Zephyr STM32WB
board (32 bit small ints, long long large ints).
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
The STATIC macro was introduced a very long time ago in commit
d5df6cd44a. The original reason for this was
to have the option to define it to nothing so that all static functions
become global functions and therefore visible to certain debug tools, so
one could do function size comparison and other things.
This STATIC feature is rarely (if ever) used. And with the use of LTO and
heavy inline optimisation, analysing the size of individual functions when
they are not static is not a good representation of the size of code when
fully optimised.
So the macro does not have much use and it's simpler to just remove it.
Then you know exactly what it's doing. For example, newcomers don't have
to learn what the STATIC macro is and why it exists. Reading the code is
also less "loud" with a lowercase static.
One other minor point in favour of removing it, is that it stops bugs with
`STATIC inline`, which should always be `static inline`.
Methodology for this commit was:
1) git ls-files | egrep '\.[ch]$' | \
xargs sed -Ei "s/(^| )STATIC($| )/\1static\2/"
2) Do some manual cleanup in the diff by searching for the word STATIC in
comments and changing those back.
3) "git-grep STATIC docs/", manually fixed those cases.
4) "rg -t python STATIC", manually fixed codegen lines that used STATIC.
This work was funded through GitHub Sponsors.
Signed-off-by: Angus Gratton <angus@redyak.com.au>
This commit makes sure that the value zero is always encoded in an mpz_t as
neg=0 and len=0 (previously it was just len=0).
This invariant is needed for some of the bitwise operations that operate on
negative numbers, because they cannot handle -0. For example
(-((1<<100)-(1<<100)))|1 was being computed as -65535, instead of 1.
Fixes issue #8042.
Signed-off-by: Damien George <damien@micropython.org>
Fixes the following (the line numbers match commit 0e87459e2b):
../../extmod/crypto-algorithms/sha256.c:49:19: runtime error: left shif...
../../extmod/moduasyncio.c:106:35: runtime error: member access within ...
../../py/binary.c:210:13: runtime error: left shift of negative value -...
../../py/mpz.c:744:16: runtime error: negation of -9223372036854775808 ...
../../py/objint.c:109:22: runtime error: left shift of 1 by 31 places c...
../../py/objint_mpz.c:374:9: runtime error: left shift of 4611686018427...
../../py/objint_mpz.c:374:9: runtime error: left shift of negative valu...
../../py/parsenum.c:106:14: runtime error: left shift of 46116860184273...
../../py/runtime.c:395:33: runtime error: left shift of negative value ...
../../py/showbc.c:177:28: runtime error: left shift of negative value -...
../../py/vm.c:321:36: runtime error: left shift of negative value -1```
Testing was done on an amd64 Debian Buster system using gcc-8.3 and these
settings:
CFLAGS += -g3 -Og -fsanitize=undefined
LDFLAGS += -fsanitize=undefined
The introduced TASK_PAIRHEAP macro's conditional (x ? &x->i : NULL)
assembles (under amd64 gcc 8.3 -Os) to the same as &x->i, since i is the
initial field of the struct. However, for the purposes of undefined
behavior analysis the conditional is needed.
Signed-off-by: Jeff Epler <jepler@gmail.com>
For certain operands to mpn_div, the existing code path for
`DIG_SIZE == MPZ_DBL_DIG_SIZE / 2` had a bug in it where borrow could still
overflow in the `(x >= *n || *n - x <= borrow)` branch, ie
`borrow + x - (mpz_dbl_dig_t)*n` overflows the borrow variable. In such
cases the subsequent right-shift of borrow would not bring in the overflow
bit, leading to an error in the result. An example division that had
overflow when MPZ_DIG_SIZE = 16 is `(2 ** 48 - 1) ** 2 // (2 ** 48 - 1)`.
This is fixed in this commit by simplifying the code and handling the low
digits of borrow first, and then the upper bits (to shift down) separately.
There is no longer a distinction between `DIG_SIZE < MPZ_DBL_DIG_SIZE / 2`
and `DIG_SIZE == MPZ_DBL_DIG_SIZE / 2`.
This commit also simplifies the second part of the calculation so that
borrow does not need to be negated (instead the code just works knowing
that borrow is negative and using + instead of - in calculations involving
borrow).
Fixes#6777.
Signed-off-by: Damien George <damien@micropython.org>
Note: the uncrustify configuration is explicitly set to 'add' instead of
'force' in order not to alter the comments which use extra spaces after //
as a means of indenting text for clarity.
Before this, ubsan would detect a problem when executing
hash(006699999999999999999999999999999999999999999999999999999999999999999999)
../../py/mpz.c:1539:20: runtime error: left shift of 1067371580458 by
32 places cannot be represented in type 'mp_int_t' (aka 'long')
When the overflow does occur it now happens as defined by the rules of
unsigned arithmetic.
This path for src->deg==NULL is never used because mpz_clone() is always
called with an argument that has a non-zero integer value, and hence has
some digits allocated to it (mpz_clone() is a static function private to
mpz.c all callers of this function first check if the integer value is zero
and if so take a special-case path, bypassing the call to mpz_clone()).
There is some unused and commented-out functions that may actually pass a
zero-valued mpz to mpz_clone(), so some TODOs are added to these function
in case they are needed in the future.
There are two checks that are always false so can be converted to (negated)
assertions to save code space and execution time. They are:
1. The check of the str parameter, which is required to be non-NULL as per
the original comment that it has enough space in it as calculated by
mp_int_format_size. And for all uses of this function str is indeed
non-NULL.
2. The check of the base parameter, which is already required to be between
2 and 16 (inclusive) via the assertion in mp_int_format_size.
The motivation behind this patch is to remove unreachable code in mpn_div.
This unreachable code was added some time ago in
9a21d2e070, when a loop in mpn_div was copied
and adjusted to work when mpz_dig_t was exactly half of the size of
mpz_dbl_dig_t (a common case). The loop was copied correctly but it wasn't
noticed at the time that the final part of the calculation of num-quo*den
could be optimised, and hence unreachable code was left for a case that
never occurred.
The observation for the optimisation is that the initial value of quo in
mpn_div is either exact or too large (never too small), and therefore the
subtraction of quo*den from num may subtract exactly enough or too much
(but never too little). Using this observation the part of the algorithm
that handles the borrow value can be simplified, and most importantly this
eliminates the unreachable code.
The new code has been tested with DIG_SIZE=3 and DIG_SIZE=4 by dividing all
possible combinations of non-negative integers with between 0 and 3
(inclusive) mpz digits.
Updated modbuiltin.c to add conditional support for 3-arg calls to
pow() using MICROPY_PY_BUILTINS_POW3 config parameter. Added support in
objint_mpz.c for for optimised implementation.
Previous to this patch bignum division and modulo would temporarily
modify the RHS argument to the operation (eg x/y would modify y), but on
return the RHS would be restored to its original value. This is not
allowed because arguments to binary operations are const, and in
particular might live in ROM. The modification was to normalise the arg
(and then unnormalise before returning), and this patch makes it so the
normalisation is done on the fly and the arg is now accessed as read-only.
This change doesn't increase the order complexity of the operation, and
actually reduces code size.
When DIG_SIZE=32, a uint32_t is used to store limbs, and no normalisation
is needed because the MSB is already set, then there will be left and
right shifts (in C) by 32 of a 32-bit variable, leading to undefined
behaviour. This patch fixes this bug.
This function computes (x**y)%z in an efficient way. For large arguments
this operation is otherwise not computable by doing x**y and then %z.
It's currently not used, but is added in case it's useful one day.
For these 3 bitwise operations there are now fast functions for
positive-only arguments, and general functions for arbitrary sign
arguments (the fast functions are the existing implementation).
By default the fast functions are not used (to save space) and instead
the general functions are used for all operations.
Enable MICROPY_OPT_MPZ_BITWISE to use the fast functions for positive
arguments.