
Discussion on double-to-fastint conversion check

v1.0-maintenance
Sami Vaarala 10 years ago
parent
commit
483a82ce9b
  1. 269
      doc/tagged-integer-type.rst

@ -199,3 +199,272 @@ In addition to these, user code may have some practical dependencies, such as:
Signed 41.6 fixed point provides a fractional increment of 0.015625;
for the scheduler, this would mean about 15.6ms resolution, which is not
that great.
Efficient check for double-to-fastint conversion
================================================
Criteria
--------
For an IEEE double to be representable as a fast integer, it must be:
* A whole number
* In the signed 48-bit range, i.e. [-2^47, 2^47 - 1]
* Not a negative zero, assuming that the integer zero is taken to represent
a positive zero
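
The three criteria can be written down directly as a slow reference check, which is useful for testing faster variants against. This is a minimal sketch, not Duktape code: the name ``is_fastint_reference`` is hypothetical, and it assumes ``double`` is a 64-bit IEEE double.

```c
#include <stdint.h>
#include <string.h>

/* Reference (slow) check: returns 1 if d is a whole number in the range
 * [-2^47, 2^47 - 1] and is not negative zero, 0 otherwise.
 * Assumption: double is a 64-bit IEEE double.
 */
static int is_fastint_reference(double d) {
    int64_t i;
    uint64_t bits;

    /* Range check first; written so that NaN fails it too. */
    if (!(d >= -140737488355328.0 && d <= 140737488355327.0)) {
        return 0;
    }

    /* Whole number check; the cast is safe because d is in range. */
    i = (int64_t) d;
    if ((double) i != d) {
        return 0;
    }

    /* Negative zero compares equal to 0.0, so inspect the raw bits. */
    memcpy(&bits, &d, sizeof(bits));
    return bits != 0x8000000000000000ULL;
}
```

The range check comes first so that the integer cast never sees an out-of-range value or a NaN.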
What to optimize for
--------------------
This algorithm is needed when Duktape:
* Parses a number and checks whether to represent the number as a double or
a fastint
* Executes internal code with no fastint handling; in this case any fastint
inputs are first coerced to doubles and then back to fastints if the result
fits
* Executes internal code with fastint handling, with one or more of the
inputs not matching the fastint "fast path" but the result possibly fitting
into a fastint
The "fast path" for fastint operations doesn't execute this algorithm because
both inputs and outputs are fastints and Duktape detects this in the fast path
preconditions. Given this, an aggressive memory-speed tradeoff (e.g. a table
for each exponent) doesn't make sense.
The speed of this algorithm affects two scenarios:
1. Computations where the numbers involved are outside the fastint range. Here
it's important to quickly determine that a fastint representation is not
possible.
2. Computations where the numbers can be represented as fastints (at least some
of the time), but one or more operations don't have a fastint "fast path" so
that the numbers get upgraded to an IEEE double and then need to be downgraded
back to a fastint.
Both cases matter, but for typical embedded code the latter case matters more.
In other words, the code should be optimized for the case where a fastint fit
is possible.
Exponent and sign by cases
--------------------------
An IEEE double has a sign (1 bit), an exponent (11 bits), and a 52-bit stored
mantissa. The mantissa has an implicit (not stored) leading '1' digit, except
for zeroes, denormals, NaNs, and infinities.
Going through the possible exponent values:
* If exponent is 0:
- The number is a fastint only if the sign bit is zero (positive) and the
entire mantissa is all zeroes. This corresponds to +0.
- If the mantissa is non-zero, the number is a denormal.
* If the exponent is in the range [1, 1022] the number is not a fastint
because the implicit leading mantissa bit then has a weight of at most 0.5,
so the value is a non-zero fraction with magnitude below 1.
* If exponent is exactly 1023:
- The number is only a fastint if the stored mantissa is all zeroes.
This corresponds to +/- 1.
* If exponent is exactly 1024:
- The number is only a fastint if 51 lowest bits of the mantissa are all
zeroes. This corresponds to the numbers +/- 2 and +/- 3.
* Generalizing, if the exponent is in the range [1023,1069], the number is
a fastint if and only if:
- The lowest N bits of the mantissa are zero, where N = 52 - (exp - 1023),
with either sign.
- N can also be expressed as: N = 1075 - exp.
* If exponent is exactly 1070:
- The number is only a fastint if the sign bit is set (negative) and the
stored mantissa is all zeroes. This corresponds to -2^47. The positive
counterpart +2^47 does not fit into the fastint range.
* If exponent is [1071,2047] the number is never a fastint:
- For exponents [1071,2046] the number is too large to be a fastint.
- For exponent 2047 the number is a NaN or infinity depending on the
mantissa contents, neither a valid fastint.
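
To experiment with the cases above, the three fields can be extracted from a double as follows. This is only a sketch: ``struct double_fields`` and ``split_double`` are hypothetical names, and the code assumes a 64-bit IEEE double whose byte order matches that of ``uint64_t``.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type for illustration, not part of Duktape. */
struct double_fields {
    unsigned int sgn;   /* 1 bit */
    unsigned int exp;   /* 11 bits, biased by 1023 */
    uint64_t mant;      /* 52 stored bits, implicit leading 1 not included */
};

/* Split a double into sign, biased exponent, and stored mantissa.
 * Assumption: double and uint64_t have the same size and byte order.
 */
static struct double_fields split_double(double d) {
    struct double_fields f;
    uint64_t bits;

    memcpy(&bits, &d, sizeof(bits));
    f.sgn = (unsigned int) (bits >> 63);
    f.exp = (unsigned int) ((bits >> 52) & 0x7ffU);
    f.mant = bits & 0x000fffffffffffffULL;
    return f;
}
```

For example, 3.0 splits into sign 0, exponent 1024, and mantissa 0x0008000000000000, so its lowest 51 mantissa bits are zero as the exponent 1024 case requires.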
Pseudocode 1
------------
The algorithm::

    is_fastint(sgn, exp, mant):
        if exp == 0:
            return sgn == 0 and mzero(mant, 52)
        else if exp < 1023:
            return false
        else if exp < 1070:
            return mzero(mant, 1075 - exp)
        else if exp == 1070:
            return sgn == 1 and mzero(mant, 52)
        else:
            return false

The ``mzero`` helper predicate returns true if the mantissa given has its
lowest ``n`` bits zero.
Non-zero integers in the fastint range will fall into the case where a certain
computed number of low mantissa bits must be checked to be zero. As discussed
above, the algorithm should be optimized for the "input fits fastint" case.
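
As a cross-check, Pseudocode 1 can be transcribed almost literally into C. This is a sketch for experimentation rather than an optimized implementation; ``is_fastint_fields`` is a hypothetical name, and the inputs are assumed to be the already separated sign bit, biased exponent, and 52-bit stored mantissa.

```c
#include <stdint.h>

/* Returns 1 if the lowest n bits of mant are zero (1 <= n <= 52). */
static int mzero(uint64_t mant, int n) {
    uint64_t mask = (((uint64_t) 1) << n) - 1;
    return (mant & mask) == 0;
}

/* Literal transcription of Pseudocode 1.
 * sgn: sign bit (0 or 1), exp: biased exponent (0..2047),
 * mant: 52-bit stored mantissa.
 */
static int is_fastint_fields(unsigned int sgn, unsigned int exp, uint64_t mant) {
    if (exp == 0) {
        return sgn == 0 && mzero(mant, 52);       /* only +0 */
    } else if (exp < 1023) {
        return 0;                                 /* magnitude below 1 */
    } else if (exp < 1070) {
        return mzero(mant, 1075 - (int) exp);     /* common case */
    } else if (exp == 1070) {
        return sgn == 1 && mzero(mant, 52);       /* only -2^47 */
    } else {
        return 0;                                 /* too large, NaN, inf */
    }
}
```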
Pseudocode 2
------------
Some rewriting::

    is_fastint(sgn, exp, mant):
        nzero = 1075 - exp
        if nzero >= 6 and nzero <= 52:  // exp 1069 ... exp 1023
            // exponents 1023 to 1069: regular handling, common case
            return mzero(mant, nzero)
        else if nzero == 1075:
            // exponent 0: irregular handling, but still common (positive zero)
            return sgn == 0 and mzero(mant, 52)
        else if nzero == 5:
            // exponent 1070: irregular handling, rare case
            return sgn == 1 and mzero(mant, 52)
        else:
            // exponents [1,1022] and [1071,2047], rare case
            return false

C algorithm with a lookup table
-------------------------------
The common case ``nzero`` values are between [6, 52] and correspond to
mantissa masks. Compute a mask index instead as nzero - 6 = 1069 - exp::

    duk_uint64_t mzero_masks[47] = {
        0x000000000000003fULL,  /* exp 1069, nzero 6 */
        0x000000000000007fULL,  /* exp 1068, nzero 7 */
        0x00000000000000ffULL,  /* exp 1067, nzero 8 */
        0x00000000000001ffULL,  /* exp 1066, nzero 9 */
        /* ... */
        0x0003ffffffffffffULL,  /* exp 1025, nzero 50 */
        0x0007ffffffffffffULL,  /* exp 1024, nzero 51 */
        0x000fffffffffffffULL,  /* exp 1023, nzero 52 */
    };

    /* 'd' holds the raw 64 bits of the IEEE double. */
    int is_fastint(duk_int64_t d) {
        int exp = (d >> 52) & 0x07ff;
        int idx = 1069 - exp;

        if (idx >= 0 && idx <= 46) {  /* exponents 1069 to 1023 */
            /* Masks only cover mantissa bits, so 'd' can be used as is. */
            return (mzero_masks[idx] & d) == 0;
        } else if (idx == 1069) {  /* exponent 0 */
            return (d >= 0) && ((d & 0x000fffffffffffffULL) == 0);
        } else if (idx == -1) {  /* exponent 1070 */
            return (d < 0) && ((d & 0x000fffffffffffffULL) == 0);
        } else {
            return 0;
        }
    }

The memory cost of the mask table is 8x47 = 376 bytes. This can be halved
e.g. by using a table of 32-bit values with separate cases for nzero >= 32
and nzero < 32.
Unfortunately the expected case (exponents 1023 to 1069) involves a mask
check with a variable mask, so it may be unsuitable for direct inlining in
the most important hot spots.
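
The entries of the mask table above follow a simple pattern: the entry at index ``idx`` has its lowest ``idx + 6`` bits set. A sketch that generates the same values at startup is shown below; it could also be used to cross-check a hand-written table. The names ``mzero_masks_generated`` and ``init_mzero_masks`` are hypothetical, and ``uint64_t`` stands in for ``duk_uint64_t``.

```c
#include <stdint.h>

#define MZERO_MASK_COUNT 47  /* exponents 1069 down to 1023 */

/* Hypothetical runtime-generated equivalent of the mask table. */
static uint64_t mzero_masks_generated[MZERO_MASK_COUNT];

/* Fill the table so that entry idx (= 1069 - exp) has its lowest
 * nzero = idx + 6 bits set.
 */
static void init_mzero_masks(void) {
    int idx;

    for (idx = 0; idx < MZERO_MASK_COUNT; idx++) {
        int nzero = idx + 6;
        mzero_masks_generated[idx] = (((uint64_t) 1) << nzero) - 1;
    }
}
```

Generating the table moves the 376 bytes from read-only data to RAM, so on memory-constrained targets a constant table or the computed mask is likely the better fit.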
C algorithm with a computed mask
--------------------------------
Since this algorithm only runs outside the proper fastint "fast path" it
may be more sensible to avoid a memory tradeoff and compute the masks::

    /* 'd' holds the raw 64 bits of the IEEE double. */
    int is_fastint(duk_int64_t d) {
        int exp = (d >> 52) & 0x07ff;
        int shift = exp - 1023;

        if (shift >= 0 && shift <= 46) {  /* exponents 1023 to 1069 */
            /* Mask only covers mantissa bits, so 'd' can be used as is. */
            return ((0x000fffffffffffffULL >> shift) & d) == 0;
        } else if (shift == -1023) {  /* exponent 0 */
            return (d >= 0) && ((d & 0x000fffffffffffffULL) == 0);
        } else if (shift == 47) {  /* exponent 1070 */
            return (d < 0) && ((d & 0x000fffffffffffffULL) == 0);
        } else {
            return 0;
        }
    }

For middle endian machines (e.g. some ARM platforms) this algorithm first
needs swapping of the 32-bit parts. By changing the mask checks to operate
on 32-bit parts the algorithm would work on more platforms and would also
remove the need for swapping the parts on middle endian platforms.
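
One way to obtain the two 32-bit parts without caring about the word order is to read the double as two 32-bit words and determine which word holds the sign and exponent. The sketch below detects this at runtime using the known bit pattern of 1.0; a real build would more likely select the index at compile time. The helper names are hypothetical, and the code assumes a 64-bit IEEE double whose 32-bit words use the native integer byte order.

```c
#include <stdint.h>
#include <string.h>

/* Returns the index (0 or 1) of the 32-bit word that holds the sign,
 * exponent, and top 20 mantissa bits. 1.0 is 0x3ff00000'00000000, so
 * the high word is the one equal to 0x3ff00000.
 */
static int hi_word_index(void) {
    double one = 1.0;
    uint32_t w[2];

    memcpy(w, &one, sizeof(w));
    return (w[0] == 0x3ff00000UL) ? 0 : 1;
}

/* Split a double into high and low 32-bit parts without a 64-bit
 * integer and without swapping words on middle endian platforms.
 */
static void double_to_parts(double d, uint32_t *hi, uint32_t *lo) {
    uint32_t w[2];
    int hi_idx = hi_word_index();

    memcpy(w, &d, sizeof(w));
    *hi = w[hi_idx];
    *lo = w[1 - hi_idx];
}
```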
C algorithm with 32-bit operations and a computed mask
------------------------------------------------------
::

    /* 'hi' holds the sign, exponent, and top 20 mantissa bits; 'lo' holds
     * the low 32 mantissa bits.
     */
    int is_fastint(duk_uint32_t hi, duk_uint32_t lo) {
        int exp = (hi >> 20) & 0x07ff;
        int shift = exp - 1023;

        if (shift >= 0 && shift <= 46) {  /* exponents 1023 to 1069 */
            if (shift <= 20) {
                /* Mask: 0x000fffff'ffffffff (shift 0) to 0x00000000'ffffffff (shift 20) */
                return (((0x000fffffUL >> shift) & hi) == 0) && (lo == 0);
            } else {
                /* Mask: 0x00000000'7fffffff (shift 21) to 0x00000000'0000003f (shift 46) */
                return (((0xffffffffUL >> (shift - 20)) & lo) == 0);
            }
        } else if (shift == -1023) {  /* exponent 0 */
            /* Same as: ((hi & 0x800fffffUL) == 0x00000000UL) && (lo == 0),
             * because the exponent bits are already known to be zero.
             */
            return (hi == 0) && (lo == 0);
        } else if (shift == 47) {  /* exponent 1070 */
            return ((hi & 0x800fffffUL) == 0x80000000UL) && (lo == 0);
        } else {
            return 0;
        }
    }

Future work
===========
Skipping the double-to-fastint test sometimes
---------------------------------------------
The double-to-fastint check can safely err on the side of caution and decide to
represent a fastint-compatible number as a double. This opens up the
possibility of skipping the double-to-fastint test in some cases which
may improve performance and reduce code size.
For instance, when ``Math.cos()`` pushes its result on the stack, it's
probably quite a safe bet that the number won't fit a fastint, so it could
be written as a double directly without a double-to-fastint downgrade
check. In case it is a fastint (-1, 0, or 1) it will be represented as a
double but will be downgraded to a fastint by the first operation that
does execute the downgrade check. To support this, there could be a macro
like ``DUK_TVAL_SET_NUMBER_NOFASTINT``.
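
The idea can be illustrated with a toy tagged value that has two setters, one with the downgrade check and one without. Everything here is hypothetical: ``toy_tval`` is not Duktape's ``duk_tval`` layout, the ``toy_*`` functions are stand-ins for whatever macros an implementation would use, and the check is the simple slow variant rather than an optimized one.

```c
#include <stdint.h>
#include <string.h>

/* Toy tagged value for illustration only; not Duktape's duk_tval. */
typedef struct {
    int is_fastint;  /* 1: v.fi is valid, 0: v.d is valid */
    union {
        int64_t fi;
        double d;
    } v;
} toy_tval;

/* Slow but simple downgrade check (stand-in for an optimized check). */
static int toy_double_is_fastint(double d) {
    uint64_t bits;

    if (!(d >= -140737488355328.0 && d <= 140737488355327.0)) {
        return 0;  /* out of range or NaN */
    }
    if ((double) (int64_t) d != d) {
        return 0;  /* has a fractional part */
    }
    memcpy(&bits, &d, sizeof(bits));
    return bits != 0x8000000000000000ULL;  /* reject negative zero */
}

/* Normal setter: downgrade to a fastint when possible. */
static void toy_set_number(toy_tval *tv, double d) {
    if (toy_double_is_fastint(d)) {
        tv->is_fastint = 1;
        tv->v.fi = (int64_t) d;
    } else {
        tv->is_fastint = 0;
        tv->v.d = d;
    }
}

/* Setter without the check: always stores a double. */
static void toy_set_number_nofastint(toy_tval *tv, double d) {
    tv->is_fastint = 0;
    tv->v.d = d;
}
```

A value such as 1.0 stored with the unchecked setter stays a double until some later operation writes its result through the normal setter.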
Another option is to run the double-to-fastint check randomly or e.g. only
every Nth time it is needed (N could be quite large, e.g. the prime 17).
This should be quite OK from a performance point of view. If a number is
incorrectly stored as a double and is involved in a lot of operations,
chances are it will get downgraded quite quickly, as long as the check
interval does not unluckily line up with the pattern of operations on that
number, so that the check is always skipped for it.
This approach may not be worth it because an optimized fastint downgrade
check should have quite reasonable performance, and such an approach would
have no effect on the actual fastint fast path (inputs are fastints,
outputs are fastints).
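
For reference, the "every Nth time" variant could be as small as a shared counter. This is a sketch with hypothetical names; the interval of 17 is the example value from the text, and a single global counter is an assumption made for simplicity.

```c
/* Hypothetical interval; the text suggests a prime such as 17. */
#define DOWNGRADE_CHECK_INTERVAL 17

/* Shared call counter (assumption: single-threaded use). */
static unsigned int downgrade_check_counter = 0;

/* Returns 1 on every Nth call and 0 otherwise. A caller would run the
 * double-to-fastint check only when this returns 1, and store the
 * number as a double without checking when it returns 0.
 */
static int should_run_downgrade_check(void) {
    downgrade_check_counter++;
    if (downgrade_check_counter >= DOWNGRADE_CHECK_INTERVAL) {
        downgrade_check_counter = 0;
        return 1;
    }
    return 0;
}
```

The counter update itself costs an increment and a branch on each call, which supports the point above that the saving over an optimized check may be small.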
