============
Fastint type
============
Overview
========
Ecmascript has a single number type which is required to be an IEEE double.
This is a potential performance issue in some embedded environments where
hardware floating point numbers (at least IEEE doubles) are not available
and software floating point emulation performs poorly.
Duktape provides optional support for fast integers or "fastints" which
allows Duktape to represent numbers internally either as IEEE doubles or
48-bit signed integers. Duktape will transparently upgrade integers to
doubles when necessary (e.g. when an integer operation overflows) and
downgrade doubles to integers when possible.
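The upgrade-on-overflow idea can be illustrated with a minimal sketch. The ``num_t`` type, its tags, and the ``num_add()`` helper below are hypothetical names chosen for illustration; they are not Duktape's internal ``duk_tval`` representation. The sketch only shows a fast path that falls back to a double when the signed 48-bit range is exceeded:

```c
#include <stdint.h>

/* Hypothetical tagged number; not Duktape's actual duk_tval layout. */
typedef enum { NUM_FASTINT, NUM_DOUBLE } num_tag_t;

typedef struct {
    num_tag_t tag;
    union {
        int64_t i;  /* valid when tag == NUM_FASTINT, signed 48-bit range */
        double d;   /* valid when tag == NUM_DOUBLE */
    } u;
} num_t;

#define FASTINT_MIN (-140737488355328LL)  /* -2^47 */
#define FASTINT_MAX (140737488355327LL)   /* 2^47 - 1 */

/* Caller must ensure 'i' is within [FASTINT_MIN, FASTINT_MAX]. */
static num_t num_from_int(int64_t i) {
    num_t n;
    n.tag = NUM_FASTINT;
    n.u.i = i;
    return n;
}

static num_t num_from_double(double d) {
    num_t n;
    n.tag = NUM_DOUBLE;
    n.u.d = d;
    return n;
}

static double num_to_double(num_t n) {
    return (n.tag == NUM_FASTINT) ? (double) n.u.i : n.u.d;
}

static num_t num_add(num_t a, num_t b) {
    if (a.tag == NUM_FASTINT && b.tag == NUM_FASTINT) {
        /* Two 48-bit operands cannot overflow a 64-bit sum. */
        int64_t sum = a.u.i + b.u.i;
        if (sum >= FASTINT_MIN && sum <= FASTINT_MAX) {
            return num_from_int(sum);          /* fast path */
        }
        return num_from_double((double) sum);  /* upgrade on overflow */
    }
    /* Slow path: the result stays a double, no downgrade check here. */
    return num_from_double(num_to_double(a) + num_to_double(b));
}
```

In this sketch the slow path never downgrades its result, which mirrors the point that downgrade checks are applied only in selected places.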
Because a double-to-integer downgrade check is relatively expensive, it is
only applied in specific situations. Currently:
* All compiler constants are represented as fastints if possible.
* Unary plus performs a ToNumber() coercion and also downgrades an IEEE
double to a fastint if possible.
* All function return values are automatically downgraded to fastints if
possible.
* Thread yield/resume values are automatically downgraded to fastints if
possible.
Fastints don't affect Ecmascript semantics and are completely transparent
to user C and Ecmascript code: all conversions are automatic.
To enable fastint support, simply define:
* ``DUK_OPT_FASTINT`` / ``DUK_USE_FASTINT``
You should measure the impact of enabling fastint support for your target
platform and Ecmascript code base. Fastint support is not an automatic
performance win: while the fast path is a clear improvement for soft float
(and even some hard float) platforms, there is a run-time cost of doing
fastint downgrade checks and other book-keeping. Very roughly:
* Code that benefits most from fastint upsides (e.g. heavy integer arithmetic
in large loops) can run about 1000% faster on soft float platforms.
* Code that suffers most from fastint downsides can run about 10% more
slowly.
* Executable size will increase by about 7-10kB.
This document provides tips for using fastints, and provides some background
on the approach chosen. Some specific fastint algorithms used by Duktape are
also described in detail.
Application considerations
==========================
Because fastints are transparent to user code, the only real consideration is
to make sure performance critical sections take advantage of fastints. Some
tips for using fastints:
* Because a double-to-fastint downgrade check is only done for specific
operations, make sure that integer values don't accidentally become
IEEE doubles.
There's no easy way to check how a number is represented internally.
However, ``Duktape.info()`` provides a way to peek into the internal
representation. An example algorithm is provided in
``polyfills/duktape-isfastint.js``. You can use this polyfill to debug
your code if necessary.
* When in doubt, you can use unary plus to force a number to be downgrade
checked::
// Result is exactly 1, but is represented internally as a double.
var t = Math.PI / Math.PI;
// Result is exactly 1, downgrade checked, and is represented
// internally as a fastint.
var t = +(Math.PI / Math.PI);
* All function return values from both Ecmascript and Duktape/C functions
are automatically downgraded to fastints. So, the following value can be
trusted to be 3 and represented internally as a fastint::
// Resulting 'three' is a fastint because Math.floor() return
// value (double 3) is automatically downgraded to a fastint.
var three = Math.floor(Math.PI);
Same applies to any user functions::
function my_max(a, b) {
// For the call below, 'b' is 1 but is not represented as a
// fastint here. Only when we return is the return value 1
// downgraded into a fastint.
return (a >= b ? a : b);
}
// 't' is exactly 1, and represented internally as a fastint.
var t = my_max(0, Math.PI / Math.PI);
* All compiler constants are automatically downgraded to fastints when
possible. For example, all constants below will be fastints::
var i, n;
for (i = 0, n = 1e6; i < n; i++) {
// All 'i' values here will be fastints.
}
* Note that the number syntax doesn't affect the fastint downgrade check,
only the final value matters. All of the following will be represented
as fastints::
t = 1;
t = 1.0;
t = 100e-2;
t = 0.01e2;
Similarly constant folding, when possible, will be done before doing the
downgrade check, so the following will be represented as a fastint::
t = 123.123 / 123.123; // fastint 1
But because ``Math.PI`` needs a runtime lookup, the following will not be
a fastint::
t = Math.PI / Math.PI; // double 1
* Non-fastint values will "taint" fastints in operations so that the result
will be represented as a double instead of a fastint::
t1 = 123; // fastint
t2 = 0.5; // double
t3 = t1 + t2; // <fastint> + <double> -> <double>
t4 = t3 - t2; // <double> - <double> -> <double>
t5 = +t4; // restore into fastint representation
While adding and subtracting ``t2`` is a net zero change and ``t4`` would
be fastint compatible, it will not be represented as a fastint internally
until the next explicit downgrade check. Here unary plus is used to get
the result back into fastint representation.
* Negative zero cannot be represented as a fastint. Ordinary Ecmascript
code will very rarely deal with negative zeros. Negative zero can "taint"
a fastint, too::
t1 = 123; // fastint
t2 = -0; // double
t3 = t1 + t2; // <fastint> + <double> -> <double> (!)
Here the result is a double even when an innocent zero value is added to
a fastint. When in doubt you can use unary plus to ensure the result is
a fastint if it's fastint compatible.
* When doing Duktape API calls from C code, prefer API calls which take
integer arguments. Such API calls will typically have fastint support.
For example::
// Value pushed will be 1, represented internally as a double.
duk_push_number(ctx, 1.0);
// Value pushed will be 1, represented internally as a fastint.
duk_push_int(ctx, 1);
* Because the fastint support is transparent from a semantics perspective,
Duktape fastint fast path and downgrade behavior may change in future
versions. Such changes won't change outward behavior but may affect
code performance.
As a general rule, optimize for fastints only in code sections where it
really matters for performance, e.g. heavy loops.
Detecting that a number is represented as a fastint internally
==============================================================
There's no explicit API for this now, but ``Duktape.info()`` provides the
necessary information (in a highly fragile manner though). For instance,
you can use something like::
    /* Fastint tag depends on duk_tval packing */
    var fastintTag = (Duktape.info(true)[1] >= 0xfff0 ?
                      0xfff1 /* tag for packed duk_tval */ :
                      1 /* tag for unpacked duk_tval */ );
    function isFastint(x) {
        if (typeof x !== 'number') {
            return false;
        }
        return Duktape.info(x)[1] === fastintTag;
    }
There's an example polyfill which provides ``Duktape.isFastint()`` in:
* polyfills/duktape-isfastint.js
.. note:: This is fragile and may stop working when internal tag number
changes are made. Such changes are possible even in minor version
updates.
Fastints and Duktape internals
==============================
A few notes on how fastints are used internally, what macros are used, etc.
Fastint aware vs. unaware code
------------------------------
Fastint support is optional and added between ifdefs::
#if defined(DUK_USE_FASTINT)
...
#endif
Number handling will be either:
* fastint unaware: requires no changes to existing code
* fastint aware: requires fastint detection e.g. in switch-case statements
and then usage of fastint aware macros
Type switch cases
-----------------
The minimum change necessary is to ensure fastints are handled in type
switch-cases::
    /* ... */
    switch(DUK_TVAL_GET_TAG(tv)) {
    case DUK_TAG_UNDEFINED:
        /* ... */
    #if defined(DUK_USE_FASTINT)
    case DUK_TAG_FASTINT:
        /* no direct support, fall through */
    #endif
    default: {
        /* number, double or fastint; use fastint unaware macros
         * which will automatically upgrade a fastint to a double
         * when necessary:
         */
        duk_double_t d = DUK_TVAL_GET_NUMBER(tv);  /* auto upgrade */
        /* ... */
    }
    }
Even without this change the default clause will capture ``DUK_TAG_FASTINT``
values but it's preferable to have the fall through happen explicitly.
Fastint aware code will have specific code in the ``DUK_TAG_FASTINT`` case,
and the ``default`` case can then assume the number is represented as a
double. The ``default`` case must be written carefully so that it also works
correctly when fastints are disabled.
Getting numbers/fastints
------------------------
Fastint unaware code uses::
DUK_TVAL_GET_NUMBER(tv)
which will always evaluate to a double, and automatically upgrades a fastint
to a double. The implementation with fastints enabled is something like::
#define DUK_TVAL_GET_NUMBER(v) \
(DUK_TVAL_IS_FASTINT(v) ? \
(duk_double_t) DUK_TVAL_GET_FASTINT(v) : \
DUK_TVAL_GET_DOUBLE(v))
The extra check compared to a direct read has a small runtime cost, but only
when fastints are enabled. When they're not enabled, ``DUK_TVAL_GET_NUMBER()``
will just read a double.
Fastint aware code uses the following::
/* When 'tv' is known to be a fastint, e.g. switch DUK_TAG_FASTINT or
* explicit check.
*/
DUK_TVAL_GET_FASTINT(tv) /* result is duk_int64_t */
/* When 'tv' is known to be a fastint, and we just need the lowest 32 bits
* as a duk_uint32_t.
*/
DUK_TVAL_GET_FASTINT_U32(tv) /* result is duk_uint32_t */
/* Similarly for a duk_int32_t. */
DUK_TVAL_GET_FASTINT_I32(tv) /* result is duk_int32_t */
/* When 'tv' is known to be a double, e.g. switch or explicit check. */
DUK_TVAL_GET_DOUBLE(tv)
The ``DUK_TVAL_GET_DOUBLE(tv)`` macro is also defined when fastints are not
enabled; in that case it's simply a synonym for ``DUK_TVAL_GET_NUMBER()``
because all numbers are represented as doubles. It should only be used when,
with fastints enabled, the number is known to be represented as a double.
This allows control structures like::
    /* Fictional ToBoolean()-like operation. */
    switch(DUK_TVAL_GET_TAG(tv)) {
    ...
    #if defined(DUK_USE_FASTINT)
    case DUK_TAG_FASTINT:
        /* Fastints enabled and 'tv' is a fastint. */
        return (DUK_TVAL_GET_FASTINT(tv) != 0 ? 1 : 0);
    #endif
    default:
        /* Fastints enabled and 'tv' is a double, or fastints disabled. */
        return (DUK_TVAL_GET_DOUBLE(tv) != 0.0 ? 1 : 0);
    }
Setting numbers/fastints
------------------------
Fastint unaware code uses::
DUK_TVAL_SET_NUMBER(tv, d);
This always stores the number in the internal double representation, i.e.
no double-to-fastint downgrade is done automatically. (Automatic downgrading
was one design option, but it turns out the double-to-fastint coercion test
is quite expensive and adds considerable overhead to the fastint unaware
slow path.)
Fastint aware code which wants to set a double and downgrade it automatically
into a fastint when possible uses::
DUK_TVAL_SET_NUMBER_CHKFAST(tv, d);
This macro concretely calls into a helper function so there's a performance
penalty involved. Downgrade checks are only added to specific places where
they provide the most benefit.
Fastint aware code which wants to set a double explicitly (with no fastint
downgrade check) uses::
DUK_TVAL_SET_DOUBLE(tv, d);
Fastint aware code which wants to set a fastint explicitly (and has ensured
that the value is fastint compatible) uses::
/* 'i' must be in 48-bit signed range */
DUK_TVAL_SET_FASTINT(tv, i); /* i is duk_int64_t */
/* 'i' must be in 32-bit unsigned range */
DUK_TVAL_SET_FASTINT_U32(tv, i); /* i is duk_uint32_t */
/* 'i' must be in 32-bit signed range */
DUK_TVAL_SET_FASTINT_I32(tv, i); /* i is duk_int32_t */
The macros are also available when fastints are disabled, and will just
write a double with no checks or additional overhead. This is just a
convenience to reduce the number of ifdefs in call sites. For example,
``DUK_TVAL_SET_FASTINT_U32`` coerces the uint32 argument to a double
when fastints are disabled.
In-place double-to-fastint downgrade check
------------------------------------------
The following macro is used to perform an in-place double-to-fastint
downgrade check::
DUK_TVAL_CHKFAST_INPLACE(tv);
The target 'tv' can have any type; the macro first checks if the value
is a double and if so, if it can be fastint coerced.
When fastint support is disabled, the macro is a no-op.
Type checks
-----------
Fastint unaware code checks for a number (either double or fastint) using::
DUK_TVAL_IS_NUMBER(tv)
Fastint aware code uses::
/* Number represented as a fastint */
DUK_TVAL_IS_FASTINT(tv)
/* Number represented as a double */
DUK_TVAL_IS_DOUBLE(tv)
The following is defined even when fastints are disabled to support the
switch code structure described above::
/* When fastints disabled, same as DUK_TVAL_IS_NUMBER() */
DUK_TVAL_IS_DOUBLE(tv)
Background
==========
This section provides some background, discussion, and issues on various
approaches to integer support. It's not up to date with the current
implementation.
Approaches to integer support
-----------------------------
* Replace the tagged IEEE double number type with an integer or a fixed point
type. This will necessarily break Ecmascript compliance to some extent, but
it would be nice if at least number range was sufficient for 32-bit bit ops
and to represent e.g. Dates.
* Same as above, but also reserve a few bits for one or more special values
like NaNs, to maintain compatibility better. For instance, NaN is used to
signify an invalid Date, and is also used as a coercion result to signal a
coercion error.
* Extend the tagged type to support both an IEEE double and an integer or a
fixed point type. Convert between the two either fully transparently (to
maintain full Ecmascript semantics) or in selected situations, chosen for
either convenience or performance.
* Extend the tagged type to support both an IEEE double and an integer or a
fixed point type. Extend the public API and Ecmascript environment to
expose the new integer type explicitly. The upside is minimal performance
cost because there are fewer automatic conversion checks. The downside is
a significant API change and introduction of custom language features.
* Same as above, but expose the integer type only for user C code; keep the
Ecmascript environment unaware of the change.
Implementation issues
---------------------
* When there is no need to represent IEEE doubles, the 8-byte tagged duk_tval
no longer needs to conform to the IEEE double constraints (NaN space reuse).
Instead, it can be split e.g. into an 8-bit tag and 56-bit type-specific
value.
* When there is a need to represent both integers and IEEE doubles, the 8-byte
duk_tval must conform to the IEEE double representation, i.e. there are 16
bits of a special tag value and 48-bit type specific value.
* Should there be a C typedef for a Duktape number? Currently the public
API and Duktape internals assume numbers can be read/written as doubles.
Changing the public API will break compilation (or at least cause warnings)
for user code, if the integer changes are visible in the API.
* Does the integer change need to be made everywhere at once, so that all
code (including the compiler, etc) must support the underlying integer
type before the change is complete?
Alternatively, Duktape could read and write numbers as doubles by default
internally (with automatic conversion back and forth as needed) and
integer-aware optimizations would only be applied in places where it matters,
such as arithmetic. In particular, there would be no need to deal with
integer representation in the compiler as it would normally have a minimal
impact.
* Integer representations above 32 bits would normally use a 64-bit integer
type for arithmetic. However, some older platforms don't have such a type
(there are workarounds for this e.g. in ``duk_numconv.c``). So either the
integer arithmetic must also be implemented with 32-bit replacements, or
the representation won't be available if 64-bit types are not available.
Representation options
----------------------
Double type + separate integer / fixed point type (compliant)
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
In this case the 8-byte tagged type must conform to the IEEE NaN space
reuse, so 16 bits are lost to the type tag and 48 bits are available
for the value.
* Double and up to 48-bit integer (sign + 47-bit range). Integers are nice
and intuitive, but won't fit the full 53-bit integer range supported by
IEEE doubles, so some must fall back into the double representation (not a
big limitation). Date values and binary operations work.
* Double and a fixed point with up to 48 bit representation, e.g. sign +
41.6. To support reasonable Date values, the integer part must be at least
41 bits. To support bit operations without falling back to IEEE doubles,
the integer part must support both signed and unsigned 32-bit values.
Binary fractions require some additional shifting to implement, and user
code is not very likely to contain specific binary fractions, so they would
only benefit code specifically crafted to use them.
* Double and 32-bit signed or unsigned integer: 32-bit arithmetic is nice
but unfortunately not enough to support Ecmascript bit operations which
require the range -0x80000000 to 0xffffffff (sign + 32 bits, a 33-bit
representation). This would not be a compliance issue as Duktape would
fall back to the IEEE double for some values, but if fast bit operations
are important, this is not a good option. If bit operations don't
matter, then this is a nice option in that it avoids the 64-bit arithmetic
issue.
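The range argument above can be made concrete with a small sketch. In Ecmascript, ``1 << 31`` evaluates to -0x80000000 while ``-1 >>> 0`` evaluates to 0xffffffff, so neither a signed nor an unsigned 32-bit type alone holds both results, whereas a signed 48-bit integer does. The helper names below are hypothetical:

```c
#include <stdint.h>

/* Does 'v' fit a signed 32-bit integer? */
static int fits_int32(int64_t v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Does 'v' fit an unsigned 32-bit integer? */
static int fits_uint32(int64_t v) {
    return v >= 0 && v <= (int64_t) UINT32_MAX;
}

/* Does 'v' fit a signed 48-bit integer (sign + 47-bit range)? */
static int fits_int48(int64_t v) {
    return v >= -140737488355328LL && v <= 140737488355327LL;
}
```

With these helpers, -0x80000000 passes only ``fits_int32()`` and ``fits_int48()``, and 0xffffffff passes only ``fits_uint32()`` and ``fits_int48()``.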
Only integer / fixed point type (non-compliant)
:::::::::::::::::::::::::::::::::::::::::::::::
Here the 8-byte tagged type can be split e.g. into an 8-bit type tag and a 56-bit
value which allows more range.
* 56-bit signed integer (sign + 55 bits): covers the IEEE integer range
(53-bit), Date values work, bit ops work. Lack of any fractions makes
built-in Math functions mostly useless (e.g. Math.random() will always
return zero), and some user code is likely to break.
* Sign and 47.8 or 45.10 fixed point: provides enough fractions to be
useful, Date values work, bit ops work. Math functions are somewhat
useful again.
* Sign and 41.14 fixed point: maximum number of fraction bits while keeping
Date values (and bit ops) working.
* Sign and 32.23 fixed point: maximum number of fraction bits while keeping
bit ops working and providing user code the reasonable and intuitive
guarantee that 32-bit integers (signed and unsigned) work. Date values
won't work.
* 32-bit unsigned integer or 32-bit signed integer: closest to what's fast
and convenient on typical embedded systems, but some bit operations stop
working because taken together they need the -0x80000000 to 0xffffffff
range (there are both signed and unsigned bit ops). Date values won't
work.
Dependencies on IEEE double or range
------------------------------------
Specification and Duktape dependencies:
* Signed integers are quite widely required, so having no support for negative
values is probably not an option.
* At least 32-bit unsigned integers are needed for array and string lengths.
* A sign + a 32-bit range (33-bit representation) are needed for bit ops,
which provide both signed and unsigned 32-bit results. The required range
is -0x80000000 to 0xffffffff.
* The Date built-in uses an integer millisecond value for time values. This
representation is used both internally and in the external Date API.
- 40 (unsigned) bits is not enough to represent the current time, it only
represents timestamps up to November 2004.
- 41 (unsigned) bits is enough to represent timestamps up to September
2039.
- The Date API never uses fractions, and in fact the specification requires
that the internal value is integer coerced (to milliseconds), so Date
does not require fractions to work properly.
- The implication for using only an integer / fixed point representation
is that the integer part must contain a sign and at least 41 bits.
For example, for a 48-bit representation sign + 41.6 fixed point is
enough, and would provide 1/64 fractions.
- It would be easy to fix the internal Date representation to work with any
fixed point representation with enough bits (e.g. sign + 32.15), but
because the integer millisecond values are used in the public Date API
too, this doesn't solve anything.
* Signed zero semantics (separation of negative and positive zero) are
required and explicitly specified, but Ecmascript itself doesn't
really depend on being able to use a negative zero, and neither does
Duktape.
* NaN values are used in several places as significant internal or
external values. Invalid Date values are represented by having a
NaN as the Date object's internal time value. String-to-number
coercion relies on using a NaN to indicate a coercion error
(``Number('foo')`` evaluates to NaN). If a NaN value is not available,
the best replacement is probably zero.
* Infinities are used in math functions but Ecmascript itself doesn't
rely on being able to use them, and neither does Duktape.
* Duktape packs some internal values into double representation, this is
used at least by:
- The compiler for declaration book-keeping. The needed bit count is
not large (32 bits should more than suffice, for 2**24 inner functions).
- Error object tracedata format, which needs 32 bits + a few flags;
40 bits should suffice.
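The Date range figures quoted above for 40 and 41 unsigned bits can be re-derived with a short calculation. The sketch below is illustrative only; ``ms_to_year_month()`` and ``max_unsigned_ms()`` are hypothetical helpers that convert the largest millisecond timestamp of a given bit width into a calendar year and month (UTC, Gregorian leap year rules):

```c
#include <stdint.h>

/* Calendar year and month (1-12) of a millisecond timestamp counted from
 * 1970-01-01 UTC. Simple day-counting loop, unsigned input only.
 */
static void ms_to_year_month(uint64_t ms, int *year, int *month) {
    static const int mdays[12] = { 31, 28, 31, 30, 31, 30,
                                   31, 31, 30, 31, 30, 31 };
    uint64_t days = ms / 86400000ULL;  /* whole days since epoch */
    int y = 1970;
    int m = 0;

    for (;;) {
        int leap = (y % 4 == 0 && y % 100 != 0) || (y % 400 == 0);
        uint64_t ylen = leap ? 366 : 365;
        if (days < ylen) {
            /* Target year found, walk through its months. */
            for (m = 0; m < 12; m++) {
                uint64_t mlen = (uint64_t) mdays[m] +
                                ((m == 1 && leap) ? 1 : 0);
                if (days < mlen) {
                    break;
                }
                days -= mlen;
            }
            break;
        }
        days -= ylen;
        y++;
    }
    *year = y;
    *month = m + 1;
}

/* Largest timestamp representable with 'bits' unsigned bits (bits < 64). */
static uint64_t max_unsigned_ms(int bits) {
    return (((uint64_t) 1) << bits) - 1;
}
```

Feeding ``max_unsigned_ms(40)`` and ``max_unsigned_ms(41)`` into ``ms_to_year_month()`` yields November 2004 and September 2039 respectively, matching the limits listed for the Date built-in.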
In addition to these, user code may have some practical dependencies, such as:
* Being able to represent at least signed and unsigned 32 bits, so that all
Ecmascript bit operations work as expected.
* Being able to represent at least some fractional values. For instance,
suppose a custom scheduler used second-based timestamps for timers; it
would then require a reasonable number of fractions to work properly.
Signed 41.6 fixed point provides a fractional increment of 0.015625;
for the scheduler, this would mean about 15.6ms resolution, which is not
that great.
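The resolution limit of a sign + 41.6 fixed point value can be seen in a small sketch. The ``fix_from_double()`` and ``fix_to_double()`` helpers are hypothetical and truncate toward zero; an actual implementation might round differently:

```c
#include <stdint.h>

#define FRAC_BITS 6  /* sign + 41.6 fixed point: 6 fraction bits */

/* Convert a value (e.g. seconds) to fixed point, truncating toward zero. */
static int64_t fix_from_double(double x) {
    return (int64_t) (x * (double) (1 << FRAC_BITS));
}

/* Convert a fixed point value back to a double. */
static double fix_to_double(int64_t f) {
    return (double) f / (double) (1 << FRAC_BITS);
}
```

The smallest step is ``fix_to_double(1)``, i.e. 1/64 = 0.015625. A second-based timer delay of 0.010 therefore truncates to zero in this sketch, which illustrates why such a scheduler would not work well with only six fraction bits.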
Efficient check for double-to-fastint downgrade
===============================================
Overview
--------
For an IEEE double to be representable as a fast integer, it must be:
* A whole number
* In the signed 48-bit range
* Not a negative zero, assuming that the integer zero is taken to represent
a positive zero
This algorithm is needed when Duktape does an explicit downgrade check to see
if a double value can be represented as a fastint.
The "fast path" for fastint operations doesn't execute this algorithm because
both inputs and outputs are fastints and Duktape detects this in the fast path
preconditions. Even so the performance of the downgrade check matters for
overall performance.
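For testing purposes the three conditions listed above can be written down directly, without any bit tricks. The sketch below is a slow reference check, not the algorithm Duktape uses; ``is_fastint_ref()`` is a hypothetical name and the code assumes a 64-bit IEEE double:

```c
#include <stdint.h>
#include <string.h>

/* Reference check written directly from the three conditions; intended
 * for cross-checking faster implementations, not for speed.
 */
static int is_fastint_ref(double d) {
    uint64_t bits;

    /* Signed 48-bit range; the comparison is also false for NaN. */
    if (!(d >= -140737488355328.0 && d <= 140737488355327.0)) {
        return 0;
    }
    /* Whole number: the cast is safe because the range was checked. */
    if ((double) (int64_t) d != d) {
        return 0;
    }
    /* Negative zero compares equal to 0.0, so inspect the sign bit. */
    memcpy(&bits, &d, sizeof(bits));
    if (d == 0.0 && (bits >> 63) != 0) {
        return 0;
    }
    return 1;
}
```

A function like this is useful mainly as an oracle when testing the optimized variants described in the following sections.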
Exponent and sign by cases
--------------------------
An IEEE double has a sign (1 bit), an exponent (11 bits), and a 52-bit stored
mantissa. The mantissa has an implicit (not stored) leading '1' digit, except
for denormals, NaNs, and infinities.
Going through the possible exponent values:
* If exponent is 0:
- The number is a fastint only if the sign bit is zero (positive) and the
entire mantissa is all zeroes. This corresponds to +0.
- If the mantissa is non-zero, the number is a denormal.
* If the exponent is in the range [1, 1022] the number is not a fastint
because even the implicit leading mantissa bit corresponds to at most 0.5,
so the magnitude is strictly between 0 and 1.
* If exponent is exactly 1023:
- The number is only a fastint if the stored mantissa is all zeroes.
This corresponds to +/- 1.
* If exponent is exactly 1024:
- The number is only a fastint if 51 lowest bits of the mantissa are all
zeroes (with the top bit either zero or one). This corresponds to the
numbers +/- 2 and +/- 3.
* Generalizing, if the exponent is in the range [1023,1069], the number is
a fastint if and only if:
- The lowest N bits of the mantissa are zero, where N = 52 - (exp - 1023),
with either sign.
- N can also be expressed as: N = 1075 - exp.
* If exponent is exactly 1070:
- The number is only a fastint if the sign bit is set (negative) and the
stored mantissa is all zeroes. This corresponds to -2^47. The positive
counterpart +2^47 does not fit into the fastint range.
* If exponent is [1071,2047] the number is never a fastint:
- For exponents [1071,2046] the number is too large to be a fastint.
- For exponent 2047 the number is a NaN or infinity depending on the
mantissa contents, neither a valid fastint.
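The cases above refer to the raw sign, exponent, and mantissa fields. As a minimal sketch, the fields can be extracted as follows, assuming a 64-bit IEEE double; ``dbl_fields_t`` and ``dbl_decompose()`` are hypothetical names:

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;  /* 1 bit */
    unsigned exp;   /* 11-bit biased exponent */
    uint64_t mant;  /* 52-bit stored mantissa, implicit bit not included */
} dbl_fields_t;

static dbl_fields_t dbl_decompose(double d) {
    uint64_t bits;
    dbl_fields_t f;

    /* memcpy avoids aliasing problems and works for any byte order as
     * long as doubles and 64-bit integers share the same endianness.
     */
    memcpy(&bits, &d, sizeof(bits));
    f.sign = (unsigned) (bits >> 63);
    f.exp = (unsigned) ((bits >> 52) & 0x7ffU);
    f.mant = bits & 0x000fffffffffffffULL;
    return f;
}
```

For example, 1.0 decomposes to exponent 1023 with a zero mantissa, 3.0 to exponent 1024 with only the top mantissa bit set, and -2^47 to sign 1, exponent 1070, and a zero mantissa.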
Pseudocode 1
------------
The algorithm::
    is_fastint(sgn, exp, mant):
        if exp == 0:
            return sgn == 0 and mzero(mant, 52)
        else if exp < 1023:
            return false
        else if exp < 1070:
            return mzero(mant, 1075 - exp)
        else if exp == 1070:
            return sgn == 1 and mzero(mant, 52)
        else:
            return false
The ``mzero`` helper predicate returns true if the mantissa given has its
lowest ``n`` bits zero.
Non-zero integers in the fastint range will fall into the case where a certain
computed number of low mantissa bits must be checked to be zero. As discussed
above, the algorithm should be optimized for the "input fits fastint" case.
Pseudocode 2
------------
Some rewriting::
    is_fastint(sgn, exp, mant):
        nzero = 1075 - exp
        if nzero >= 6 and nzero <= 52:  // exp 1023 ... exp 1069
            // exponents 1023 to 1069: regular handling, common case
            return mzero(mant, nzero)
        else if nzero == 1075:
            // exponent 0: irregular handling, but still common (positive zero)
            return sgn == 0 and mzero(mant, 52)
        else if nzero == 5:
            // exponent 1070: irregular handling, rare case
            return sgn == 1 and mzero(mant, 52)
        else:
            // exponents [1,1022] and [1071,2047], rare case
            return false
C algorithm with a lookup table
-------------------------------
The common case ``nzero`` values are between [6, 52] and correspond to
mantissa masks. Compute a mask index instead as nzero - 6 = 1069 - exp::
    duk_uint64_t mzero_masks[47] = {
        0x000000000000003fULL,  /* exp 1069, nzero 6 */
        0x000000000000007fULL,  /* exp 1068, nzero 7 */
        0x00000000000000ffULL,  /* exp 1067, nzero 8 */
        0x00000000000001ffULL,  /* exp 1066, nzero 9 */
        /* ... */
        0x0003ffffffffffffULL,  /* exp 1025, nzero 50 */
        0x0007ffffffffffffULL,  /* exp 1024, nzero 51 */
        0x000fffffffffffffULL,  /* exp 1023, nzero 52 */
    };
    /* 'd' holds the raw 64-bit bit pattern of the IEEE double. */
    int is_fastint(duk_int64_t d) {
        int exp = (d >> 52) & 0x07ff;
        int idx = 1069 - exp;
        if (idx >= 0 && idx <= 46) {  /* exponents 1069 to 1023 */
            /* The mask only covers mantissa bits, so 'd' can be used as is. */
            return (mzero_masks[idx] & (duk_uint64_t) d) == 0;
        } else if (idx == 1069) {  /* exponent 0 */
            return (d >= 0) && ((d & 0x000fffffffffffffULL) == 0);
        } else if (idx == -1) {  /* exponent 1070 */
            return (d < 0) && ((d & 0x000fffffffffffffULL) == 0);
        } else {
            return 0;
        }
    }
The memory cost of the mask table is 8x47 = 376 bytes. This can be halved
e.g. by using a table of 32-bit values with separate cases for nzero >= 32
and nzero < 32.
Unfortunately the expected case (exponents 1023 to 1069) involves a mask
check with a variable mask, so it may be unsuitable for direct inlining in
the most important hot spots.
C algorithm with a computed mask
--------------------------------
Since this algorithm only runs outside the proper fastint "fast path" it
may be more sensible to avoid a memory tradeoff and compute the masks::
    /* 'd' holds the raw 64-bit bit pattern of the IEEE double. */
    int is_fastint(duk_int64_t d) {
        int exp = (d >> 52) & 0x07ff;
        int shift = exp - 1023;
        if (shift >= 0 && shift <= 46) {  /* exponents 1023 to 1069 */
            return ((0x000fffffffffffffULL >> shift) & (duk_uint64_t) d) == 0;
        } else if (shift == -1023) {  /* exponent 0 */
            /* return (d >= 0) && ((d & 0x000fffffffffffffULL) == 0); */
            return (d == 0);
        } else if (shift == 47) {  /* exponent 1070 */
            return (d < 0) && ((d & 0x000fffffffffffffULL) == 0);
        } else {
            return 0;
        }
    }
C algorithm with a computed mask, unsigned
------------------------------------------
Using an unsigned 64-bit integer for the input::
    /* 'd' holds the raw 64-bit bit pattern of the IEEE double. */
    int is_fastint(duk_uint64_t d) {
        int exp = (d >> 52) & 0x07ff;
        int shift = exp - 1023;
        if (shift >= 0 && shift <= 46) {  /* exponents 1023 to 1069 */
            return ((0x000fffffffffffffULL >> shift) & d) == 0;
        } else if (shift == -1023) {  /* exponent 0 */
            /* return ((d & 0x800fffffffffffffULL) == 0); */
            return (d == 0);
        } else if (shift == 47) {  /* exponent 1070 */
            return ((d & 0x800fffffffffffffULL) == 0x8000000000000000ULL);
        } else {
            return 0;
        }
    }
C algorithm with 32-bit operations and a computed mask
------------------------------------------------------
For middle endian machines (ARM) this algorithm first needs swapping
of the 32-bit parts. By changing the mask checks to operate on 32-bit
parts the algorithm would work on more platforms and would also remove
the need for swapping the parts on middle endian platforms::
    int is_fastint(duk_uint32_t hi, duk_uint32_t lo) {
        int exp = (hi >> 20) & 0x07ff;
        int shift = exp - 1023;
        if (shift >= 0 && shift <= 46) {  /* exponents 1023 to 1069 */
            if (shift <= 20) {
                /* 0x000fffff'ffffffff -> 0x00000000'ffffffff */
                return (((0x000fffffUL >> shift) & hi) == 0) && (lo == 0);
            } else {
                /* 0x00000000'7fffffff -> 0x00000000'0000003f */
                return (((0xffffffffUL >> (shift - 20)) & lo) == 0);
            }
        } else if (shift == -1023) {  /* exponent 0 */
            /* return ((hi & 0x800fffffUL) == 0x00000000UL) && (lo == 0); */
            return (hi == 0) && (lo == 0);
        } else if (shift == 47) {  /* exponent 1070 */
            return ((hi & 0x800fffffUL) == 0x80000000UL) && (lo == 0);
        } else {
            return 0;
        }
    }
Performance notes
-----------------
Coercing a double to an int64_t seems to be very slow on some platforms, so it
may be faster to get the fastint out of the IEEE double value with custom C
code. The code doesn't need to handle denormals, NaNs, etc, so it can be much
simpler than a full coercion routine.
There's a standard trick which is based on adding a double constant that
forces the mantissa to be shifted so that the integer value can be directly
extracted. See e.g.:
* http://stackoverflow.com/questions/17035464/a-fast-method-to-round-a-double-to-a-32-bit-int-explained
A similar trick is used in the number-to-double upgrade, see below.
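The constant-adding trick can be sketched for the 48-bit case as follows. This is an adaptation of the general technique and not necessarily what Duktape does; ``whole_double_to_int48()`` is a hypothetical name, and the code assumes a 64-bit IEEE double, the default round-to-nearest mode, and an input already known to be a whole number in the signed 48-bit range:

```c
#include <stdint.h>
#include <string.h>

/* Precondition: 'd' is a whole number within [-2^47, 2^47 - 1]. */
static int64_t whole_double_to_int48(double d) {
    uint64_t bits;
    /* Adding 2^52 + 2^51 forces the exponent to 1075, so the integer
     * value ends up right-aligned in the mantissa field.
     */
    double t = d + 6755399441055744.0;

    memcpy(&bits, &t, sizeof(bits));
    bits &= 0x0000ffffffffffffULL;       /* low 48 bits: d modulo 2^48 */
    if (bits & 0x0000800000000000ULL) {
        bits |= 0xffff000000000000ULL;   /* sign extend from bit 47 */
    }
    return (int64_t) bits;               /* two's complement assumed */
}
```

Because the input is a whole number well below 2^51 in magnitude, the addition is exact and no rounding takes place.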
Efficient check for number-to-double upgrade
============================================
Slow path code often needs to handle a number which may be either a fastint or
a double. The code needs to read the value efficiently as a double. To
minimize the slow path penalty, this check and conversion from a fastint to
a double (if necessary) needs to be fast.
The algorithm has two parts: (1) detecting that the value is a fastint, and
(2) converting a fastint into a double if necessary.
Checking for a fastint
----------------------
Checking for a fastint is easy:
* For packed duk_tval: if 16 highest bits are 0xfff1 (DUK_TAG_FASTINT) the
value is a fastint.
* For unpacked duk_tval: compare tag value similarly.
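For the packed case the check is a single shift and compare. The sketch below assumes the packed value is available as one 64-bit integer with the tag in the 16 highest bits, and uses the ``0xfff1`` tag value quoted above; ``packed_make_fastint()`` and ``packed_is_fastint()`` are hypothetical helpers, not Duktape macros:

```c
#include <stdint.h>

#define TAG_FASTINT 0xfff1U  /* packed fastint tag value quoted above */

/* Build a packed value from a fastint; 'i' must be in signed 48-bit range. */
static uint64_t packed_make_fastint(int64_t i) {
    return ((uint64_t) TAG_FASTINT << 48) |
           ((uint64_t) i & 0x0000ffffffffffffULL);
}

/* A value is a fastint if its 16 highest bits equal the fastint tag. */
static int packed_is_fastint(uint64_t v) {
    return (v >> 48) == TAG_FASTINT;
}
```

Ordinary doubles such as 1.0 (bit pattern ``0x3ff0000000000000``) never have ``0xfff1`` in their top 16 bits, so they fail the check.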
Trivial fastint-to-double conversion
------------------------------------
Converting a fastint into a double could be done by:
1. Sign extending the 48-bit value into a signed 64-bit value; the sign
extension can be achieved by two shifts.
2. Coercing the 64-bit value to a double.
Example::
duk_int64_t tmp = du.ull[DUK_DBL_IDX_ULL0];
tmp = (tmp << 16) >> 16; /* sign extend */
return (duk_double_t) tmp;
Unfortunately this is very slow, at least on some of the soft float platforms
on which it was tested.
Alternate fastint-to-double conversion
--------------------------------------
Because the input number range is 48-bit signed (and zero) the conversion can
be optimized a great deal. Let's first consider a positive value [1,2^47-1]:
* Construct an IEEE double with:
- Sign = 0
- Exponent field = 1023 + 52 = 1075
- Mantissa = the fastint value aligned to the right of the 52-bit field,
i.e. padded with zero bits on the left
* Because of the implicit leading 1-bit, the value represented is 2^52 +
fastint_value. Floating point subtract 2^52 to yield the final result.
The C code for this could be something like::
/* For fastint value [1,2^47-1]. */
du.ull[DUK_DBL_IDX_ULL0] = (duk_uint64_t) fastint_value |
(duk_uint64_t) 0x4330000000000000ULL;
du.d = du.d - 4503599627370496.0; /* 1<<52 */
return du.d;
Negative values need similar handling but the double sign bit needs to be set.
It's good to avoid sign extending the 48-bit value::
/* For fastint value [-2^47,-1]. */
du.ull[DUK_DBL_IDX_ULL0] = ((duk_uint64_t) (-fastint_value) &
(duk_uint64_t) 0x000fffffffffffffULL) |
(duk_uint64_t) 0xc330000000000000ULL;
du.d = du.d + 4503599627370496.0; /* 1<<52 */
return du.d;
Zero fastint is simply represented as an IEEE double with all bits zero, which
unfortunately needs a separate case.
In the concrete implementation fastint_value might still include the fastint
duk_tval tag, in which case the tag must be masked out in the positive number
case as well.
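Putting the positive, negative, and zero cases together gives a sketch like the following. It uses ``memcpy()`` instead of the ``du`` union from the fragments above, takes an already sign-extended ``int64_t`` as input, and assumes a 64-bit IEEE double; ``fastint_to_double()`` is a hypothetical name:

```c
#include <stdint.h>
#include <string.h>

/* Precondition: 'fi' is within the signed 48-bit range. */
static double fastint_to_double(int64_t fi) {
    uint64_t bits;
    double d;

    if (fi > 0) {
        /* Sign 0, exponent 1075: represents 2^52 + fi. */
        bits = (uint64_t) fi | 0x4330000000000000ULL;
        memcpy(&d, &bits, sizeof(d));
        return d - 4503599627370496.0;  /* subtract 2^52 */
    } else if (fi < 0) {
        /* Sign 1, exponent 1075: represents -(2^52 + |fi|). */
        bits = ((uint64_t) (-fi) & 0x000fffffffffffffULL) |
               0xc330000000000000ULL;
        memcpy(&d, &bits, sizeof(d));
        return d + 4503599627370496.0;  /* add 2^52 */
    }
    return 0.0;  /* zero is handled as its own case */
}
```

Both arithmetic steps are exact because the intermediate values lie in a range where doubles have a spacing of exactly 1.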
Future work
===========
Fastint on platforms with no 64-bit integer type
------------------------------------------------
Currently fastint support can only be used if the platform/compiler has
support for a 64-bit integer type. This limitation could be removed by
implementing alternative fastint fast paths which only relied on 32-bit
arithmetic.
32-bit fastint
--------------
It might be worth investigating if a signed or unsigned 32-bit fastint
(instead of a signed 48-bit fastint) would be more useful. Fast path
arithmetic would certainly be faster.
The downside would be that some bit operations won't be possible: to
fully support all bit operations both signed and unsigned 32-bit values
are needed.
Optimize upgrade and downgrade
------------------------------
These operations are very important for performance so perhaps inline
assembler optimization would be useful for specific platforms, e.g. ARM.
The current C algorithms can also be optimized further.