Add internal helpers to deal with undefined behavior cases explicitly.
When DUK_USE_ALLOW_UNDEFINED_BEHAVIOR is set, skip the (almost always
unnecessary) explicit checks to improve footprint and performance.
Also split duk_util_misc.c into a few better scoped files.
* Change duk_bool_to to duk_small_uint_t from duk_small_int_t. This may
cause some sign warnings in calling code.
* Reject attempt to unpack an array-like value whose length is 2G or over;
previously was not checked explicitly, and the length was cast to duk_idx_t
with a sign change and the unpack would then later fail. Now it fails with
a clean RangeError.
* Add wrap check for Node.js Buffer.concat().
* API DUK_TYPE_xxx, DUK_TYPE_MASK_xxx, flag constants etc are now unsigned.
Both duk_hthread and duk_context typedefs resolve to struct duk_hthread
internally. In external API duk_context resolves to struct duk_hthread
which is intentionally left undefined as the struct itself is not
dereferenced. Change internal code to use duk_hthread exclusively which
removes unnecessary and awkward thr <-> ctx casts from internals.
The basic guidelines are:
* Public API uses duk_context in prototype declarations. The intent is to
hide the internal type, and there's already a wide dependency on the
type name.
* All internal code, both declarations and definitions, use duk_hthread
exclusively. This is done even for API functions, i.e. an API function
declared as "void duk_foo(duk_context *ctx);" is then defined as
"void duk_foo(duk_hthread *thr);".
When the flag is set, there is either no subclass C struct for the
duk_hobject, or there is a subclass C struct but there are no references
needing DECREF/marking in the struct.
This allows DECREF and mark-and-sweep to handle duk_hobjects with less
overhead for the common cases of plain objects and arrays (and some other
less commonly collected structs like duk_hnatfunc).
Also change Duktape.Thread.prototype internal class from Thread to Object:
with the other changes internal code now assumes that if an object's class
is Thread, it has the duk_hthread memory layout which wouldn't be the case
for Duktape.Thread.prototype.
* Comment on duk_buffer_to_string() safety w.r.t. potentially pushing
Symbol values.
* Went through all duk_push_(l)string() call sites too.
* Minor footprint optimization for pushing empty strings (use more
compact internal helper).
* Strings with 0xFF byte prefix are considered special symbols: they have
typeof "symbol" but still mostly behave as strings (e.g. allow ToString)
so that existing code dealing with internal keys, especially inside
Duktape, can work with fewer changes.
* Strings with 0x80 byte prefix are global symbols, e.g. Symbol.for('foo')
creates the byte representatio: 0x80 "foo"
* Strings with 0x81 byte prefix are unique symbols; the 0x81 byte is followed
by the Symbol description, and an internal string component ensuring
uniqueness is separated by a 0xFF byte (which can never appear anywhere in
an extended UTF-8 string). The unique suffix is up to Duktape internals,
currently two 32-bit counters are used. For example:
0x81 "mySymbol" 0xFF "0-17".
* Well-known symbols use the 0x81 prefix but lack a unique suffix, so their
format is 0x81 <description> 0xFF.
* ES6 distinguishes between an undefined symbol description and an empty
string symbol description. This distinction is not currently visible via
Ecmascript bindings but may be visible in the future. Append an extra
0xFF to the unique suffix when the description is undefined, i.e.
0x81 0xFF <unique suffix> 0xFF.
Saves a few hundred bytes of footprint:
* duk_dup_0() = duk_dup(ctx, 0), duk_dup_1() = duk_dup(ctx, 1), etc.
* duk_dup_m2() = duk_dup(ctx, -2), etc.
* duk_dup_m1() is not added, because duk_dup_top() is the same thing
Change handling of plain buffers so that they behave like ArrayBuffer
instances to Ecmascript code, with limitations such as not being
extensible and all properties being virtualized. This simplifies
Ecmascript code as plain buffers are just lightweight ArrayBuffers
(similarly to how lightfuncs appear as function objects). There are
a lot of small changes in how the built-in objects and methods, and
the C API deals with plain buffer values.
Also make a few small changes to plain pointer and lightfunc handling
to improve consistency with how plain buffers are now handled.
Improve readability by doing the following renames:
* duk_hcompiledfunction -> duk_hcompfunc
* duk_hnativefunction -> duk_hnatfunc
* duk_hbufferobject -> duk_hbufobj
Corresponding renames for all caps defines.
* Add 'at foo' to indicate function name
* Put file/line into parentheses for better readability
* Use [anon] for anymous functions, which is better than an empty string but
distinguishable from a function actually named 'anon'
* Rename tailcalled flag to tailcall
* Switch from tab indent to 4 space indent
Emscripten depends on a Function's .toString() having a certain format
(beyond what is guaranteed by the E5 specification). In particular, it
won't parse Duktape's current format where the function body is replaced
with a comment indicating the function type, e.g.:
function foo() {/* ecmascript */}
Change this format to indicate the function type using a string literal
(similarly to how directive prologue works):
function foo() {"ecmascript"}
This is equally informational and will work directly with Emscripten.
Also change handling for anonymous functions: substituting "anon" for an
anonymous function in an Ecmascript function's .toString() output is a bit
misleading. It also causes an Emscripten issue when a source regexp expects
there to be no name for an anonymous function in the .toString() output.
Use the first traceback entry which actually has a .fileName to provide
virtual .fileName and .lineNumber. This works better for expressions
such as 'Duktape.enc(123)' which currently get 'undefined' as .fileName
(comes from the enc() function, which has no .fileName property).
Previous behavior was to ignore the setter call, so that attempting a write
would be a no-op: "err.fileName = 'dummy';" would do nothing. Because the
inherited setter is configurable, the user could still create an own property
of the same name using Object.defineProperty() or duk_def_prop() but this is
not very intuitive nor very well documented.
The updated behavior is to make the .fileName, .lineNumber, and .stack setter
make the duk_def_prop() call automatically so that writing to these properties
will seem to work as if the property was not an inherited accessor.
This matches how e.g. V8 and Spidermonkey work, and also matches how Duktape
works when tracebacks are disabled and .fileName and .lineNumber are concrete
own properties.
Add a C API binding for Object.defineProperty(): duk_def_prop().
In addition to Object.defineProperty() features, the API call provides a
"force" flag which allows properties to be added to non-extensible objects
and non-configurable properties to be changed (except virtual properties
which are immutable).
Because the name duk_def_prop() conflicts with internal calls, rename them
from duk_def_prop*() to duk_xdef_prop*(). This rename also makes it clearer
that the internal duk_xdef_prop*() calls have non-compliant, internal
semantics.
Also reimplement Object.defineProperty() and Object.defineProperties() (and
duk_def_prop()) so that they share the same internal helpers, and there is
no need for a temporary property descriptor object which is unnecessary churn.
Detailed changes:
- New helper to prepare (validate and normalize) property descriptors
- New helper to implement Object.defineProperty() internals, leaving
out validation of the property descriptor
- Reimplement Object.defineProperty() using the new helpers
- Reimplement Object.defineProperties() using the new helpers
- Reimplement duk_define_property() using the new helpers, so that a
temporary property descriptor object is no longer created
- Add support for "force" flag to Object.defineProperty()
* Fix outstanding FIXME issues for lightfunc semantics.
* Improve API and Ecmascript testcases to match.
* Clarify lightfunc limitations, e.g. finalizer limitations.
* Guide and API documentation changes for lightfuncs.
* Fix compile warning: duk_str_not_object unused.
A lot of changes to add preliminary lightfunc support:
* Add LIGHTFUNC tagged type to duk_tval.h and API.
* Internal changes for preliminary to support lightfuncs in call handling
and other operations (FIXMEs left in obvious places where support is
still missing after this commit)
* Preliminary Ecmascript and API testcases for lightfuncs
Detailed notes:
* Because magic is signed, reading it back involves sign extension which is
quite verbose to do in C. Use macros for reading the magic value and other
bit fields encoded in the flags.
* Function.prototype.bind(): the 'length' property of a bound function now
comes out wrong. We could simply look up the virtual 'length' property
even if h_target is NULL: no extra code and binding is relatively rare in
hot paths. Rewrite more cleanly in any case.
* The use flag DUK_USE_LIGHTFUNC_BUILTINS controls the forced lightfunc
conversion of built-ins. This results in non-compliant built-ins but
significant memory savings in very memory poor environments.
* Reject eval(), Thread.yield/resume as lightfuncs. These functions have
current assertions that they must be called as fully fledged functions.
* Lightfuncs are serialized like ordinary functions for JSON, JX, and JC
by this diff.
* Add 'magic' to activation for lightfuncs. It will be needed for lightweight
functions: we don't have the duk_tval related to the lightfunc, so we must
copy the magic value to the activation when a call is made.
* When lightfuncs are used as property lookup base values, continue property
lookup from the Function.prototype object. This is necessary to allow e.g.
``func.call()`` and ``func.apply()`` to be used.
* Call handling had to be reworked for lightfuncs, especially how bound
function chains are handled. This is a relatively large change but is
necessary to support lightweight functions properly in bound function
resolution.
The current solution is not ideal. The bytecode executor will first try an
ecma-to-ecma call setup which resolves the bound function chain first. If
the final, unbound function is not viable (a native function) the call setup
returns with an error code. The caller will then perform a normal call.
Although bound function resolution has already been done, the normal call
handling code will re-do it (and detect there is nothing to do).
This situation could be avoided by decoupling bound function handling and
effective this binding computation from the actual call setup. The caller
could then to do this prestep first, and only then decide whether to use an
ecma-to-ecma call or an ordinary heavyweight call.
Remove duk__find_nonbound_function as unused.
* Use indirect magic to allow LIGHTFUNCs for Date. Most of the built-in
functions not directly eligible as lightfuncs are the Date built-in methods,
whose magic values contain too much information to fit into the 8-bit magic
of a LIGHTFUNC value.
To work around this, add an array (duk__date_magics[]) containing the
actual control flags needed by the built-ins, and make the Date built-in
magic value an index into this table. With this change Date built-ins are
successfully converted to lightfuncs.
Testcase fixes:
- Whitespace fixes
- Print error for indirect eval error to make diagnosis easier
- Fix error string to match errmsg updated in this branch
The tracedata contains function references to functions in the current
call stack. If these functions are not accessible from a sandboxed
environment (through the global object), sandboxed code may gain access
to functions it's not intended to have access to.
Lots of minor string format cleanups (casts mainly).
Add a special check to detect NULL pointers with %s/%p formats in debug
logging to avoid calling the platform sprintf() with a NULL value. This
is especially important for %s + NULL, which is not portable and may crash
on some platforms. Debug logging will now format these specially as "NULL"
and avoid calling the platform for formatting, so debug log call sites don't
need to be careful with NULL pointers any more.