* Finalize debug log support for pointer compressed builds. Pointer
compressed builds can now debug log.
* Fix refcount leak in Uint8Array 'get'.
* Fix assertion failure (and underlying bug) for chained Proxy.
* Fix build failure when using assertions and no refcounts.
* Add stack top entry/exit assertions to duk_prop_get.c and
duk_prop_getown.c.
* Remove lazy charlen support. Since we need to WTF-8 sanitize the entire
input string, charlen can be computed while validating (avoiding extra
book-keeping for ASCII eventually).
* Improve WTF-8 search forwards/backwards performance (no substring operations)
when the search string is valid UTF-8. Use reference implementation for
non-UTF-8 still, to be optimized later.
* Minor testcase improvements.
Switch to using WTF-8 for duk_hstring string representation. The main
differences to previous extended CESU-8/UTF-8 are: (1) valid surrogate
pairs are automatically combined to UTF-8 on string intern while invalid
surrogate characters are encoded in CESU-8, and (2) ECMAScript code always
sees surrogate pairs for non-BMP characters.
Together, these make it more natural to work with non-BMP strings for both
ECMAScript (which no longer sees extended codepoints as before) and native
code (which now sees valid UTF-8 for non-BMP whenever possible).
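As an illustration of point (1) above, the following is a minimal sketch of how a valid surrogate pair can be combined into one non-BMP codepoint and written out as 4-byte UTF-8. The helper names combine_surrogates() and encode_utf8_4() are hypothetical and are not Duktape internals; the arithmetic is the standard UTF-16 and UTF-8 definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Duktape internal): combine a valid surrogate
 * pair (hi in U+D800..U+DBFF, lo in U+DC00..U+DFFF) into one codepoint.
 */
uint32_t combine_surrogates(uint32_t hi, uint32_t lo) {
    return 0x10000UL + ((hi - 0xD800UL) << 10) + (lo - 0xDC00UL);
}

/* Hypothetical helper: encode a codepoint in U+10000..U+10FFFF as 4-byte
 * UTF-8.  Returns the number of bytes written (always 4).
 */
size_t encode_utf8_4(uint32_t cp, uint8_t out[4]) {
    out[0] = (uint8_t) (0xF0 | (cp >> 18));
    out[1] = (uint8_t) (0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t) (0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t) (0x80 | (cp & 0x3F));
    return 4;
}
```

An unpaired surrogate would not go through this path; per the description above it stays encoded as a 3-byte CESU-8 sequence.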
Internally the main change is in string interning which now always sanitizes
input strings (but not Symbols) to WTF-8. Also, all call sites where the byte
representation of strings is dealt with need fixing. WTF-8 leads to some
challenges because it's no longer possible to e.g. find a substring with a
naive byte compare: surrogate characters may either appear directly (CESU-8)
or baked into a non-BMP UTF-8 byte sequence.
The main places where this needs complex handling include:
* charCodeAt / codePointAt
* Extracting a substring
* String .replace()
* String .startsWith() and .endsWith()
* String .split() and search functions (like .indexOf())
* RegExp matching
* String cache behavior
This commit fixes all the necessary sites with minimal baseline implementations
which are in some cases much slower than the previous CESU-8 ones. Further work
is needed to optimize the WTF-8 variants to perform close to CESU-8.
* Rework duk_heaphdr and subclass assertions into functions to
reduce debug build size.
* Add explicit object validity assert passes to mark-and-sweep.
This allows detection of invalid internal structures especially
when used with GC torture.
* Rename assertion macros for consistency, e.g. from
DUK_ASSERT_HSTRING_VALID to DUK_HSTRING_ASSERT_VALID.
* Automatically pin C literals interned into heap strings. Or if the
literal maps to an already interned string, pin it too. Pinning is
implemented using a duk_hstring flag and a one-off refcount bump.
Mark-and-sweep avoids sweeping pinned strings based on the flag.
* Add a lookup cache for quickly mapping a C literal address (which is
assumed stable) into a duk_hstring pointer. Once a mapping has been
formed, it never needs to be invalidated because the duk_hstring is
always pinned if the cache is used. Only heap destruction will free
the pinned duk_hstrings.
* More internal call site conversion for literals.
* Wording trivia.
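The literal lookup cache described above could look roughly like the sketch below: a direct-mapped cache keyed by the literal's address. All names (litcache, litcache_lookup(), litcache_store()), the cache size, and the slot hash are assumptions for illustration; a void pointer stands in for the pinned duk_hstring pointer.

```c
#include <stddef.h>
#include <stdint.h>

#define LITCACHE_SIZE 256  /* assumed size; must be a power of two */

typedef struct {
    const char *addr;  /* address of the C literal (assumed stable) */
    void *str;         /* stand-in for a pinned duk_hstring pointer */
} litcache_entry;

typedef struct {
    litcache_entry entries[LITCACHE_SIZE];
} litcache;

/* Map a literal address to a cache slot (simple illustrative hash). */
size_t litcache_slot(const char *addr) {
    uintptr_t key = (uintptr_t) addr;
    return (size_t) ((key ^ (key >> 8)) & (LITCACHE_SIZE - 1));
}

/* Return the cached string for this literal address, or NULL on a miss. */
void *litcache_lookup(const litcache *c, const char *addr) {
    const litcache_entry *e = &c->entries[litcache_slot(addr)];
    return (e->addr == addr) ? e->str : NULL;
}

/* Record a mapping.  Overwriting an older entry is harmless: the string it
 * pointed to stays pinned, so no invalidation step is needed.
 */
void litcache_store(litcache *c, const char *addr, void *str) {
    litcache_entry *e = &c->entries[litcache_slot(addr)];
    e->addr = addr;
    e->str = str;
}
```

On a miss the caller would intern (and pin) the literal the slow way and then store the result, so later lookups for the same address hit the cache.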
Use a single pause flags field to implement Resume, StepInto, StepOver,
and StepOut. This opens up possibilities for more Resume options, like
explicit control over whether to pause on caught vs. uncaught error.
Change StepOver, StepInto, and StepOut behavior when current activation
has no line information. Previously the commands were silently ignored
in this state. The updated behavior is to ignore the line-based pause
trigger but obey the others, e.g. StepInto will pause on function entry,
function exit, and an error thrown past the current function.
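A minimal sketch of the single pause flags field follows. The flag and function names are hypothetical, not Duktape's internal names. Only the StepInto trigger set is taken from the text above; the trigger sets shown for StepOver and StepOut are assumptions.

```c
/* Hypothetical flag names for illustration only. */
#define PAUSE_LINE_CHANGE  (1U << 0)  /* pause when the executing line changes */
#define PAUSE_FUNC_ENTRY   (1U << 1)  /* pause on function entry */
#define PAUSE_FUNC_EXIT    (1U << 2)  /* pause on function exit */
#define PAUSE_ERROR_PAST   (1U << 3)  /* pause on error thrown past current function */

typedef enum { CMD_RESUME, CMD_STEP_INTO, CMD_STEP_OVER, CMD_STEP_OUT } debug_cmd;

/* Compute the pause flags for a debugger command. */
unsigned int pause_flags_for(debug_cmd cmd, int has_line_info) {
    unsigned int flags = 0;

    switch (cmd) {
    case CMD_RESUME:
        break;  /* no pause triggers */
    case CMD_STEP_INTO:
        flags = PAUSE_LINE_CHANGE | PAUSE_FUNC_ENTRY |
                PAUSE_FUNC_EXIT | PAUSE_ERROR_PAST;
        break;
    case CMD_STEP_OVER:  /* assumed trigger set */
        flags = PAUSE_LINE_CHANGE | PAUSE_FUNC_EXIT | PAUSE_ERROR_PAST;
        break;
    case CMD_STEP_OUT:   /* assumed trigger set */
        flags = PAUSE_FUNC_EXIT | PAUSE_ERROR_PAST;
        break;
    }

    /* Without line information the line-based trigger is ignored but the
     * other triggers are still obeyed.
     */
    if (!has_line_info) {
        flags &= ~PAUSE_LINE_CHANGE;
    }
    return flags;
}
```

Because every command reduces to a set of bits, adding new Resume options (such as pausing on caught vs. uncaught errors) would only need new flag bits rather than new stepping state.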
* Change duk_bool_t to duk_small_uint_t from duk_small_int_t. This may
cause some sign warnings in calling code.
* Reject an attempt to unpack an array-like value whose length is 2G or over.
Previously this was not checked explicitly: the length was cast to duk_idx_t
with a sign change and the unpack would fail later. Now it fails with a
clean RangeError.
* Add wrap check for Node.js Buffer.concat().
* API DUK_TYPE_xxx, DUK_TYPE_MASK_xxx, flag constants etc are now unsigned.
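The unpack length check above can be sketched as follows. This assumes a 32-bit signed index type (idx_t here is a stand-in for duk_idx_t) and a hypothetical helper name; in Duktape the failure path would throw a RangeError rather than return 0.

```c
#include <stdint.h>

typedef int32_t idx_t;  /* stand-in for duk_idx_t, assumed 32-bit signed */

/* Hypothetical helper: validate an array-like length before casting it to
 * a signed index.  Returns 1 and stores the converted length on success,
 * 0 if the length is 2G or over (the caller would throw a RangeError).
 */
int checked_unpack_length(uint32_t len, idx_t *out) {
    if (len >= 0x80000000UL) {
        return 0;  /* the cast below would flip the sign */
    }
    *out = (idx_t) len;
    return 1;
}
```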
Remove thr->callstack as a monolithic array and replace it with a linked list
of duk_activations. thr->callstack_curr is the current call (or NULL if no
call is in progress), and act->parent chains to a previous call or NULL.
thr->callstack_top is kept because it's needed by some internals at present;
it may be removed in the future.
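A minimal sketch of the linked list layout described above. The struct and function names are hypothetical stand-ins; only the field names callstack_curr, callstack_top, and parent follow the text.

```c
#include <stddef.h>

typedef struct act_sketch act_sketch;
struct act_sketch {
    act_sketch *parent;  /* previous call, or NULL for the outermost call */
};

typedef struct {
    act_sketch *callstack_curr;  /* current call, or NULL if none in progress */
    size_t callstack_top;        /* number of activations, kept for internals */
} thr_sketch;

/* Enter a call: link the new activation in front of the current one. */
void callstack_push(thr_sketch *thr, act_sketch *act) {
    act->parent = thr->callstack_curr;
    thr->callstack_curr = act;
    thr->callstack_top++;
}

/* Leave a call: unlink and return the current activation, or return NULL
 * if no call is in progress.
 */
act_sketch *callstack_pop(thr_sketch *thr) {
    act_sketch *act = thr->callstack_curr;
    if (act == NULL) {
        return NULL;
    }
    thr->callstack_curr = act->parent;
    thr->callstack_top--;
    return act;
}
```

Compared to a monolithic array, pushing a call never moves existing activations, so pointers to them remain valid while the call is active.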
Tweak mark-and-sweep so that if finalizers are present (heap->finalize_list
is not NULL), rescue decisions are postponed (free decisions are not).
In concrete terms this means that objects normally rescued keep their
FINALIZED flag so that their finalizer won't be called again if the object
turns out to be unreachable in a later run.
* Replace the two alternative algorithms with a single one which works for
both desktop and low memory cases.
* Basic algorithm is a hash table with size 2^N, hash mask is simply
(size - 1), e.g. if size is 0x100, mask is 0xFF. duk_hstring has a 'next'
pointer (single linked list) for chaining strings mapping to the same
slot.
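The basic algorithm above can be sketched as below. The type and function names are hypothetical, and a plain struct with a C string stands in for duk_hstring; resizing is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct strent strent;
struct strent {
    strent *next;      /* single linked list of strings in the same slot */
    uint32_t hash;
    const char *data;  /* stand-in for the string bytes */
};

typedef struct {
    strent **slots;
    uint32_t size;     /* always 2^N */
} strtable;

/* Slot index is the hash masked with (size - 1), e.g. 0xFF for size 0x100. */
uint32_t strtable_slot(const strtable *t, uint32_t hash) {
    return hash & (t->size - 1);
}

/* Insert at the head of the slot's chain. */
void strtable_insert(strtable *t, strent *e) {
    uint32_t i = strtable_slot(t, e->hash);
    e->next = t->slots[i];
    t->slots[i] = e;
}

/* Walk the chain for the slot, comparing hash first and then contents. */
strent *strtable_find(const strtable *t, uint32_t hash, const char *data) {
    strent *e;
    for (e = t->slots[strtable_slot(t, hash)]; e != NULL; e = e->next) {
        if (e->hash == hash && strcmp(e->data, data) == 0) {
            return e;
        }
    }
    return NULL;
}
```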
* Strings with 0xFF byte prefix are considered special symbols: they have
typeof "symbol" but still mostly behave as strings (e.g. allow ToString)
so that existing code dealing with internal keys, especially inside
Duktape, can work with fewer changes.
* Strings with 0x80 byte prefix are global symbols, e.g. Symbol.for('foo')
creates the byte representation: 0x80 "foo"
* Strings with 0x81 byte prefix are unique symbols; the 0x81 byte is followed
by the Symbol description, and an internal string component ensuring
uniqueness is separated by a 0xFF byte (which can never appear anywhere in
an extended UTF-8 string). The unique suffix is up to Duktape internals,
currently two 32-bit counters are used. For example:
0x81 "mySymbol" 0xFF "0-17".
* Well-known symbols use the 0x81 prefix but lack a unique suffix, so their
format is 0x81 <description> 0xFF.
* ES6 distinguishes between an undefined symbol description and an empty
string symbol description. This distinction is not currently visible via
ECMAScript bindings but may be visible in the future. Append an extra
0xFF to the unique suffix when the description is undefined, i.e.
0x81 0xFF <unique suffix> 0xFF.
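The byte formats above can be illustrated with a small sketch that classifies a string by its prefix byte and builds the unique symbol form 0x81 <description> 0xFF <unique suffix>. The enum and function names are hypothetical, and the caller supplies the suffix; how Duktape generates the suffix internally is not modeled here.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    SYM_NONE,     /* ordinary string */
    SYM_SPECIAL,  /* 0xFF prefix */
    SYM_GLOBAL,   /* 0x80 prefix, e.g. Symbol.for('foo') */
    SYM_LOCAL     /* 0x81 prefix: unique or well-known symbol */
} sym_kind;

/* Classify a byte string by its first byte. */
sym_kind classify_symbol(const unsigned char *data, size_t len) {
    if (len == 0) {
        return SYM_NONE;
    }
    switch (data[0]) {
    case 0xFF: return SYM_SPECIAL;
    case 0x80: return SYM_GLOBAL;
    case 0x81: return SYM_LOCAL;
    default:   return SYM_NONE;
    }
}

/* Build 0x81 <desc> 0xFF <suffix> into 'out'.  Returns the number of bytes
 * written, or 0 if the buffer is too small.
 */
size_t build_unique_symbol(unsigned char *out, size_t out_size,
                           const char *desc, const char *suffix) {
    size_t dlen = strlen(desc);
    size_t slen = strlen(suffix);
    size_t total = 1 + dlen + 1 + slen;

    if (total > out_size) {
        return 0;
    }
    out[0] = 0x81;
    memcpy(out + 1, desc, dlen);
    out[1 + dlen] = 0xFF;
    memcpy(out + 2 + dlen, suffix, slen);
    return total;
}
```

Passing an empty suffix yields the well-known symbol form 0x81 <description> 0xFF.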
Instead of having a separate panic concept which previously differed from
fatal error handling in that there was no context attached to the error,
use fatal errors also for call sites which previously used the panic handler.
Because these call sites are context-free (DUK_ASSERT() failures) simply call
the Duktape-wide default fatal error handler instead of the user fatal error
handler. For heap creation errors (self test failures) the udata is available;
for assertion it isn't and NULL is used instead.
Add a config option to replace the Duktape-wide fatal error handler; the
default one just segfaults on purpose, to avoid creating portability issues
by depending on e.g. abort().
Remove the error code from the fatal error function signature (it's mostly
pointless) and change the "ctx" argument to "udata" (heap userdata) which is
less confusing than an arbitrary context related to the heap (especially
because it's unsafe to actually use the "ctx" to e.g. call into the Duktape
API).
The fatal error signature change also affects the duk_fatal() API call, which
loses the error code argument.
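As a rough sketch of what a handler with the new (udata, msg) signature might look like: the typedef fatal_function below is a local stand-in mirroring the signature described above, and app_state, format_fatal(), and app_fatal() are hypothetical application-side names. Using abort() is an application choice here, not something Duktape does.

```c
#include <stdio.h>
#include <stdlib.h>

/* Local stand-in for the fatal error function type described above:
 * no error code, and heap userdata instead of a context pointer.
 */
typedef void (*fatal_function)(void *udata, const char *msg);

/* Hypothetical application state passed as heap userdata. */
typedef struct {
    const char *app_name;
} app_state;

/* Format a log line; udata is NULL for assertion failures, so handle it. */
int format_fatal(char *buf, size_t size, void *udata, const char *msg) {
    const app_state *st = (const app_state *) udata;
    return snprintf(buf, size, "[%s] fatal: %s",
                    (st != NULL) ? st->app_name : "?",
                    (msg != NULL) ? msg : "no message");
}

/* Fatal error handler: log and terminate.  It must not return to the
 * caller and does not call into the Duktape API.
 */
void app_fatal(void *udata, const char *msg) {
    char buf[256];
    format_fatal(buf, sizeof(buf), udata, msg);
    fprintf(stderr, "%s\n", buf);
    abort();
}
```

The handler only touches its own userdata, which matches the motivation above: there is no context argument that could tempt the handler into calling the API in an unsafe state.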
When Duktape receives an AppRequest, it pushes all the dvalues in the
message to the value stack and calls the request callback. That callback
can process the message and optionally push its own values to the stack,
which will be sent back to the client as a reply.
It is also possible for the target to send application-specific notifies
to the client by calling duk_debugger_notify(). These will be received
by the client as an AppNotify message.
* Define duk_internal_exception, a plain exception class, which is used as
the value thrown for Duktape internal long control transfers. The value
intentionally does not inherit from std::exception so that it'd be as
unlikely as possible that user code would catch the internal exception
type; only a "catch (...)" would catch it.
* Replace DUK_SETJMP() if clauses with "try", and SETJMP() error handling
blocks with "catch (duk_internal_exception &exc)" blocks.
* Also add clauses for "catch (std::exception &exc)" and "catch (...)" to
catch C++ exceptions thrown by user code which are propagated to Duktape
try-catch blocks. Such exceptions are converted to API errors. For now
it's not supported for user code to propagate a C++ exception "through"
Duktape, as that would require some semantics changes to (native) protected
calls. Catching and converting such exceptions to API errors makes the
user code error apparent and easier to fix.
- Set FINALIZED only when actually running finalizer to prevent
a finalizer running twice unless explicitly rescued (and the
flag cleared by the rescuing mark-and-sweep or refzero code).
- Add a guard to avoid re-finalizing until FINALIZED is explicitly
cleared on rescue by either mark-and-sweep or refcounting.
- Prevent mark-and-sweep and refzero from running when heap is
destroyed and finalizers are forcibly executed.
- Add a mark-and-sweep flag to skip finalizers: move any finalizable
objects back to heap_allocated. This is needed for correct
finalizer handling in heap destruction.
- Add second finalizer argument; if true, it indicates that the
heap is being destroyed and rescuing an object is not possible.
Finalizer should therefore free all native resources without
relying on the finalizer to be called again.
- Add multiple finalizer rounds for heap destruction to deal with
finalizers which create further finalizable objects. Also add
a sanity limit for this process to catch runaway finalizers.
- Explicit Proxy check just before running finalizer: don't run
finalizer for Proxy objects even when call site is buggy.
Src fixes
When the debugger is detached, Duktape will send out a notify,
DUK_DBG_CMD_DETACHING, before dropping the transport. The debug client
can look for this message to differentiate between an intentional detach
and a dropped connection.
GetLocals, GetVar, PutVar, and Eval will now accept an optional negative
callstack offset specifying the function activation to operate on. This
offset has the same semantics as the argument of Duktape.act(): -1 is
the topmost activation, -2 is its caller, etc.
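The negative offset semantics above can be sketched as a walk over a parent-linked chain of activations. The struct and function names are hypothetical, and returning NULL for an out-of-range or non-negative offset is an assumption about how an invalid offset would be reported.

```c
#include <stddef.h>

typedef struct activation_sketch activation_sketch;
struct activation_sketch {
    activation_sketch *parent;  /* caller's activation, or NULL */
    const char *func_name;      /* illustrative payload */
};

/* Resolve a negative callstack offset: -1 is the topmost activation, -2 is
 * its caller, and so on.  Returns NULL if the offset is not negative or
 * points past the bottom of the callstack.
 */
activation_sketch *activation_at_level(activation_sketch *curr, int level) {
    activation_sketch *act = curr;

    if (level >= 0) {
        return NULL;
    }
    while (act != NULL && level < -1) {
        act = act->parent;
        level++;
    }
    return act;
}
```

A command such as GetLocals would then operate on the returned activation instead of always using the topmost one.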