

								=========

								Execution

								=========


								Overview

								========


								This document describes how Duktape manages its execution state.  Some details

								are omitted but the goal is to give an overall picture of how execution proceeds,

								what state is involved, and which internal functions are the most important.


								The discussion is limited to a single Duktape heap as each Duktape heap is

								independent of other Duktape heaps.  At any time, only one native thread may

								be actively calling into a specific Duktape heap.


								Execution states and state overview

								===================================


								There are three conceptual execution states for a Duktape heap:


								* Idle


								* Executing a Duktape/C function


								* Executing an Ecmascript function


								This conceptual model ignores details like heap initialization and

								transitions from one state to another by "call handling".


								Execution state is contained mostly in three stacks:


								* Call stack: used to track function calls


								* Catch stack: used to track try-catch-finally and other catchpoints specific

								  to the bytecode executor


								* Value stack: contains the tagged values manipulated through the Duktape API

								  and in the bytecode executor


								In addition to these there are execution control variables in ``duk_hthread``

								and ``duk_heap``.


								Typical control flow

								====================


								Execution always begins from an idle state where no calls into Duktape are

								active and the user application has control.  User code may manipulate the value

								stacks of Duktape contexts in this state without making any calls.  User code

								may also call ``duk_debugger_cooperate()`` to integrate the debugger into the

								application event loop (or equivalent).


								Eventually user code makes a call into either a Duktape/C function or an

								Ecmascript function.  Such a call may be caused by an obvious API call like

								``duk_pcall()``.  It may also be caused by a less obvious API call such as

								``duk_get_prop()``, which may invoke a getter, or ``duk_to_string()`` which

								may invoke a ``toString()`` coercion method.


								The initial call into Duktape is usually handled using

								``duk_handle_call_(un)protected()`` which can handle a call from any state

								into any kind of target function.  Setting up a call involves a lot of state

								changes:


								* A setjmp catchpoint is needed for protected calls.


								* The call stack is resized if necessary, and an activation record

								  (``duk_activation``) is set up for the new call.


								* The value stack is resized if necessary, and a fresh value stack frame

								  is established for the call.  The calling value stack frame and the target

								  frame overlap for the call arguments, so that arguments on top of the

								  calling stack are directly visible on the bottom of the target stack.


								* An arguments object and an explicit environment record are created if

								  necessary.


								* Other small book-keeping (such as recursion depth tracking) is done.


								When a call returns, the state changes are reversed before returning to

								the caller.  If an error occurs during the call, a ``longjmp()`` will take

								place and will be caught by the current (innermost) setjmp catchpoint

								without tearing down the call state; the catchpoint will have to do that.
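
The catchpoint behavior described above can be illustrated with a small, self-contained C sketch.  It is a toy model, not Duktape's implementation: ``toy_thread``, ``toy_pcall()``, ``toy_throw()`` and the demo helpers are hypothetical names, and a single ``call_depth`` counter stands in for all of the call state.  The point is that the thrower does no cleanup, and the catchpoint restores the values it saved on entry.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not Duktape internals. */
typedef struct {
    jmp_buf *catchpoint;  /* innermost active setjmp catchpoint, or NULL */
    int call_depth;       /* stands in for call stack and recursion state */
    int error_code;       /* stands in for the thrown value */
} toy_thread;

typedef int (*toy_func)(toy_thread *thr, int arg);

/* Throw: jump to the innermost catchpoint without unwinding anything. */
static void toy_throw(toy_thread *thr, int code) {
    assert(thr->catchpoint != NULL);  /* uncaught: a real engine goes fatal */
    thr->error_code = code;
    longjmp(*thr->catchpoint, 1);
}

/* Protected call: returns 0 on success, 1 if an error was caught. */
static int toy_pcall(toy_thread *thr, toy_func func, int arg, int *result) {
    jmp_buf jb;
    jmp_buf *prev_catchpoint = thr->catchpoint;  /* entry values, saved */
    int entry_depth = thr->call_depth;           /* before setjmp()     */

    thr->catchpoint = &jb;
    if (setjmp(jb) == 0) {
        thr->call_depth++;                 /* call setup */
        *result = func(thr, arg);
        thr->call_depth--;                 /* normal return: reverse setup */
        thr->catchpoint = prev_catchpoint;
        return 0;
    }

    /* Error path: the thrower tore nothing down, so restore entry state. */
    thr->catchpoint = prev_catchpoint;
    thr->call_depth = entry_depth;
    *result = thr->error_code;
    return 1;
}

static int toy_double(toy_thread *thr, int arg) {
    (void) thr;
    return arg * 2;
}

static int toy_fail_deep(toy_thread *thr, int arg) {
    thr->call_depth += 3;  /* pretend three nested unprotected calls */
    toy_throw(thr, arg);
    return 0;              /* not reached */
}

/* Demo helpers: run one protected call on a fresh thread. */
static int toy_demo_result(toy_func func, int arg) {
    toy_thread thr = { NULL, 0, 0 };
    int result = 0;
    (void) toy_pcall(&thr, func, arg, &result);
    return result;
}

static int toy_demo_depth_after(toy_func func, int arg) {
    toy_thread thr = { NULL, 0, 0 };
    int result = 0;
    (void) toy_pcall(&thr, func, arg, &result);
    return thr.call_depth;
}
```

In this sketch the entry values are captured before ``setjmp()`` and never modified afterwards, which keeps them well defined after the ``longjmp()``.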


								If the target function is a Duktape/C function, the corresponding C function

								is looked up and called.  The C function now has access to a fresh value stack

								frame it can operate on using the Duktape API.  It can make further calls which

								get handled by ``duk_handle_call_(un)protected()``.


								If the target function is an Ecmascript function, the value stack is resized

								for the function register count (nregs) established by the compiler during

								function compilation; unlike Duktape/C functions the value stack is mostly

								static for the duration of bytecode execution.  Opcode handling may push

								temporaries on the value stack but they must always be popped off before

								proceeding to dispatch the next opcode.


								The bytecode executor has its own setjmp catchpoint.  If bytecode makes a

								call into a Duktape/C function it is handled normally using

								``duk_handle_call_(un)protected()``; such calls may also happen when the

								bytecode executor uses the value stack API for various coercions etc.


								If bytecode makes a function call into an Ecmascript function it is handled

								specially by ``duk_handle_ecma_call_setup()``.  This call handler sets up a

								new activation similarly to ``duk_handle_call_(un)protected()``, but instead

								of doing a recursive call into the bytecode executor it returns to the bytecode

								executor which restarts execution and starts executing the call target without

								increasing C stack depth.  The call handler also supports tail calls where an

								activation record is reused.
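
As a rough illustration of the idea, the following toy executor handles calls by pushing an activation and restarting the dispatch loop instead of recursing in C, and handles tail calls by reusing the topmost activation.  Everything here is an assumption made for the example (the opcodes, the single accumulator, the fixed-size call stack); it is not how ``duk_handle_ecma_call_setup()`` is actually written.

```c
#include <assert.h>

/* Toy model only: hypothetical opcodes and structures. */
enum { TOY_OP_INC, TOY_OP_CALL, TOY_OP_TAILCALL, TOY_OP_RET };

typedef struct { int op; int arg; } toy_instr;
typedef struct { const toy_instr *pc; } toy_activation;

#define TOY_MAX_DEPTH 16

/* Runs funcs[entry]; returns the accumulator and records peak call depth. */
static int toy_execute(const toy_instr **funcs, int entry, int *max_depth) {
    toy_activation callstack[TOY_MAX_DEPTH];
    int depth = 0;  /* number of live activations */
    int acc = 0;
    const toy_instr *pc;

    callstack[depth++].pc = funcs[entry];
    *max_depth = depth;

restart_execution:
    pc = callstack[depth - 1].pc;  /* pick up the topmost activation */
    for (;;) {
        toy_instr ins = *pc++;
        switch (ins.op) {
        case TOY_OP_INC:
            acc++;
            break;
        case TOY_OP_CALL:
            assert(depth < TOY_MAX_DEPTH);
            callstack[depth - 1].pc = pc;            /* save return point */
            callstack[depth++].pc = funcs[ins.arg];  /* new activation */
            if (depth > *max_depth) *max_depth = depth;
            goto restart_execution;                  /* no C recursion */
        case TOY_OP_TAILCALL:
            callstack[depth - 1].pc = funcs[ins.arg];  /* reuse activation */
            goto restart_execution;
        case TOY_OP_RET:
            if (--depth == 0) return acc;
            goto restart_execution;  /* continue in the caller */
        default:
            assert(0);
            return -1;
        }
    }
}

/* f0 calls f1, which tail calls f2; five increments happen in total. */
static int toy_run_demo(int *max_depth) {
    static const toy_instr f0[] = {
        { TOY_OP_INC, 0 }, { TOY_OP_CALL, 1 }, { TOY_OP_INC, 0 }, { TOY_OP_RET, 0 }
    };
    static const toy_instr f1[] = { { TOY_OP_INC, 0 }, { TOY_OP_TAILCALL, 2 } };
    static const toy_instr f2[] = {
        { TOY_OP_INC, 0 }, { TOY_OP_INC, 0 }, { TOY_OP_RET, 0 }
    };
    const toy_instr *funcs[3];
    funcs[0] = f0; funcs[1] = f1; funcs[2] = f2;
    return toy_execute(funcs, 0, max_depth);
}

static int toy_demo_result(void) {
    int max_depth = 0;
    return toy_run_demo(&max_depth);
}

static int toy_demo_max_depth(void) {
    int max_depth = 0;
    (void) toy_run_demo(&max_depth);
    return max_depth;
}
```

Three functions run here, but the peak depth stays at two activations because the tail call replaces ``f1`` with ``f2`` in place.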


								Both Duktape and user code may use ``duk_safe_call()`` to make protected

								calls inside the current activation (or outside of any activations in the

								idle state).  A safe call creates a new setjmp catchpoint but not a new

								activation, so safe calls are not actual function calls.


								Threading limitations

								=====================


								Only one native thread may call into a Duktape heap at any given time.

								See ``threading.rst`` for more details.


								Bytecode executor

								=================


								Basic functionality

								-------------------


								* Setjmp catchpoint which supports yield, resume, slow returns, try-catch, etc


								* Opcode dispatch loop, central for performance


								* Executor interrupt which facilitates script timeout and debugger integration


								* Debugger support; breakpoint handling, checked and normal execution modes


								Setjmp catchpoint

								-----------------


								The ``duk_handle_call_protected()`` and ``duk_safe_call()`` catchpoints are only

								used to handle ordinary error throws which propagate out of the calling function.

								The bytecode executor setjmp catchpoint handles a wider variety of longjmp call

								types, and in many cases the longjmp may be handled without exiting the current

								function:


								* A slow break/continue uses a longjmp() so that if the break/continue crosses

								  any finally clauses, they get executed as expected.  Similarly 'with' statement

								  lexical environments are torn down, etc.


								* A slow return uses a longjmp() so that any finally clauses, 'with' statement

								  lexical environments, etc are handled appropriately.


								* A coroutine resume is handled using longjmp(): the Duktape.Thread.resume()

								  call adjusts the thread states (including their activations) and then uses

								  this longjmp() type to restart execution in the target coroutine.


								* A coroutine yield is handled using longjmp(): the Duktape.Thread.yield()

								  call adjusts the states and uses this longjmp() type to restart execution

								  in the resumer, i.e. the coroutine that resumed the yielding one.


								* An ordinary throw is handled as in ``duk_handle_call_protected()`` with the

								  difference that there are both 'try' and 'finally' sites.


								Returns, coroutine yields, and throws may propagate out of the initial bytecode

								executor entry and outwards to whatever code called into the executor.


								Opcode dispatch loop and executor interrupt

								-------------------------------------------


								The opcode dispatch loop is a central performance critical part of the

								executor.  The dispatch loop:


								* Checks for an executor interrupt.  An interrupt can be taken for every

								  opcode or for every N instructions; the interrupt handler provides e.g.

								  script timeout and debugger integration.  This is performance critical

								  because the check occurs for every opcode dispatch.  See separate section

								  below on interrupt counter handling.


								* Fetches an instruction from the topmost activation's "current PC",

								  and increments the PC.  Managing the "current PC" is performance critical.

								  See separate section below on current PC handling.


								* Decodes and executes the opcode using a large switch-case.  The most

								  important opcodes are in the main opcode space (64 opcodes); more rarely

								  used opcodes are "extra" opcodes and need a double dispatch.


								* Usually loops back to execute further opcodes.  May also (1) call another

								  Duktape/C or Ecmascript function, (2) cause a longjmp, or (3) use

								  ``goto restart_execution`` to restart the executor e.g. after call stack

								  has been changed.
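
The steps listed above can be sketched as a toy dispatch loop.  The instruction encoding, the opcodes and the ``toy_thread`` fields below are invented for the example.  Only the overall shape (interrupt check, fetch and increment, switch-based decode) follows the description, and the real executor is considerably more involved.

```c
#include <assert.h>

/* Toy model only: hypothetical encoding, opcode in the low 8 bits. */
typedef unsigned int toy_instr;

enum { TOY_OP_LOADC, TOY_OP_ADD, TOY_OP_LOOP, TOY_OP_HALT };

#define TOY_INS(op, a) ((toy_instr) (op) | ((toy_instr) (a) << 8))

typedef struct {
    int interrupt_counter;   /* counts down to the next interrupt */
    int interrupt_interval;  /* "N": opcodes between interrupts */
    int interrupts_taken;
} toy_thread;

/* A real handler would check timeouts, debugger state, and so on. */
static void toy_interrupt(toy_thread *thr) {
    thr->interrupts_taken++;
    thr->interrupt_counter = thr->interrupt_interval;
}

static int toy_run(toy_thread *thr, const toy_instr *code) {
    const toy_instr *curr_pc = code;
    int acc = 0;
    int counter = 0;

    for (;;) {
        toy_instr ins;

        /* 1. Interrupt check, done on every dispatch. */
        if (--thr->interrupt_counter <= 0) toy_interrupt(thr);

        /* 2. Fetch the instruction and increment the PC. */
        ins = *curr_pc++;

        /* 3. Decode and execute. */
        switch (ins & 0xffU) {
        case TOY_OP_LOADC:
            counter = (int) (ins >> 8);
            break;
        case TOY_OP_ADD:
            acc += (int) (ins >> 8);
            break;
        case TOY_OP_LOOP:  /* jump back while the counter is nonzero */
            if (--counter > 0) curr_pc -= (ins >> 8);
            break;
        case TOY_OP_HALT:
            return acc;
        default:
            assert(0);
            return -1;
        }
        /* 4. Loop back for the next opcode. */
    }
}

/* Adds 2 five times: 12 dispatches in total, interrupt every 4th. */
static int toy_run_demo(toy_thread *thr) {
    static const toy_instr code[] = {
        TOY_INS(TOY_OP_LOADC, 5),
        TOY_INS(TOY_OP_ADD, 2),
        TOY_INS(TOY_OP_LOOP, 2),
        TOY_INS(TOY_OP_HALT, 0)
    };
    thr->interrupt_counter = 4;
    thr->interrupt_interval = 4;
    thr->interrupts_taken = 0;
    return toy_run(thr, code);
}

static int toy_demo_acc(void) {
    toy_thread thr;
    return toy_run_demo(&thr);
}

static int toy_demo_interrupts(void) {
    toy_thread thr;
    (void) toy_run_demo(&thr);
    return thr.interrupts_taken;
}
```

Keeping the countdown in the thread structure mirrors the ``thr->interrupt_counter`` approach mentioned later in this document; the reset policy used here is an assumption.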


								Debugger support

								----------------


								Debugger support relies on:


								* The executor interrupt mechanism described above.


								* A precheck in ``restart_execution`` where debugging status and breakpoints

								  are checked.  Executor then either proceeds in "normal" or "checked"

								  execution.  Checked execution means running one opcode at a time, and

								  calling into the interrupt handler before each to see e.g. if a breakpoint

								  has been triggered.


								* There's some additional support outside the executor, e.g. call stack

								  unwinding code handles the "step out" logic.


								See ``debugger.rst`` for details.


								Call processing: duk_handle_call_(un)protected()

								================================================


								Call setup

								----------


								When handling a call, ``duk_handle_call_(un)protected()`` is given

								``num_stack_args`` which indicates how many arguments have been pushed

								on the current stack for the call.  The stack frame of the calling

								activation looks as follows::


								      top - num_stack_args - 2

								           |

								           |          top - num_stack_args

								           |               |

								           v               v

								  +-----+------+--------+------+-----+------+

								  | ... | func | 'this' | arg0 | ... | argN | <- top

								  +-----+------+--------+------+-----+------+


								To prepare the stack frame for the called function,

								``duk_handle_call_(un)protected()`` does the following:


								* If ``func`` is a bound function, follows the bound function chain until

								  a non-bound function is found.  While following the chain, the requested

								  ``this`` binding may be updated by the bound function, and arguments may be

								  prepended at the ``arg0`` point.


								* Coerces the ``this`` binding as specified in E5.  The ``this`` in the calling

								  stack frame is the caller requested ``this`` binding.  For instance, for a

								  property-based call (e.g. ``obj.method()``) this is the base object.  The

								  effective ``this`` binding may be coerced (for non-strict target functions)

								  or replaced during bound function handling.


								* Resolves the difference between arguments requested (target function

								  ``nargs``) and provided (``num_stack_args``) by filling in missing arguments

								  with ``undefined`` or discarding extra arguments so that exactly ``nargs``

								  arguments are present.  (Special handling is needed for vararg functions

								  where ``nargs`` indicates that all ``num_stack_args`` arguments are used as is.)


								* Finalizes the value stack "top":


								  - For Duktape/C target functions the top is set to ``nargs`` (or

								    ``num_stack_args`` for vararg functions).


								  - For Ecmascript target functions the top is first set to ``nargs``, wiping

								    any values above that, and then extended to ``nregs``.  Values above

								    ``nargs`` are filled with ``undefined``.  At the end the value stack frame

								    has ``nregs`` allocated and initialized entries, with ``[0, nargs-1]``

								    mapping to call arguments.


								* Creates a new lexical scope object if necessary; this step is postponed

								  when possible and done lazily only when actually necessary.


								* Creates a new activation, and switches the valstack bottom to the first

								  argument.


								The value stack looks as follows after call setup is complete and the new

								function is ready to execute (the example is for an Ecmascript target

								function)::


								     (-1)     0      1          nargs-1                   nregs - 1

								  +--------+------+------+-----+------+-----------+-----+-----------+

								  | 'this' | arg0 | arg1 | ... | argM | undefined | ... | undefined | <- top

								  +--------+------+------+-----+------+-----------+-----+-----------+


								The effective ``this`` binding for the function is always stashed right below

								the active value stack frame.  This interacts well with the calling convention

								where the requested ``this`` binding can be coerced in-place nicely, and the

								``this`` binding can also be accessed quickly.
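
The argument and frame adjustment can be modeled with plain integers.  The sketch below is a simplification under stated assumptions: ``toy_valstack``, ``toy_setup_frame()``, ``TOY_UNDEFINED`` and ``TOY_VARARGS`` are hypothetical, bound functions and ``this`` coercion are left out, and the stack is a fixed array rather than a resizable one.

```c
#include <assert.h>

#define TOY_UNDEFINED  (-9999)  /* stand-in for the 'undefined' value */
#define TOY_VARARGS    (-1)     /* hypothetical "use arguments as is" marker */
#define TOY_STACK_SIZE 32

typedef struct {
    int values[TOY_STACK_SIZE];
    int bottom;  /* absolute index of the current frame's register 0 */
    int top;     /* absolute index one past the last active entry */
} toy_valstack;

/*
 * Expects [ ... func this arg0 ... argN ] at the stack top.  Returns the
 * size of the new frame.  nregs is only used for "Ecmascript" targets.
 */
static int toy_setup_frame(toy_valstack *vs, int num_stack_args,
                           int nargs, int nregs, int is_ecma) {
    int new_bottom = vs->top - num_stack_args;  /* arg0 becomes register 0 */
    int want = (nargs == TOY_VARARGS) ? num_stack_args : nargs;
    int keep = (num_stack_args < want) ? num_stack_args : want;
    int frame_size = is_ecma ? nregs : want;
    int wipe_end = new_bottom + frame_size;
    int i;

    assert(new_bottom >= 2);             /* func and 'this' sit below arg0 */
    assert(!is_ecma || nregs >= want);
    if (wipe_end < vs->top) wipe_end = vs->top;  /* extra args are wiped too */
    assert(wipe_end <= TOY_STACK_SIZE);  /* real code would resize here */

    /* Missing arguments, discarded extras, and spare registers all
     * end up as 'undefined'.
     */
    for (i = new_bottom + keep; i < wipe_end; i++) {
        vs->values[i] = TOY_UNDEFINED;
    }

    vs->bottom = new_bottom;
    vs->top = new_bottom + frame_size;
    return frame_size;
}

/*
 * Demo: the caller pushed func (100), 'this' (200) and arguments 1, 2, 3;
 * the target wants nargs=2 and nregs=4.  Returns register 'reg' of the
 * new frame, where -1 is the slot just below the frame bottom.
 */
static int toy_demo_reg(int reg) {
    toy_valstack vs = { { 100, 200, 1, 2, 3 }, 0, 5 };
    int frame_size = toy_setup_frame(&vs, 3, 2, 4, 1);
    assert(reg >= -1 && reg < frame_size);
    return vs.values[vs.bottom + reg];
}
```

In the demo the third argument is discarded, registers 2 and 3 read as undefined, and the ``this`` value remains reachable at index -1 relative to the new frame bottom.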


								When doing tail calls, no stacks (value stack, call stack, catch stack) may

								grow in size; otherwise the point of tail calls would be defeated.  This is

								ensured as follows:


								* The value stack is manipulated so that the callee's first argument (``arg0``)

								  will be placed in the current activation's index 0 (value stack bottom).

								  The effective ``this`` binding is overwritten just below the current

								  activation's value stack bottom.


								* The call stack does not grow by virtue of reusing the current activation.


								* The catch stack does not grow because the Ecmascript compiler never emits

								  a tailcall if there is a catch stack; tail calls are not possible if a

								  catch stack exists, because e.g. ``try`` and ``finally`` must be processable.

								  Hence, ``duk_handle_call_(un)protected()`` simply asserts for this condition.


								Call cleanup after a successful call

								------------------------------------


								The C return value of the called Duktape/C function indicates how many return

								values are on the value stack, with negative values indicating an error which

								is thrown by call handling (this is a shorthand for throwing errors).


								To clean up after a call:


								* The call stack and catch stack are unwound, and a best effort shrink check

								  is done.  If shrinking is attempted and it fails, the error is ignored.


								* The value stack is restored to the caller's configuration.  The return value

								  is moved into its expected position (same as ``func`` on the input stack).

								  Value stack top is configured so that the return value is at the stack top

								  (for Duktape/C callers) or so that the stack top is at ``nregs`` (for

								  Ecmascript callers).  A value stack shrink (or grow) check is done; shrink

								  errors should be ignored silently.


								* Other book-keeping variables are restored to their entry values, e.g.:

								  call recursion depth, bytecode executor instruction pointer, thread state,

								  current thread, etc.


								Call cleanup after a failed call

								--------------------------------


								When an error is thrown it is caught by the nearest ``setjmp`` catch point.

								If that catch point is in ``duk_handle_call_protected()`` the processing is

								quite similar to success handling except that multiple call stack and catch

								stack frames are potentially unwound:


								* Restore the previous ``setjmp`` catchpoint so that any errors thrown during

								  call cleanup are propagated outwards to avoid recursion into the same

								  handler.  Note, however, that the error handling code path should never

								  actually throw further errors -- doing so would break protected call

								  semantics.


								* The call stack and catch stack are unwound, and a best effort shrink check

								  is done.


								* The value stack is configured as for successful calls, except that the error

								  thrown is left on the value stack instead of a return value.


								* Other book-keeping variables are restored to their entry values.


								If there is no catcher for the error, the uncaught error causes the fatal error

								handler to be called.  None of the stacks are unwound, and since the entry

								values for various book-keeping variables are lost, there's no way to properly

								unwind the call state afterwards.  This is OK because fatal errors are not

								recoverable and there's no way to resume execution if a fatal error occurs.

								It should be possible to free the Duktape heap normally but this is of little

								use because it's not safe to continue execution after a fatal error in general.


								Managing heap->curr_thread

								--------------------------


								The current thread is managed in several places:


								* Call handling saves and restores ``heap->curr_thread`` whose previous value

								  may be different from the call thread when an initial call is made, i.e.

								  the previous value is ``NULL``.


								* Bytecode executor longjmp handler ultimately handles each coroutine resume

								  and yield operation.  The longjmp handler will update ``heap->curr_thread``

								  as a resume enters a thread and when a yield exits a thread.


								* As a result, the setjmp catch point of ordinary call handling doesn't need

								  to unwind multiple levels of resumers: it just needs to restore the previous

								  value in case it was ``NULL``.


								Current limitations in call cleanup

								-----------------------------------


								As of Duktape 1.4.0 the error handling path is not completely free of errors

								in out-of-memory situations:


								* Value stack may need to be grown during call cleanup.  This will be fixed

								  so that value stack is never shrunk in call setup so that there's no need

								  to grow it in cleanup.


								* Unwinding activations causes lexical scope objects to be allocated which

								  may fail and propagate an error from error handling.  This needs to be fixed

								  e.g. so that the scope object is preallocated, see: https://github.com/svaarala/duktape/issues/476.


								Misc notes

								----------


								* The value stack doesn't hold all the internal state relevant for an

								  activation.  Some state, such as the active environment records (``lex_env``

								  and ``var_env``), is held in the ``duk_activation`` call stack structure.


								Value stack management

								======================


								One value stack per thread

								--------------------------


								A thread has a single value stack, essentially an array of tagged values,

								which is shared by the activations in the call stack.  Each activation has

								a set of registers indexed relative to "frame bottom", starting from zero,

								mapped to the range [regbase, regtop[ in the value stack.  The register ranges

								of activations may and often do overlap (see call handling discussion).

								For instance, function call arguments prepared by the caller are used directly

								by the callee.


								The value stack can be thought of as follows::


								  size ->    _

								            : :    [0,size[    allocated range

								            : :    [top,size[  allocated, initialized to undefined, ignored by GC

								            : :    [0,top[     active range, must be initialized for GC

								  top ->    :_:

								            ! ! -.

								            ! !  !-- current activation

								            ! !  !

								  bottom -> !_! -'

								            ! !

								            ! !

								            ! !

								            ! !

								  0 ->      !_!


								There are several possible policies for values above "top".  The current

								policy is based on concrete performance measurements, and is as follows:


								* Values above "top" are not considered reachable to GC.


								* Values above "top" are initialized to "undefined" (DUK_TAG_UNDEFINED).

								  Whenever the "top" is decreased, previous values are set to undefined.
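
A minimal sketch of this policy, assuming a fixed-size stack of plain integers: ``toy_set_top()`` wipes entries when the top decreases, so growing the top later needs no initialization.  The names and the ``TOY_UNDEFINED`` sentinel are hypothetical, and reference counting is left out.

```c
#include <assert.h>

#define TOY_UNDEFINED  (-9999)  /* stand-in for DUK_TAG_UNDEFINED values */
#define TOY_STACK_SIZE 16

typedef struct {
    int values[TOY_STACK_SIZE];
    int top;
} toy_valstack;

/* The whole allocation starts out as 'undefined'. */
static void toy_init(toy_valstack *vs) {
    int i;
    for (i = 0; i < TOY_STACK_SIZE; i++) vs->values[i] = TOY_UNDEFINED;
    vs->top = 0;
}

static void toy_push(toy_valstack *vs, int v) {
    assert(vs->top < TOY_STACK_SIZE);
    vs->values[vs->top++] = v;
}

static void toy_set_top(toy_valstack *vs, int new_top) {
    assert(new_top >= 0 && new_top <= TOY_STACK_SIZE);
    /* Shrinking: wipe every entry that leaves the active range. */
    while (vs->top > new_top) vs->values[--vs->top] = TOY_UNDEFINED;
    /* Growing: entries above the old top are already 'undefined'. */
    vs->top = new_top;
}

/* Invariant check: everything in [top,size[ is 'undefined'. */
static int toy_above_top_is_undefined(const toy_valstack *vs) {
    int i;
    for (i = vs->top; i < TOY_STACK_SIZE; i++) {
        if (vs->values[i] != TOY_UNDEFINED) return 0;
    }
    return 1;
}

/* Push 10, 20, 30; shrink to one entry; grow back to three. */
static int toy_demo(int index) {
    toy_valstack vs;
    toy_init(&vs);
    toy_push(&vs, 10);
    toy_push(&vs, 20);
    toy_push(&vs, 30);
    toy_set_top(&vs, 1);
    assert(toy_above_top_is_undefined(&vs));
    toy_set_top(&vs, 3);
    assert(index >= 0 && index < 3);
    return vs.values[index];
}
```

After the shrink and regrow, the old values 20 and 30 are gone; the regrown slots read as undefined without any extra work at grow time.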


								Overlap between activations

								---------------------------


								Example of value stack overlap for two Ecmascript activations during a

								function call::


								  size ->    _

								            : :    [0,size[    allocated range

								            : :    [top,size[  allocated, initialized to undefined, ignored by GC

								            : :    [0,top[     active range, must be initialized for GC

								  top ->    :_:

								            !=! -.

								            !=!  !

								            !=!  !-- activation 2

								            !#!  !  -.

								  bottom -> !#! -'   !-- activation 1

								            !:!      !

								            !:!     -'

								            ! !

								  0 ->      !_!


								The callee's activation (activation 2 in the figure) may also be smaller

								than the caller's activation::


								  size ->    _

								            : :    [0,size[    allocated range

								            : :    [top,size[  allocated, initialized to undefined, ignored by GC

								            : :    [0,top[     active range, must be initialized for GC

								            : :

								            : :

								            ::: -.

								            :::  !-- activation 1

								  top ->    :::  !

								            !#!  !  -.

								            !#!  !   !-- activation 2

								  bottom -> !#!  !  -'

								            !:!  !

								            !:! -'

								            ! !

								  0 ->      !_!


								When the callee returns, call handling will restore the value stack frame

								to the size expected by the caller.  Values above the entries used for

								call handling will be reinitialized to ``undefined``.


								Call handling will also ensure that the reserved size for the value stack

								never decreases as a result of the call, even if the caller has a much

								smaller value stack frame.  This is important for the value stack size

								guarantees provided by e.g. ``duk_require_stack()``.


								Note that there is nothing in the value stack model or the execution model

								in general which requires activations to share registers for parameter

								passing.  It is just a convenient thing to do especially for

								Ecmascript-to-Ecmascript calls: it minimizes value stack growth and avoids

								unnecessary copying of arguments (which is pointless because the caller will

								never rely on the argument values after a call anyway).


								When an Ecmascript function with a very large value stack frame calls

								a function with a very small value stack frame, a lot of value stack

								resize / wipe mechanics will happen.  It might be useful to avoid the

								register overlap in such cases to improve performance.


								Growing and shrinking

								---------------------


								The value stack allocation size grows and shrinks as required by the active

								range, which changes e.g. during function calls.  Some hysteresis is applied

								to minimize memory allocation activity when the value stack changes active

								size.  Note that when the value stack grows or shrinks, it is reallocated and

								its base pointer may change, which invalidates any outstanding pointers to

								values in the stack.  For this reason, all persistent execution state refers

								to registers and value stack entries by index, not by memory pointer.
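
The reason for index-based references can be shown with a small growable stack.  This is a generic C illustration with hypothetical names (``toy_valstack``, ``toy_push()``), not Duktape's resize logic: an index saved before a ``realloc()`` is still valid afterwards, whereas a saved element pointer may dangle.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: a growable stack of plain integers. */
typedef struct {
    int *base;    /* may move whenever the stack is reallocated */
    size_t size;  /* allocated entries */
    size_t top;   /* used entries */
} toy_valstack;

static int toy_grow(toy_valstack *vs, size_t new_size) {
    int *p = realloc(vs->base, new_size * sizeof(int));
    if (p == NULL) return 0;
    vs->base = p;  /* the old base pointer must not be used after this */
    vs->size = new_size;
    return 1;
}

static int toy_push(toy_valstack *vs, int v) {
    if (vs->top == vs->size) {
        if (!toy_grow(vs, vs->size ? vs->size * 2 : 4)) return 0;
    }
    vs->base[vs->top++] = v;
    return 1;
}

/* Returns the value read back through a saved index, or -1 on failure. */
static int toy_demo(void) {
    toy_valstack vs = { NULL, 0, 0 };
    size_t saved_index;
    int i;
    int result;

    if (!toy_push(&vs, 123)) return -1;
    saved_index = vs.top - 1;  /* safe: an index survives reallocation */

    /* A pointer such as &vs.base[saved_index] taken here could dangle
     * after the pushes below, because realloc() may move the block.
     */
    for (i = 0; i < 1000; i++) {
        if (!toy_push(&vs, i)) {
            free(vs.base);
            return -1;
        }
    }

    result = vs.base[saved_index];  /* re-derive the address from base */
    free(vs.base);
    return result;
}
```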


								Whenever there is a risk of a garbage collector run (either directly or

								indirectly through an error, a finalizer run, etc) all the entries in the

								[0,top[ range of the value stack must be initialized and correctly reference

								counted: all active ranges of reachable threads are considered GC roots.  The

								compiler and the executor should wipe any unused value stack entries as soon

								as the values are no longer needed: otherwise the values will be reachable

								for the GC and will prevent garbage collection.  This is easy to do e.g.

								when a function call returns (just wipe the entire range of registers used

								by the function) but is more difficult for a function which runs forever.


								When Ecmascript functions are compiled, the compiler keeps track of how many

								registers are needed by the opcodes comprising the compiled bytecode, and

								this value is stored in the ``nregs`` entry of a compiled function.  While

								the Ecmascript function is executing, we know that *all* register accesses

								will be to valid and initialized parts of the value stack, so no grow/shrink

								or other sanity checks are necessary while the function is executing.  This

								does not mean that all the ``nregs`` will always be used, and any unused

								registers at the top of the activation record's register range can be reused

								during e.g. function calls.


								The value stack is handled quite differently for C functions, which use a

								traditional stack model (this is similar to how Lua manages its value stack).

								Value stack grow/shrink checks are needed whenever pushing and popping values,

								and the number of value stack entries needed is not known beforehand.

								Arguments to C functions are placed on top of the initial C activation record

								(starting from register 0).  A possible return value is left by the C code at

								the top of the stack, not necessarily at position 0.  The return value of the

								C function indicates whether a return value is intended or not; if not, the

								return value defaults to ``undefined``.
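
The return value convention for C functions can be sketched as follows.  The code is a toy model with hypothetical names (``toy_ctx``, ``toy_call_native()``, ``TOY_RET_ERROR``).  It only mirrors the rule described above: a negative return code means an error, zero means no return value so the result is ``undefined``, and one means the result is on the stack top.

```c
#include <assert.h>

#define TOY_UNDEFINED (-9999)  /* stand-in for the 'undefined' value */
#define TOY_RET_ERROR (-2)     /* hypothetical negative error shorthand */
#define TOY_STACK_SIZE 16

typedef struct {
    int values[TOY_STACK_SIZE];  /* frame of the native call; index 0 is arg0 */
    int top;
} toy_ctx;

typedef int (*toy_c_function)(toy_ctx *ctx);

static void toy_push(toy_ctx *ctx, int v) {
    assert(ctx->top < TOY_STACK_SIZE);  /* real code does a grow check */
    ctx->values[ctx->top++] = v;
}

/* Returns 1: the result is left on the stack top, not at index 0. */
static int toy_native_add(toy_ctx *ctx) {
    if (ctx->top < 2) return TOY_RET_ERROR;
    toy_push(ctx, ctx->values[0] + ctx->values[1]);
    return 1;
}

/* Returns 0: no return value, so the caller sees 'undefined'. */
static int toy_native_noop(toy_ctx *ctx) {
    (void) ctx;
    return 0;
}

/* Returns 0 on success, or the negative code if the function failed. */
static int toy_call_native(toy_ctx *ctx, toy_c_function fn, int *retval) {
    int rc = fn(ctx);
    if (rc < 0) return rc;  /* real call handling would throw here */
    if (rc == 0) {
        *retval = TOY_UNDEFINED;
    } else {
        assert(rc == 1 && ctx->top >= 1);
        *retval = ctx->values[ctx->top - 1];  /* take the topmost value */
    }
    return 0;
}

/* Calls fn with the first nargs of the arguments (40, 2). */
static int toy_demo(toy_c_function fn, int nargs) {
    toy_ctx ctx;
    int retval = 0;
    int rc;

    ctx.top = 0;
    if (nargs >= 1) toy_push(&ctx, 40);
    if (nargs >= 2) toy_push(&ctx, 2);
    rc = toy_call_native(&ctx, fn, &retval);
    return (rc < 0) ? rc : retval;
}
```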


								Managing executor interrupt

								===========================


								The executor interrupt counter is currently tracked in

								``thr->interrupt_counter``.  This seems to work well because ``thr`` is a

								"hot" variable.


								Another alternative would be to track the counter in an executor local

								variable.  Error handling and other code paths jumping out of the executor

								would then need to work similarly to how the stack local ``curr_pc`` is handled.


								Managing current PC

								===================


								Current approach

								----------------


								The current solution in Duktape 1.3 is to maintain a direct bytecode pointer

								in each activation, and to keep a "cached copy" of the topmost activation's

								bytecode pointer in a bytecode executor local variable ``curr_pc``.  A pointer

								to the ``curr_pc`` in the stack frame (whose type is ``duk_instr_t **``) is

								stored in ``thr->ptr_curr_pc`` so that when control exits the opcode dispatch

								loop (e.g. when an error is thrown) the value in the stack frame can be read

								and synced back into the topmost activation's ``act->curr_pc``.


								Consistency depends on the compiler doing correct aliasing analysis, and

								writing back the ``curr_pc`` value to the stack frame before any operation

								that may potentially read it through ``thr->ptr_curr_pc``.  Using ``volatile``

								would be safer but in practical testing it eliminates the performance benefit

								entirely.
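
The caching scheme can be illustrated with a reduced model.  The field names ``curr_pc`` and ``ptr_curr_pc`` follow the text, while everything else (``toy_thread``, the opcodes, the single activation) is assumed for the example.  The executor dispatches from a local pointer, and any code that needs ``act.curr_pc`` first syncs it through ``thr->ptr_curr_pc``.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: one activation, three hypothetical opcodes. */
typedef int toy_instr;

enum { TOY_OP_NOP, TOY_OP_REPORT, TOY_OP_HALT };

typedef struct {
    const toy_instr *curr_pc;  /* authoritative only after a sync */
} toy_activation;

typedef struct {
    toy_activation act;             /* topmost activation */
    const toy_instr **ptr_curr_pc;  /* points at the executor's local, or NULL */
} toy_thread;

/* Copy the executor's cached PC back into the activation, if any. */
static void toy_sync_curr_pc(toy_thread *thr) {
    if (thr->ptr_curr_pc != NULL) {
        thr->act.curr_pc = *thr->ptr_curr_pc;
    }
}

/* Stands in for code outside the dispatch loop that reads act.curr_pc,
 * e.g. error augmentation or a debugger.
 */
static long toy_report_pc_offset(toy_thread *thr, const toy_instr *code) {
    toy_sync_curr_pc(thr);
    return (long) (thr->act.curr_pc - code);
}

/* Returns the PC offset observed by the last TOY_OP_REPORT, or -1. */
static long toy_execute(toy_thread *thr, const toy_instr *code) {
    const toy_instr *curr_pc = code;  /* cached copy used for dispatch */
    long seen = -1;

    thr->act.curr_pc = code;
    thr->ptr_curr_pc = &curr_pc;

    for (;;) {
        toy_instr ins = *curr_pc++;  /* no write to act.curr_pc here */
        if (ins == TOY_OP_NOP) continue;
        if (ins == TOY_OP_REPORT) {
            seen = toy_report_pc_offset(thr, code);
            continue;
        }
        break;  /* TOY_OP_HALT */
    }

    toy_sync_curr_pc(thr);
    thr->ptr_curr_pc = NULL;  /* the local is about to go out of scope */
    return seen;
}

static const toy_instr toy_demo_code[] = {
    TOY_OP_NOP, TOY_OP_NOP, TOY_OP_REPORT, TOY_OP_NOP, TOY_OP_HALT
};

static long toy_demo_seen_offset(void) {
    toy_thread thr = { { NULL }, NULL };
    return toy_execute(&thr, toy_demo_code);
}

static long toy_demo_final_offset(void) {
    toy_thread thr = { { NULL }, NULL };
    (void) toy_execute(&thr, toy_demo_code);
    return (long) (thr.act.curr_pc - toy_demo_code);
}
```

Because the address of the local escapes through ``thr->ptr_curr_pc``, a conforming compiler has to store the local before any call that might read it through that pointer, which is the aliasing assumption the text refers to.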


								For the most part the bytecode executor can keep on dispatching opcodes

								using ``curr_pc`` without copying the pointer back to the topmost activation.

								Careful management of ``curr_pc`` and ``thr->ptr_curr_pc`` is needed in the

								following situations:


								* Call handling must (1) store/restore the current ``thr->ptr_curr_pc`` value,

								  (2) sync the ``curr_pc`` if ``thr->ptr_curr_pc`` is non-NULL, (3) set the

								  ``thr->ptr_curr_pc`` to NULL to avoid any code using it with an incorrect

								  activation (not matching what ``curr_pc`` was initialized from).  This

								  ensures that any side effects in the executor, such as DECREF causing a

								  finalizer call or a property read causing a getter call, are handled

								  correctly without the executor syncing the ``curr_pc`` at every turn.  This

								  is quite important because there are a lot of potential side effects in the

								  executor opcode loop.


								* If any code depends on ``duk_activation`` structs (``act->curr_pc`` in

								  particular) being correct, ``curr_pc`` must be synced back.  For example:

								  executor interrupt, debugger handling, and error augmentation need to see

								  synced state.


								* The ``curr_pc`` must be synced back **and** ``thr->ptr_curr_pc`` must be

								  NULLed before a longjmp that (potentially) causes a call stack unwind.

								  The NULLing is important because **any** call stack unwind may have side

								  effects due to e.g. finalizers for values in the unwound call stack being

								  called.  If ``thr->ptr_curr_pc`` was still set at that time, call handling

								  would sync ``curr_pc`` to the topmost activation, which wouldn't be the

								  same activation as intended.


								* NULLing of ``thr->ptr_curr_pc`` is also required for longjmps which are

								  purely internal to the bytecode executor.  This is important because the

								  seemingly internal longjmps may propagate outwards, may cause side effects,

								  etc, all of which demand that ``thr->ptr_curr_pc`` be NULL at the time.

								  Once the longjmp has been handled, the executor should reinitialize

								  ``thr->ptr_curr_pc`` if bytecode execution resumes.


								* Whenever the bytecode executor does a ``goto restart_execution;`` the

								  ``curr_pc`` must be synced back even if the activation hasn't changed:

								  the restart code will look up the topmost activation's ``act->curr_pc``

								  which must be up to date.


								Syncing the pointer back unnecessarily or multiple times is safe in general,

								so there's no need to ensure there's exactly one sync for a certain code path.


								Function bytecode is behind a stable pointer, so there are no realloc or

								other side effect concerns with using direct bytecode pointers.  Because

								the function being executed is always reachable, a borrowed pointer can

								be used.


								This approach is error prone, but the performance gain over the alternatives

								is worth it.  This method of dispatch improves dispatch performance by

								about 20-25% over Duktape 1.2.


								Some alternatives

								-----------------


								* Duktape 1.3: maintain a direct bytecode pointer in each activation, and a

								  "cached" copy of the topmost activation's bytecode pointer in a local

								  variable of the executor.  Whenever something that might throw an error

								  is executed, write the pointer back to the current activation using

								  ``thr->ptr_curr_pc`` which points to the stack frame location containing

								  ``curr_pc``.


								* Duktape 1.2: maintain all PC values as numeric indices (not pointers and

								  not pre-multiplied by bytecode opcode size).  The current PC is always

								  looked up from the current activation.


								* Same as Duktape 1.3 behavior but maintain a cached copy of the topmost

								  activation's bytecode pointer in ``thr->curr_pc``.  The copy back operation

								  is needed but doesn't need to peek into the bytecode executor stack frame.

								  This works quite well because ``thr`` is a "hot" variable.  However, the

								  stack local ``curr_pc`` used in Duktape 1.3 is faster.


								* Use direct bytecode pointers in activations, keep a pointer to the current

								  activation in the executor, and use ``act->curr_pc`` for dispatch.  There's

								  no need for a copy back operation because activation states are always in

								  sync.  This is faster than the Duktape 1.2 approach, but significantly

								  slower than the ``thr->curr_pc`` or the Duktape 1.3 approach (part of that

								  is probably because there's more register pressure).
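
The difference between the alternatives is mostly in how the next instruction
is fetched.  The sketch below contrasts three of them; ``activation_t`` and
the ``run_*()`` functions are hypothetical and hold both PC representations in
one struct only for comparison.  Opcode execution is replaced by summing the
instruction words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t instr_t;

/* Hypothetical activation carrying both PC representations for comparison. */
typedef struct {
    const instr_t *bytecode;  /* function bytecode base (stable) */
    size_t pc;                /* Duktape 1.2 style numeric index */
    const instr_t *curr_pc;   /* direct bytecode pointer */
} activation_t;

/* Numeric index: every fetch goes through the activation and indexes
 * the bytecode base.
 */
static uint32_t run_index(activation_t *act, size_t count) {
    uint32_t sum = 0;
    assert(act != NULL);
    while (count-- > 0) {
        sum += act->bytecode[act->pc++];
    }
    return sum;
}

/* act->curr_pc: direct pointer, always in sync, no copy back needed. */
static uint32_t run_act_ptr(activation_t *act, size_t count) {
    uint32_t sum = 0;
    assert(act != NULL);
    while (count-- > 0) {
        sum += *act->curr_pc++;
    }
    return sum;
}

/* Executor local pointer: the fetch touches only a local variable, but
 * the pointer must be copied back to the activation afterwards.
 */
static uint32_t run_local_ptr(activation_t *act, size_t count) {
    const instr_t *curr_pc;
    uint32_t sum = 0;
    assert(act != NULL);
    curr_pc = act->curr_pc;
    while (count-- > 0) {
        sum += *curr_pc++;
    }
    act->curr_pc = curr_pc;   /* copy back */
    return sum;
}
```

All three variants compute the same result; they differ only in where the
program counter lives while the loop runs.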


								Comparison between curr_pc alternatives

								---------------------------------------


								The current Duktape 1.3 approach is a bit error prone because of the need to

								sync the executor local ``curr_pc`` back to ``act->curr_pc`` in multiple code

								paths.  Another alternative would be to dispatch using ``act->curr_pc``

								directly.  While that is faster than Duktape 1.2, it is about 20% slower than

								dispatching using the executor local ``curr_pc`` in the measurements below

								(average 2.66 vs. 2.20), and also slower than ``thr->curr_pc`` (average 2.31).


								The measurements below were made using ``gcc -O2`` on x64::


								    # Duktape 1.3, dispatch using executor local variable curr_pc

								    $ sudo nice -20 python util/time_multi.py --count 10 --mode all --verbose ./duk.O2.local_pc tests/perf/test-empty-loop.js

								    Running: 2.180000 2.170000 2.180000 2.290000 2.180000 2.200000 2.190000 2.190000 2.220000 2.200000

								    min=2.17, max=2.29, avg=2.20, count=10: [2.18, 2.17, 2.18, 2.29, 2.18, 2.2, 2.19, 2.19, 2.22, 2.2]


								    # Duktape 1.2, dispatch using a numeric PC index

								    $ sudo nice -20 python util/time_multi.py --count 10 --mode all --verbose ./duk.O2.123 tests/perf/test-empty-loop.js

								    Running: 3.100000 3.100000 3.120000 3.120000 3.160000 3.300000 3.370000 3.410000 3.370000 3.390000

								    min=3.10, max=3.41, avg=3.24, count=10: [3.1, 3.1, 3.12, 3.12, 3.16, 3.3, 3.37, 3.41, 3.37, 3.39]


								    # Alternative; dispatch using thr->curr_pc

								    $ sudo nice -20 python util/time_multi.py --count 10 --mode all --verbose ./duk.O2.thr_pc tests/perf/test-empty-loop.js

								    Running: 2.310000 2.330000 2.310000 2.300000 2.400000 2.290000 2.310000 2.290000 2.300000 2.300000

								    min=2.29, max=2.40, avg=2.31, count=10: [2.31, 2.33, 2.31, 2.3, 2.4, 2.29, 2.31, 2.29, 2.3, 2.3]


								    # Alternative; dispatch using act->curr_pc

								    $ sudo nice -20 python util/time_multi.py --count 10 --mode all --verbose ./duk.O2.act_pc tests/perf/test-empty-loop.js

								    Running: 2.590000 2.580000 2.600000 2.600000 2.600000 2.660000 2.600000 2.640000 2.860000 2.860000

								    min=2.58, max=2.86, avg=2.66, count=10: [2.59, 2.58, 2.6, 2.6, 2.6, 2.66, 2.6, 2.64, 2.86, 2.86]


								Accessing constants

								===================


								The executor stores a copy of the ``duk_hcompiledfunction`` constant table

								base address in a local variable ``consts``.  This reduces code footprint

								and performs better than reading the consts base address on-the-fly

								through the function reference.  Because the constants table has a stable

								base address, this is easy and safe.
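
As a minimal sketch of this caching, assuming constants are plain doubles
rather than tagged values: ``compiled_function_t`` and ``sum_constants()`` are
hypothetical names, and only the local variable name ``consts`` comes from the
text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for duk_hcompiledfunction; constants are plain
 * doubles here instead of tagged values.
 */
typedef struct {
    const double *consts;   /* constant table base, stable address */
    size_t n_consts;        /* number of entries in the table */
} compiled_function_t;

/* Sum the constants selected by 'indices', reading the table base once. */
static double sum_constants(const compiled_function_t *fun,
                            const uint8_t *indices, size_t count) {
    const double *consts = fun->consts;  /* cached copy of the base address */
    double sum = 0.0;
    size_t i;

    for (i = 0; i < count; i++) {
        assert(indices[i] < fun->n_consts);
        sum += consts[indices[i]];       /* no fun->consts reload per access */
    }
    return sum;
}
```

The cached pointer stays valid for the whole loop only because the table base
never moves; the same pattern would be unsafe for an address that can change
as a side effect.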


								Accessing registers

								===================


								The executor currently accesses the stack frame base address (needed to read

								registers) through ``thr`` as ``thr->valstack_bottom``.  This performs

								reasonably well because ``thr`` is a "hot" variable.


								The register base address could also be copied to a local variable as is done

								for constants.  However, ``thr->valstack_bottom`` is not a stable address and

								may be changed by any side effect (because any side effect can cause a value

								stack resize, e.g. if a finalizer is invoked).
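
The following sketch shows why reading registers through ``thr`` stays valid
across a resize.  It assumes a value stack of plain doubles grown with
``realloc()``; ``thread_t``, ``REG()``, ``resize_valstack()`` and
``write_resize_read()`` are hypothetical and only ``valstack_bottom`` comes
from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for duk_hthread; values are plain doubles. */
typedef struct {
    double *valstack;         /* allocation base, may move on resize */
    double *valstack_bottom;  /* register base of the current frame */
    size_t valstack_size;     /* allocated entries */
} thread_t;

/* Register access always goes through thr, so it follows any resize. */
#define REG(thr, n)  ((thr)->valstack_bottom[(n)])

/* Resize the value stack; returns 0 on success, -1 on allocation failure.
 * valstack_bottom is recomputed because realloc() may move the block.
 */
static int resize_valstack(thread_t *thr, size_t new_size) {
    size_t bottom_off = (size_t) (thr->valstack_bottom - thr->valstack);
    double *p;

    assert(bottom_off <= new_size);
    p = (double *) realloc(thr->valstack, new_size * sizeof(double));
    if (p == NULL) {
        return -1;
    }
    thr->valstack = p;
    thr->valstack_bottom = p + bottom_off;
    thr->valstack_size = new_size;
    return 0;
}

/* Write a register, trigger a "side effect" (resize), then read it back.
 * Returns -1.0 if the resize fails (sketch-level error handling only).
 */
static double write_resize_read(thread_t *thr, size_t reg, double value) {
    REG(thr, reg) = value;
    if (resize_valstack(thr, thr->valstack_size * 2) != 0) {
        return -1.0;
    }
    /* A local copy of valstack_bottom taken before the resize could now be
     * stale; reading through thr is still valid.
     */
    return REG(thr, reg);
}
```

A cached local register base would have to be refreshed after every call that
can reach ``resize_valstack()``, which is the bookkeeping cost weighed in this
section.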


								If a local variable were to be used, it would need to be updated when the

								value stack is resized.  It is not certain whether overall performance would

								improve.  This change was postponed to Duktape 1.4:


								* https://github.com/svaarala/duktape/issues/298