

								<h1 id="stacktypes">Stack types</h1>


								<h2 id="type-table">Overview</h2>


								<div class="table-wrap">

								<table>

								<tr class="header"><th>Type</th><th>Type constant</th><th>Type mask constant</th><th>Description</th></tr>

								<tr><td><a href="#type-none">(none)</a></td><td>DUK_TYPE_NONE</td><td>DUK_TYPE_MASK_NONE</td><td>no type (missing value, invalid index, etc)</td></tr>

								<tr><td><a href="#type-undefined">undefined</a></td><td>DUK_TYPE_UNDEFINED</td><td>DUK_TYPE_MASK_UNDEFINED</td><td><code>undefined</code></td></tr>

								<tr><td><a href="#type-null">null</a></td><td>DUK_TYPE_NULL</td><td>DUK_TYPE_MASK_NULL</td><td><code>null</code></td></tr>

								<tr><td><a href="#type-boolean">boolean</a></td><td>DUK_TYPE_BOOLEAN</td><td>DUK_TYPE_MASK_BOOLEAN</td><td><code>true</code> and <code>false</code></td></tr>

								<tr><td><a href="#type-number">number</a></td><td>DUK_TYPE_NUMBER</td><td>DUK_TYPE_MASK_NUMBER</td><td>IEEE double</td></tr>

								<tr><td><a href="#type-string">string</a></td><td>DUK_TYPE_STRING</td><td>DUK_TYPE_MASK_STRING</td><td>immutable (plain) string or (plain) Symbol</td></tr>

								<tr><td><a href="#type-object">object</a></td><td>DUK_TYPE_OBJECT</td><td>DUK_TYPE_MASK_OBJECT</td><td>object with properties</td></tr>

								<tr><td><a href="#type-buffer">buffer</a></td><td>DUK_TYPE_BUFFER</td><td>DUK_TYPE_MASK_BUFFER</td><td>mutable (plain) byte buffer, fixed/dynamic/external; mimics an Uint8Array</td></tr>

								<tr><td><a href="#type-pointer">pointer</a></td><td>DUK_TYPE_POINTER</td><td>DUK_TYPE_MASK_POINTER</td><td>opaque pointer (void *)</td></tr>

								<tr><td><a href="#type-lightfunc">lightfunc</a></td><td>DUK_TYPE_LIGHTFUNC</td><td>DUK_TYPE_MASK_LIGHTFUNC</td><td>plain Duktape/C function pointer (non-object); mimics a Function</td></tr>

								</table>

								</div>


								<h2 id="type-memory-allocations">Memory allocations</h2>


								<p>The following stack types involve additional heap allocations:</p>


								<ul>

								<li>String: a single allocation contains a combined heap and string header,

								    followed by the immutable string data.</li>

								<li>Object: one allocation is used for a combined heap and object header,

								    and another allocation is used for object properties.  The property

								    allocation contains both array entries and normal properties, and if

								    the object is large enough, a hash table to speed up lookups.</li>

								<li>Buffer: for fixed buffers a single allocation contains a combined heap

								    and buffer header, followed by the mutable fixed-size buffer.  For

								    dynamic buffers the current buffer is allocated separately.  For

								    external buffers a single heap object is allocated and points to

								    a user buffer.</li>

								</ul>


								<p>Note that while strings and buffers are considered primitive (pass-by-value)

								types, they are heap allocated from a memory allocation viewpoint.</p>


								<h2 id="type-pointer-stability">Pointer stability</h2>


								<p>Heap objects allocated by Duktape have stable pointers: the objects are

								not relocated in memory while they are reachable from a garbage collection

								point of view.  This is the case for the main heap object, but not

								necessarily for any additional allocations related to the object, such as

								dynamic property tables or dynamic buffer data area.  A heap object is

								reachable e.g. when it resides on the value stack of a reachable thread or

								is reachable through the global object.  Once a heap object becomes

								unreachable any pointers held by user C code referring to the object are

								unsafe and should no longer be dereferenced.</p>


								<p>In practice the only heap allocated data directly referenced by user code

								are strings, fixed buffers, and dynamic buffers.  The data area of strings

								and fixed buffers is stable; it is safe to keep a C pointer referring to the

								data even after a Duktape/C function returns as long as the string or fixed

								buffer remains reachable from a garbage collection point of view at all times.

								Note that this is <i>not</i> the case for Duktape/C value stack arguments, for

								instance, unless specific arrangements are made.</p>


								<p>The data area of a dynamic buffer does <b>not</b> have a stable pointer.

								The buffer itself has a heap header with a stable address but the current

								buffer is allocated separately and potentially relocated when the buffer

								is resized.  It is thus unsafe to hold a pointer to a dynamic buffer's data

								area across a buffer resize, and it's probably best not to hold a pointer

								after a Duktape/C function returns (as there would be no easy way of being

								sure that the buffer hadn't been resized).  The data area of an external

								buffer also has a potentially changing pointer, but the pointer is only

								changed by an explicit user API call.</p>


								<p>For external buffers the stability of the data pointer is up to the user

								code which sets and updates the pointer.</p>
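								<p>The following is a minimal sketch of the safe pattern for dynamic buffers:
								re-fetch the data pointer after every resize instead of reusing an old one.
								The helper name <code>grow_buffer_example</code> and the buffer sizes are
								illustrative assumptions, not part of the Duktape API:</p>

								<pre class="c-code">
#include &lt;string.h&gt;
#include "duktape.h"

/* Push a dynamic buffer, grow it, and re-fetch the data pointer after
 * the resize.  Leaves the buffer on the value stack top and returns its
 * final size (hypothetical helper for illustration).
 */
static duk_size_t grow_buffer_example(duk_context *ctx) {
    duk_size_t len = 0;
    unsigned char *p;

    p = (unsigned char *) duk_push_dynamic_buffer(ctx, 16);
    memset(p, 0x11, 16);

    /* The resize may relocate the data area, so the old value of 'p'
     * must be treated as invalid; use the pointer returned by the call.
     */
    p = (unsigned char *) duk_resize_buffer(ctx, -1, 4096);
    p[4095] = 0x22;

    /* Pointer and current size can also be looked up again at any time. */
    p = (unsigned char *) duk_get_buffer(ctx, -1, &amp;len);
    return (p != NULL) ? len : 0;
}
								</pre>

								<p>Looking the pointer up again with <code>duk_get_buffer()</code> whenever it
								is needed is usually simpler than tracking which calls might have resized
								the buffer.</p>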


								<h2 id="type-masks">Type masks</h2>


								<p>Type masks allow calling code to easily check whether a type belongs to

								a certain type set.  For instance, to check that a certain stack value is

								a number, string, or an object:</p>


								<pre class="c-code">

								if (duk_get_type_mask(ctx, -3) &amp; (DUK_TYPE_MASK_NUMBER |

								                                  DUK_TYPE_MASK_STRING |

								                                  DUK_TYPE_MASK_OBJECT)) {

								    printf("type is number, string, or object\n");

								}

								</pre>


								<p>There is a specific API call for matching a set of types even more

								conveniently:</p>


								<pre class="c-code">

								if (duk_check_type_mask(ctx, -3, DUK_TYPE_MASK_NUMBER |

								                                 DUK_TYPE_MASK_STRING |

								                                 DUK_TYPE_MASK_OBJECT)) {

								    printf("type is number, string, or object\n");

								}

								</pre>


								<p>Both forms are faster and more compact than the alternatives:</p>


								<pre class="c-code">

								/* alt 1 */

								if (duk_is_number(ctx, -3) || duk_is_string(ctx, -3) || duk_is_object(ctx, -3)) {

								    printf("type is number, string, or object\n");

								}


								/* alt 2 */

								int t = duk_get_type(ctx, -3);

								if (t == DUK_TYPE_NUMBER || t == DUK_TYPE_STRING || t == DUK_TYPE_OBJECT) {

								    printf("type is number, string, or object\n");

								}

								</pre>


								<h2 id="type-none">None</h2>


								<p>The <b>none</b> type is not actually a type but is used in the API to

								indicate that a value does not exist, a stack index is invalid, etc.</p>


								<h2 id="type-undefined">Undefined</h2>


								<p>The <b>undefined</b> type maps to Ecmascript <code>undefined</code>, which is

								distinct from <code>null</code>.</p>


								<p>Values read from outside the active value stack range read back as

								<b>undefined</b>.</p>


								<h2 id="type-null">Null</h2>


								<p>The <b>null</b> type maps to Ecmascript <code>null</code>.</p>


								<h2 id="type-boolean">Boolean</h2>


								<p>The <b>boolean</b> type is represented in the C API as an integer: zero for false,

								and non-zero for true.</p>


								<p>Whenever giving boolean values as arguments in API calls, any non-zero value is

								accepted as a "true" value.  Whenever API calls return boolean values, the value

								<code>1</code> is always used for a "true" value.  This allows certain C idioms to be

								used.  For instance, a bitmask can be built directly based on API call return values,

								as follows:

								</p>


								<pre class="c-code">

								/* this works and generates nice code */

								int bitmask = (duk_get_boolean(ctx, -3) &lt;&lt; 2) |

								              (duk_get_boolean(ctx, -2) &lt;&lt; 1) |

								              duk_get_boolean(ctx, -1);


								/* more verbose variant not relying on "true" being represented by 1 */

								int bitmask = ((duk_get_boolean(ctx, -3) ? 1 : 0) &lt;&lt; 2) |

								              ((duk_get_boolean(ctx, -2) ? 1 : 0) &lt;&lt; 1) |

								              (duk_get_boolean(ctx, -1) ? 1 : 0);


								/* another verbose variant */

								int bitmask = (duk_get_boolean(ctx, -3) ? (1 &lt;&lt; 2) : 0) |

								              (duk_get_boolean(ctx, -2) ? (1 &lt;&lt; 1) : 0) |

								              (duk_get_boolean(ctx, -1) ? 1 : 0);

								</pre>


								<h2 id="type-number">Number</h2>


								<p>The <b>number</b> type is an IEEE double, including +/- Infinity and NaN values.

								Zero sign is also preserved.  An IEEE double represents all integers up to 53 bits

								accurately.</p>


								<p>IEEE double allows NaN values to have additional signaling bits.  Because these

								bits are used by Duktape internal tagged type representation (when using 8-byte

								packed values), NaN values in the Duktape API are normalized.  Concretely, if you

								push a certain NaN value to the value stack, another (normalized) NaN value may

								come out.  Don't rely on NaNs preserving their exact form.</p>
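								<p>To illustrate what NaN normalization means, the sketch below maps every NaN
								bit pattern to a single canonical quiet NaN.  It assumes an IEEE 754 binary64
								<code>double</code>; the helper names and the chosen canonical pattern are
								assumptions for illustration and are not taken from Duktape internals:</p>

								<pre class="c-code">
#include &lt;math.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

/* Return the raw 64-bit pattern of a double. */
static uint64_t double_bits(double d) {
    uint64_t u;
    memcpy(&amp;u, &amp;d, sizeof(u));
    return u;
}

/* Build a double from a raw 64-bit pattern. */
static double bits_double(uint64_t u) {
    double d;
    memcpy(&amp;d, &amp;u, sizeof(d));
    return d;
}

/* Map every NaN to one canonical quiet NaN; other values, including
 * negative zero and infinities, pass through unchanged.
 */
static double normalize_nan(double d) {
    return isnan(d) ? bits_double(UINT64_C(0x7ff8000000000000)) : d;
}
								</pre>

								<p>A NaN built from <code>0x7ff8000000001234</code> and passed through
								<code>normalize_nan()</code> loses its extra payload bits, which is the kind of
								change calling code should be prepared for.</p>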


								<h2 id="type-string">String</h2>


								<p>The <b>string</b> stack type is used to represent both plain strings and

								plain Symbols (introduced in ES2015).  A string is an arbitrary byte sequence

								of a certain length which may contain internal NUL (0x00) values.  Strings are

								always automatically NUL terminated for C coding convenience.  The NUL terminator

								is not counted as part of the string length.  For instance, the string

								<code>"foo"</code> has byte length 3 and is stored in memory as

								<code>{ 'f', 'o', 'o', '\0' }</code>.  Because of the guaranteed NUL termination,

								strings can always be pointed to using a simple <code>const char *</code> as long

								as internal NULs are not an issue for the application; if they are, the explicit

								byte length of the string can be queried with the API.  Calling code can refer

								directly to the string data held by Duktape.  Such string data pointers are valid

								(and stable) for as long as a string is reachable in the Duktape heap.</p>
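								<p>As a small sketch of the difference between the explicit byte length and
								what C string functions see, the helper below (a hypothetical name) pushes a
								string with an internal NUL and reads it back:</p>

								<pre class="c-code">
#include &lt;string.h&gt;
#include "duktape.h"

/* Push a 7-byte string with an internal NUL and read it back.  Returns
 * the byte length reported by Duktape; *c_len receives what strlen()
 * sees.  The string is left on the value stack top.
 */
static duk_size_t internal_nul_example(duk_context *ctx, size_t *c_len) {
    const char *p;
    duk_size_t len = 0;

    duk_push_lstring(ctx, "foo\0bar", 7);
    p = duk_get_lstring(ctx, -1, &amp;len);  /* explicit byte length: 7 */
    *c_len = strlen(p);                      /* stops at the internal NUL: 3 */

    /* 'p' stays valid only while the string is reachable, here for as
     * long as it remains on the value stack.
     */
    return len;
}
								</pre>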


								<p>Strings are <a href="http://en.wikipedia.org/wiki/String_interning">interned</a>

								for efficiency: only a single copy of a certain string ever exists at a time.

								Strings are immutable and must NEVER be changed by calling C code.  Doing so will

								lead to very mysterious issues which are hard to diagnose.</p>


								<p>Calling code most often deals with Ecmascript strings, which may contain

								arbitrary 16-bit codepoints in the range U+0000 to U+FFFF but cannot represent

								non-<a href="http://en.wikipedia.org/wiki/Basic_Multilingual_Plane#Basic_Multilingual_Plane">BMP</a>

								codepoints.  This is how strings are defined in the Ecmascript standard.

								In Duktape, Ecmascript strings are encoded with

								<a href="http://en.wikipedia.org/wiki/CESU-8">CESU-8</a> encoding.  CESU-8

								matches <a href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a> except that it

								allows codepoints in the surrogate pair range (U+D800 to U+DFFF) to be encoded

								directly (prohibited in UTF-8).  CESU-8, like UTF-8, encodes all 7-bit ASCII

								characters as-is which is convenient for C code.  For example:</p>


								<ul>

								<li>U+0041 ("A") encodes to <code>41</code>.</li>

								<li>U+1234 (ETHIOPIC SYLLABLE SEE) encodes to <code>e1 88 b4</code>.</li>

								<li>U+D812 (high surrogate) encodes to <code>ed a0 92</code>. This would be

								    <a href="http://en.wikipedia.org/wiki/UTF-8#Invalid_code_points">invalid UTF-8</a>.</li>

								</ul>
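								<p>The examples above can be reproduced with a short encoder.  This is a
								sketch of CESU-8 encoding for a single 16-bit code unit, not Duktape's own
								implementation; the function name is an assumption:</p>

								<pre class="c-code">
#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Encode one 16-bit code unit as CESU-8.  Surrogates (U+D800 to U+DFFF)
 * are encoded like any other 3-byte value.  Writes 1 to 3 bytes to 'out'
 * and returns the number of bytes written.
 */
static size_t cesu8_encode_unit(uint16_t cu, unsigned char *out) {
    if (cu &lt; 0x80) {
        out[0] = (unsigned char) cu;
        return 1;
    }
    if (cu &lt; 0x800) {
        out[0] = (unsigned char) (0xc0 | (cu &gt;&gt; 6));
        out[1] = (unsigned char) (0x80 | (cu &amp; 0x3f));
        return 2;
    }
    out[0] = (unsigned char) (0xe0 | (cu &gt;&gt; 12));
    out[1] = (unsigned char) (0x80 | ((cu &gt;&gt; 6) &amp; 0x3f));
    out[2] = (unsigned char) (0x80 | (cu &amp; 0x3f));
    return 3;
}
								</pre>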


								<a name="extended-utf8"></a>

								<p>Duktape also uses extended strings internally.  Codepoints up to U+10FFFF

								can be represented with UTF-8, and codepoints above that up to full 32 bits

								can be represented with

								<a href="http://en.wikipedia.org/wiki/UTF-8#Extending_from_31_bit_to_36_bit_range">extended UTF-8</a>.

								The extended UTF-8 encoding used by Duktape is described in the table below.

								The leading byte is shown in binary (with "x" marking data bits) while

								continuation bytes are marked with "C" (indicating the bit sequence 10xxxxxx):</p>


								<table>

								<thead>

								<tr class="header"><th>Codepoint range</th><th>Bits</th><th>Byte sequence</th><th>Notes</th></tr>

								</thead>

								<tbody>

								<tr><td>U+0000 to U+007F</td><td>7</td><td>0xxxxxxx</td><td></td></tr>

								<tr><td>U+0080 to U+07FF</td><td>11</td><td>110xxxxx C</td><td></td></tr>

								<tr><td>U+0800 to U+FFFF</td><td>16</td><td>1110xxxx C C</td><td>U+D800 to U+DFFF allowed (unlike UTF-8)</td></tr>

								<tr><td>U+1 0000 to U+1F FFFF</td><td>21</td><td>11110xxx C C C</td><td>Above U+10FFFF allowed (unlike UTF-8)</td></tr>

								<tr><td>U+20 0000 to U+3FF FFFF</td><td>26</td><td>111110xx C C C C</td><td></td></tr>

								<tr><td>U+400 0000 to U+7FFF FFFF</td><td>31</td><td>1111110x C C C C C</td><td></td></tr>

								<tr><td>U+8000 0000 to U+F FFFF FFFF</td><td>36</td><td>11111110 C C C C C C</td><td>Only 32 bits used in practice (up to U+FFFF FFFF)</td></tr>

								</tbody>

								</table>
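								<p>The table can be turned into code fairly directly.  The following is a
								sketch written from the table alone, limited to 32-bit input; the function
								name is an assumption and Duktape's internal encoder may be structured
								differently:</p>

								<pre class="c-code">
#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Encode a codepoint (up to 32 bits) using the extended UTF-8 scheme in
 * the table.  'out' must have room for 7 bytes.  Returns the number of
 * bytes written.
 */
static size_t xutf8_encode(uint32_t cp, unsigned char *out) {
    /* Leading byte marker bits, indexed by total sequence length. */
    static const unsigned char lead[8] = {
        0x00, 0x00, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc, 0xfe
    };
    size_t len, i;

    if (cp &lt; 0x80UL) { len = 1; }
    else if (cp &lt; 0x800UL) { len = 2; }
    else if (cp &lt; 0x10000UL) { len = 3; }
    else if (cp &lt; 0x200000UL) { len = 4; }
    else if (cp &lt; 0x4000000UL) { len = 5; }
    else if (cp &lt; 0x80000000UL) { len = 6; }
    else { len = 7; }

    /* Fill continuation bytes (10xxxxxx) from the end, 6 bits at a time. */
    for (i = len - 1; i &gt; 0; i--) {
        out[i] = (unsigned char) (0x80 | (cp &amp; 0x3f));
        cp &gt;&gt;= 6;
    }
    /* Whatever bits remain go into the leading byte. */
    out[0] = (unsigned char) (lead[len] | cp);
    return len;
}
								</pre>

								<p>With this sketch U+110000, the first codepoint above the UTF-8 limit,
								encodes to <code>f4 90 80 80</code>, and U+FFFFFFFF encodes to a 7-byte
								sequence starting with <code>fe</code>.</p>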


								<p>The downside of the encoding for codepoints above U+7FFFFFFF is that

								the leading byte will be <code>0xFE</code> which conflicts with Unicode byte order

								marker encoding.  This is not a practical concern in Duktape's internal use.</p>


								<p>Finally, invalid extended UTF-8 byte sequences are used for special purposes

								such as representing Symbol values.  Invalid extended UTF-8/CESU-8 byte sequences

								never conflict with standard Ecmascript strings (which are CESU-8) and will remain

								cleanly separated within object property tables.  For more information see

								<a href="#symbols">Symbols</a> and

								<a href="https://github.com/svaarala/duktape/blob/master/doc/symbols.rst">symbols.rst</a>.</p>


								<p>Strings with invalid extended UTF-8 sequences can be pushed on the value stack

								from C code and also passed to Ecmascript functions, with two caveats:</p>

								<ul>

								<li>If the invalid byte sequence matches the internal format used to represent

								    Symbols, the value will appear as a Symbol rather than a string for Ecmascript

								    code.  For example, <code>typeof val</code> will be <code>symbol</code>.</li>

								<li>Behavior of string operations on invalid byte sequences is not well defined;

								    results may vary and may change even in minor Duktape version updates.</li>

								</ul>


								<h2 id="type-object">Object</h2>


								<p>The <b>object</b> type includes Ecmascript objects and arrays, functions,

								threads (coroutines), and buffer objects.  In other words, anything with

								properties is an object.  Properties are key-value pairs with a string key

								and an arbitrary value (including <b>undefined</b>).</p>


								<p>Objects may participate in garbage collection finalization.</p>


								<h2 id="type-buffer">Buffer</h2>


								<p>The plain <b>buffer</b> type is a raw buffer for user data.  It's much

								more memory efficient than standard buffer object types like Uint8Array

								or Node.js Buffer.  There are three plain buffer sub-types:</p>

								<table>

								<tr>

								<th>Buffer sub-type</th>

								<th>Data pointer</th>

								<th>Resizable</th>

								<th>Memory managed by</th>

								<th>Description</th>

								</tr>

								<tr>

								<td>Fixed</td>

								<td>Stable, non-NULL</td>

								<td>No</td>

								<td>Duktape</td>

								<td>Buffer size is fixed at creation, memory is managed automatically by

								    Duktape.  Fixed buffers have an unchanging (stable) non-NULL data

								    pointer.</td>

								</tr>

								<tr>

								<td>Dynamic</td>

								<td>Unstable, may be NULL</td>

								<td>Yes</td>

								<td>Duktape</td>

								<td>Buffer size can be changed after creation, memory is managed

								    automatically by Duktape.  Requires two memory allocations internally

								    to allow resizing.  Data pointer may change if buffer is resized.

								    Zero-size buffer may have a NULL data pointer.</td>

								</tr>

								<tr>

								<td>External</td>

								<td>Unstable, may be NULL</td>

								<td>Yes</td>

								<td>Duktape and user code</td>

								<td>Buffer data is externally allocated.  Duktape allocates and manages

								    a heap allocated buffer header structure, while the data area pointer

								    and length are configured by user code explicitly.  External buffers

								    are useful to allow a buffer to point to data structures outside

								    Duktape heap, e.g. a frame buffer allocated by a graphics library.

								    Zero-size buffer may have a NULL data pointer.</td>

								</tr>

								</table>


								<p>Unlike strings, buffer data areas are not automatically NUL-terminated

								and calling code must never access bytes beyond the currently allocated

								buffer size.  The data pointer of a zero-size dynamic or external buffer

								may be NULL; fixed buffers always have a non-NULL data pointer.</p>


								<p>Fixed and dynamic buffers are automatically garbage collected.  This

								also means that C code must not hold onto a buffer data pointer unless the

								buffer is reachable to Duktape, e.g. resides in an active value stack.

								The data area of an external buffer is not automatically garbage collected

								so user code is responsible for managing its life cycle.  User code must also

								make sure that no external buffer value is accessed when the external buffer

								is no longer (or not yet) valid.</p>
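								<p>As a sketch of how an external buffer is tied to user-owned memory, the
								helper below points a buffer at a static array.  The array
								<code>framebuf</code> and the helper name are hypothetical; in real code the
								memory would typically come from another library:</p>

								<pre class="c-code">
#include "duktape.h"

/* Hypothetical user-owned storage; must outlive every use of the buffer. */
static unsigned char framebuf[64];

/* Push an external buffer and point it at 'framebuf'.  Returns the size
 * Duktape reports for the buffer left on the value stack top.
 */
static duk_size_t external_buffer_example(duk_context *ctx) {
    duk_size_t len = 0;
    void *p;

    duk_push_external_buffer(ctx);  /* no data area configured yet */
    duk_config_buffer(ctx, -1, (void *) framebuf, sizeof(framebuf));

    p = duk_get_buffer(ctx, -1, &amp;len);
    return (p == (void *) framebuf) ? len : 0;
}
								</pre>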


								<p>Plain buffers mimic Uint8Array objects to a large extent, so they are

								a non-standard, memory efficient alternative to working with e.g. ArrayBuffer

								and typed arrays.  The standard buffer objects are implemented on top of plain

								buffers, so that e.g. an ArrayBuffer will be backed by a plain buffer.  See

								<a href="#bufferobjects">Buffer objects</a> for more discussion.</p>


								<p>A few notes:</p>

								<ul>

								<li>Because the value written to a buffer index is number coerced, assigning

								a one-character string does not work as often expected.  For instance,

								<code>buf[123] = 'x'</code> causes zero to be written to the buffer, because

								ToNumber('x') is NaN, which becomes 0 when coerced to a byte.  For clarity,

								you should only assign number values, e.g. <code>buf[123] = 0x78</code>.</li>

								<li>There is a fast path for reading and writing numeric indices of plain buffer

								values, e.g. <code>x = buf[123]</code> or <code>buf[123] = x</code>.  A similar

								fast path exists for the different buffer object values.</li>

								<li>Buffer virtual properties are not currently implemented in

								<code>defineProperty()</code>, so you can't write to buffer indices or

								buffer <code>length</code> with <code>defineProperty()</code> at present (an

								attempt to do so results in a <code>TypeError</code>).</li>

								</ul>
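								<p>The coercion in the first note can be sketched in C.  This is a simplified
								model of the standard ToUint8 conversion, not Duktape's actual code: the
								function name is an assumption, and finite values outside roughly
								&plusmn;9e15 are treated as zero to keep the example short:</p>

								<pre class="c-code">
#include &lt;stdint.h&gt;

/* Simplified ToUint8: NaN maps to 0, other values are truncated toward
 * zero and wrapped modulo 256.
 */
static uint8_t to_uint8(double d) {
    int64_t i;

    /* NaN fails every comparison, so it is rejected here together with
     * infinities and values too large to convert to int64_t safely.
     */
    if (!(d &gt; -9.0e15 &amp;&amp; d &lt; 9.0e15)) {
        return 0;
    }
    i = (int64_t) d;               /* truncates toward zero */
    return (uint8_t) (i &amp; 0xff);   /* wraps modulo 256 */
}
								</pre>

								<p>Under this model a NaN input (the result of ToNumber('x')) yields 0,
								while 0x78 is stored unchanged and 257 wraps to 1.</p>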


								<h2 id="type-pointer">Pointer</h2>


								<p>The <b>pointer</b> type is a raw, uninterpreted C pointer, essentially

								a <code>void *</code>.  Pointers can be used to point to native objects (memory

								allocations, handles, etc), but because Duktape doesn't know their use, they

								are not automatically garbage collected.  You can, however, put one or more

								pointers inside an object and use the object finalizer to free the

								native resources related to the pointer(s).</p>
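								<p>A minimal sketch of the finalizer approach follows.  The property name
								<code>"ptr"</code> and both function names are assumptions chosen for the
								example; a real binding would usually use a less visible property key:</p>

								<pre class="c-code">
#include &lt;stdlib.h&gt;
#include "duktape.h"

/* Finalizer: the object being finalized is at index 0. */
static duk_ret_t native_finalizer(duk_context *ctx) {
    duk_get_prop_string(ctx, 0, "ptr");
    free(duk_get_pointer(ctx, -1));  /* NULL if missing; free(NULL) is a no-op */

    /* Clear the property so that a repeated finalizer call is harmless. */
    duk_push_pointer(ctx, NULL);
    duk_put_prop_string(ctx, 0, "ptr");
    return 0;
}

/* Push an object which owns a malloc()ed block of 'size' bytes. */
static void push_native_wrapper(duk_context *ctx, size_t size) {
    duk_push_object(ctx);
    duk_push_pointer(ctx, malloc(size));
    duk_put_prop_string(ctx, -2, "ptr");
    duk_push_c_function(ctx, native_finalizer, 1 /*nargs*/);
    duk_set_finalizer(ctx, -2);
}
								</pre>

								<p>The native block is then released when the wrapper object is garbage
								collected or, at the latest, when the heap is destroyed.</p>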


								<h2 id="type-lightfunc">Lightfunc</h2>


								<p>The <b>lightfunc</b> type is a plain Duktape/C function pointer and a small

								set of control flags packed into a single tagged value which requires no separate

								heap allocations.  The control flags (16 bits currently) encode:

								(1) number of stack arguments expected by the Duktape/C function (0 to 14 or

								varargs), (2) virtual <code>length</code> property value (0 to 15), and

								(3) a magic value (-128 to 127).</p>
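								<p>To show how the three fields fit into 16 bits (4 + 4 + 8), here is a
								sketch of one possible packing.  The bit layout, the macro, and the function
								names are assumptions for illustration; the actual internal layout is a
								Duktape implementation detail:</p>

								<pre class="c-code">
#include &lt;stdint.h&gt;

#define LF_NARGS_VARARGS 0x0f  /* hypothetical marker for "varargs" */

/* Pack magic (-128..127), length (0..15) and nargs (0..14, or
 * LF_NARGS_VARARGS) into 16 bits: 8 + 4 + 4.
 */
static uint16_t lf_pack(int magic, unsigned length, unsigned nargs) {
    return (uint16_t) ((((unsigned) magic &amp; 0xffu) &lt;&lt; 8) |
                       ((length &amp; 0x0fu) &lt;&lt; 4) |
                       (nargs &amp; 0x0fu));
}

/* Extract the magic value, sign-extending the 8-bit field. */
static int lf_magic(uint16_t flags) {
    int m = flags &gt;&gt; 8;
    return (m &gt;= 128) ? m - 256 : m;
}

static unsigned lf_length(uint16_t flags) { return (flags &gt;&gt; 4) &amp; 0x0fu; }
static unsigned lf_nargs(uint16_t flags) { return flags &amp; 0x0fu; }
								</pre>

								<p>The narrow field widths are what limit <code>length</code> to 0..15 and the
								argument count to 0..14 plus a varargs marker.</p>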


								<p>Lightfuncs are a separate tagged type in the Duktape C API, but behave mostly

								like Function objects for Ecmascript code.  They have significant limitations

								compared to ordinary Function objects, the most important being:</p>


								<ul>

								<li>Lightfuncs cannot hold own properties.  They have a fixed virtual <code>.name</code>

								    property which appears in tracebacks, and a virtual <code>.length</code> property.

								    Other properties are inherited from <code>Function.prototype</code>.</li>

								<li>Lightfuncs can be used as constructor functions, but cannot have a

								    <code>.prototype</code> property.  If you need to construct objects which

								    don't inherit from <code>Object.prototype</code> (the default), you need to

								    either (a) construct and return an instance explicitly in the constructor or

								    (b) explicitly override the internal prototype of the default instance in the

								    constructor.</li>

								<li>Lightfuncs can be used as accessor properties (getters/setters), but they

								    get converted to actual functions so that their memory advantage is lost.

								    See

								    <a href="https://github.com/svaarala/duktape/blob/master/tests/ecmascript/test-dev-lightfunc-accessor.js">test-dev-lightfunc-accessor.js</a>.</li>

								<li>Lightfuncs cannot have a finalizer as they are a primitive type and

								    don't have a reference count field or otherwise participate in garbage

								    collection.  See

								    <a href="https://github.com/svaarala/duktape/blob/master/tests/ecmascript/test-dev-lightfunc-finalizer.js">test-dev-lightfunc-finalizer.js</a>.</li>

								</ul>


								<p>Lightfuncs are useful for very low memory environments where the memory

								impact of ordinary Function objects matters.  See

								<a href="#functionobjects">Function objects</a> for more discussion.</p>