mirror of https://github.com/svaarala/duktape.git
Sami Vaarala
8 years ago
committed by
GitHub
20 changed files with 193 additions and 182 deletions
@ -0,0 +1,7 @@ |
|||
<div class="note"> |
|||
Symbol values are visible in the C API as strings, e.g. <code>duk_is_string()</code> |
|||
is true (this behavior is similar to Duktape 1.x internal strings). Symbols are |
|||
still an experimental feature. For now, you can distinguish Symbols from ordinary |
|||
strings by looking at their initial byte, see |
|||
<a href="https://github.com/svaarala/duktape/blob/master/doc/symbols.rst">symbols.rst</a>. |
|||
</div> |
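<p>As a rough illustration of the initial byte check mentioned in the note, the
sketch below tests only for the <code>0xFF</code> prefix that this documentation
uses in its hidden Symbol examples. The helper name
<code>has_hidden_symbol_prefix()</code> is hypothetical and not part of the
Duktape API, and other Symbol types use other initial bytes, so symbols.rst
remains the authoritative source for the formats. In a real binding the bytes
and length would typically come from <code>duk_get_lstring()</code>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Duktape API.  Returns 1 if the byte
 * string begins with 0xFF (the prefix used for hidden Symbols in the examples
 * of this documentation), 0 otherwise.  Assumption: only the 0xFF case is
 * checked; see symbols.rst for the other Symbol formats.
 */
static int has_hidden_symbol_prefix(const char *data, size_t len) {
    assert(data != NULL || len == 0);
    /* Cast through unsigned char: plain char may be signed. */
    return len > 0 && (unsigned char) data[0] == 0xffU;
}
```

<p>The length is passed explicitly because Symbol byte sequences are not
ordinary text and should not be measured with <code>strlen()</code> by habit.</p>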
@ -1,110 +0,0 @@ |
|||
<h1 id="internalproperties">Internal properties</h1> |
|||
|
|||
<p>Duktape supports non-standard <b>internal properties</b> which are |
|||
essentially hidden from user code. They can only be accessed by a |
|||
direct property read/write, and are never enumerated, serialized by |
|||
<code>JSON.stringify()</code> or returned from built-in functions such |
|||
as <code>Object.getOwnPropertyNames()</code>.</p> |
|||
|
|||
<p>Duktape uses internal properties for various implementation specific |
|||
purposes, such as storing an object's finalizer reference, the internal |
|||
value held by <code>Number</code> and <code>Date</code>, etc. User code |
|||
can also use internal properties for its own purposes, e.g. to |
|||
store "hidden state" in objects, as long as the property names never |
|||
conflict with current or future Duktape internal keys (this is ensured |
|||
by the naming convention described below). User code should never try |
|||
to access Duktape's internal properties: the set of internal properties |
|||
used can change arbitrarily between versions.</p> |
|||
|
|||
<p>Internal properties are distinguished from other properties by the |
|||
property key: if the byte representation of a property key begins with |
|||
a <code>0xFF</code> byte Duktape automatically treats the property as an |
|||
internal property. Such a string is referred to as an <b>internal string</b>. |
|||
The initial byte makes the key invalid UTF-8 (even invalid extended UTF-8), |
|||
which ensures that (1) internal properties never conflict with normal Unicode |
|||
property names and that (2) ordinary Ecmascript code cannot accidentally access |
|||
them. The initial prefix byte is often represented by an underscore in |
|||
documentation for readability, e.g. <code>_Value</code> is used instead |
|||
of <code>\xFFValue</code>.</p> |
|||
|
|||
<p>The following naming convention is used. The convention ensures that |
|||
Duktape and user internal properties never conflict:</p> |
<table>
<tr>
<th>Type</th>
<th>Example (C)</th>
<th>Bytes</th>
<th>Description</th>
</tr>
<tr>
<td>Duktape</td>
<td><code>"\xFF" "Value"</code></td>
<td><code>ff 56 61 6c 75 65</code></td>
<td>First character is always uppercase, followed by <code>[a-z0-9_]*</code>.</td>
</tr>
<tr>
<td>User</td>
<td><code>"\xFF" "myprop"</code></td>
<td><code>ff 6d 79 70 72 6f 70</code></td>
<td>First character must not be uppercase to avoid conflict with
current or future Duktape keys.</td>
</tr>
<tr>
<td>User</td>
<td><code>"\xFF\xFF" &lt;arbitrary&gt;</code></td>
<td><code>ff ff &lt;arbitrary&gt;</code></td>
<td>Double <code>0xFF</code> prefix followed by arbitrary data.</td>
</tr>
</table>

<p>In some cases the internal key needed by user code is not static, e.g.
it can be dynamically generated by serializing a pointer or perhaps the
bytes are from an external source. In this case it is safest to use
two <code>0xFF</code> prefix bytes as the example above shows.</p>

<div class="note">
Note that the <code>0xFF</code> prefix cannot be expressed as a valid
Ecmascript string. For example, the internal string <code>\xFFxyz</code>
would appear as the bytes <code>ff 78 79 7a</code> in memory, while the
Ecmascript string <code>"\u00ffxyz"</code> would be represented as the
CESU-8 bytes <code>c3 bf 78 79 7a</code> in memory.
</div>

<p>Creating an internal string is easy from C code:</p>
<pre class="c-code">
/* Create an internal string, which can then be used to read/write internal
 * properties, and can be passed on to Ecmascript code like any other string.
 * Terminating a string literal after a hex escape is safest to avoid some
 * ambiguous cases like "\xffab".
 */
duk_push_string(ctx, "\xff" "myprop");
</pre>

<p>For more discussion on C string hex escaping, see
<a href="https://github.com/svaarala/duktape/blob/master/misc/c_hex_esc.c">c_hex_esc.c</a>.</p>

<p>Internal strings cannot be created from Ecmascript code using the default
built-ins alone. However, application code can easily add such a binding
using the C API which must be considered in sandboxing.</p>

<p>There's no special access control for internal properties: if user code has
access to the property name (string), it can read/write the property value.
The default Ecmascript built-ins don't provide a way of creating an internal
string: buffer-to-string coercions always involve an encoding such as UTF-8
which will reject or replace invalid byte sequences. However, C code can
easily create internal strings. When sandboxing, ensure that custom C bindings
don't accidentally provide a mechanism to create internal strings by e.g.
converting a buffer as-is to a string.</p>

<p>As a concrete example the internal value of a <code>Date</code> instance
can be accessed as follows:</p>
<pre class="ecmascript-code">
// Print the internal timestamp of a Date instance. Assumes a hypothetical
// rawBufferToString() custom C binding which takes an input buffer and pushes
// the bytes as-is as a string using duk_push_lstring(), thus creating an
// internal string.

var key = rawBufferToString(Duktape.dec('hex', 'ff56616c7565')); // \xFFValue
var dt = new Date(123456);
print('internal value is:', dt[key]); // prints 123456
</pre>
@ -0,0 +1,58 @@ |
|||
<a name="internalproperties"></a> <!-- legacy links --> |
|||
<h1 id="symbols">Symbols</h1> |
|||
|
|||
<p>Duktape supports ES2015 Symbols and also provides a Duktape specific |
|||
<b>hidden Symbol</b> variant similar to internal strings in Duktape 1.x. |
|||
Hidden Symbols differ from ES2015 Symbols in that they're hidden from |
|||
ordinary Ecmascript code: they can't be created from Ecmascript code, |
|||
won't be enumerated or JSON-serialized, and won't be returned even from |
|||
<code>Object.getOwnPropertyNames()</code>. Properties with hidden Symbol |
|||
keys can only be accessed by a direct property read/write when holding a |
|||
reference to a hidden Symbol.</p> |
|||
|
|||
<p>Duktape uses hidden Symbols for various implementation specific purposes, |
|||
such as storing an object's finalizer reference. User code can also use hidden |
|||
Symbols for its own purposes, e.g. to store hidden state in objects. User code |
|||
should never try to access Duktape's hidden Symbol keyed properties: the set of |
|||
such properties can change arbitrarily between versions.</p> |

<p>Symbols of all kinds are represented internally using byte sequences which
are invalid UTF-8; see
<a href="https://github.com/svaarala/duktape/blob/master/doc/symbols.rst">symbols.rst</a>
for the current formats in use. When C code pushes a string using e.g.
<code>duk_push_string()</code> and the byte sequence matches an internal
Symbol format, the string value is automatically interpreted as a Symbol.</p>

<div class="note">
Note that the internal byte sequences, being invalid UTF-8, cannot be created
from Ecmascript code as a valid Ecmascript string. For example, a hidden Symbol
might be represented using <code>\xFFxyz</code>, i.e. the byte sequence
<code>ff 78 79 7a</code>, while the Ecmascript string <code>"\u00ffxyz"</code>
would be represented as the CESU-8 bytes <code>c3 bf 78 79 7a</code> in memory.
</div>
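<p>The byte values in the note can be reproduced with a few lines of C. The
sketch below encodes a code point below U+0800 the way UTF-8 does (CESU-8 is
identical to UTF-8 in this range), which shows why <code>"\u00ff"</code> becomes
<code>c3 bf</code> rather than a bare <code>ff</code> byte. The function name
<code>encode_small_codepoint()</code> is made up for this illustration and is
not a Duktape API.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (hypothetical name, not part of Duktape).  Encodes a
 * code point below U+0800 as UTF-8 into out[] and returns the number of
 * bytes written (1 or 2).  The caller must provide room for 2 bytes.
 */
static size_t encode_small_codepoint(unsigned int cp, unsigned char *out) {
    assert(cp < 0x800U && out != NULL);
    if (cp < 0x80U) {
        /* ASCII range: a single byte, unchanged. */
        out[0] = (unsigned char) cp;
        return 1;
    }
    /* Two-byte form: 110xxxxx 10xxxxxx. */
    out[0] = (unsigned char) (0xc0U | (cp >> 6));
    out[1] = (unsigned char) (0x80U | (cp & 0x3fU));
    return 2;
}
```

<p>For U+00FF the lead byte is <code>0xc0 | 0x03</code> and the continuation
byte is <code>0x80 | 0x3f</code>. No code point encodes to an initial
<code>ff</code> byte, which is why the hidden Symbol prefix cannot collide with
an ordinary Ecmascript string.</p>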

<p>Creating a Symbol is straightforward from C code:</p>
<pre class="c-code">
/* Create a hidden Symbol which can then be used to read/write properties.
 * The Symbol can be passed on to Ecmascript code like any other string or
 * Symbol.  Terminating a string literal after a hex escape is safest to
 * avoid some ambiguous cases like "\xffab".
 */
duk_push_string(ctx, "\xff" "mySymbol");
</pre>

<p>For more discussion on C string hex escaping, see
<a href="https://github.com/svaarala/duktape/blob/master/misc/c_hex_esc.c">c_hex_esc.c</a>.</p>
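<p>The reason for splitting the literal is that a C hex escape consumes every
hex digit that follows it, so <code>"\xffab"</code> is read as one escape with
the value <code>0xffab</code>, which does not fit in a <code>char</code>. The
following sketch, using the made-up names <code>hidden_key</code>,
<code>hidden_key_length()</code> and <code>key_byte()</code>, shows that the
split form yields exactly the three bytes <code>ff 61 62</code>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Adjacent string literals are concatenated after escapes are processed, so
 * the hex escape ends at the first closing quote.  Names are illustrative.
 */
static const char hidden_key[] = "\xff" "ab";

/* Length in bytes, excluding the terminating NUL. */
static size_t hidden_key_length(void) {
    return sizeof(hidden_key) - 1;
}

/* Byte at index i, as an unsigned value. */
static unsigned char key_byte(size_t i) {
    assert(i < sizeof(hidden_key) - 1);
    return (unsigned char) hidden_key[i];
}
```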

<p>Hidden Symbols cannot be created from Ecmascript code using the default
built-ins alone. Standard ES2015 Symbols can be created using the
<code>Symbol</code> built-in, e.g. as <code>Symbol.for('foo')</code>.
When sandboxing, ensure that application C bindings don't accidentally provide
a mechanism to create hidden Symbols by e.g. converting an input buffer as-is
to a string without applying an encoding.</p>
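<p>One possible way to follow this advice is to validate untrusted bytes before
a binding turns them into a string, for example before calling
<code>duk_push_lstring()</code>. The sketch below is a plain strict UTF-8
validator written for this illustration; <code>is_valid_utf8()</code> is a
hypothetical helper, not a Duktape API. It is deliberately conservative: it is
stricter than Duktape's own string representation (it rejects CESU-8 surrogate
encodings, for instance), so treat it as a starting point rather than the
recommended implementation.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Duktape.  Returns 1 if data[0..len) is
 * well-formed UTF-8 (no overlong forms, no surrogates, nothing above
 * U+10FFFF), 0 otherwise.  Any buffer starting with 0xFF is rejected, so the
 * result cannot be interpreted as a hidden Symbol using that prefix.
 */
static int is_valid_utf8(const unsigned char *data, size_t len) {
    size_t i = 0;
    size_t k, n;
    unsigned long cp, min;

    assert(data != NULL || len == 0);
    while (i < len) {
        unsigned char b = data[i];
        if (b < 0x80U) {
            i++;                      /* ASCII */
            continue;
        } else if ((b & 0xe0U) == 0xc0U) {
            n = 1; cp = b & 0x1fU; min = 0x80UL;
        } else if ((b & 0xf0U) == 0xe0U) {
            n = 2; cp = b & 0x0fU; min = 0x800UL;
        } else if ((b & 0xf8U) == 0xf0U) {
            n = 3; cp = b & 0x07U; min = 0x10000UL;
        } else {
            return 0;                 /* 0x80..0xbf and 0xf8..0xff */
        }
        if (len - i - 1 < n) {
            return 0;                 /* truncated sequence */
        }
        for (k = 1; k <= n; k++) {
            unsigned char c = data[i + k];
            if ((c & 0xc0U) != 0x80U) {
                return 0;             /* not a continuation byte */
            }
            cp = (cp << 6) | (c & 0x3fU);
        }
        if (cp < min || cp > 0x10ffffUL ||
            (cp >= 0xd800UL && cp <= 0xdfffUL)) {
            return 0;                 /* overlong, out of range, surrogate */
        }
        i += n + 1;
    }
    return 1;
}
```

<p>A binding would call the check first and throw an error, or substitute
replacement characters, when it fails, instead of pushing the bytes as-is.</p>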

<p>There's currently no special access control for properties with hidden
Symbol keys: if user code has access to the Symbol, it can read/write the
property value. This will most likely change in future major versions so
that Ecmascript code cannot access a property with a hidden Symbol key,
even when holding a reference to the hidden Symbol value.</p>