From 3e08c58b936b9393afe80e14866149f074a0afd8 Mon Sep 17 00:00:00 2001 From: Sami Vaarala Date: Mon, 19 Dec 2016 02:08:02 +0200 Subject: [PATCH 1/2] Website and API doc updates for Symbol coverage --- website/api/duk_del_prop.yaml | 7 +- website/api/duk_enum.yaml | 21 ++++- website/api/duk_get_prop.yaml | 7 +- website/api/duk_get_string.yaml | 2 + website/api/duk_get_type.yaml | 2 + website/api/duk_get_type_mask.yaml | 2 + website/api/duk_has_prop.yaml | 7 +- website/api/duk_is_string.yaml | 2 + website/api/duk_push_string.yaml | 9 ++- website/api/duk_put_prop.yaml | 7 +- website/api/duk_require_string.yaml | 2 + website/api/duk_to_string.yaml | 4 + website/api/symbols-are-strings.html | 7 ++ website/buildsite.py | 4 +- website/guide/custombehavior.html | 12 +-- website/guide/internalproperties.html | 110 -------------------------- website/guide/intro.html | 2 +- website/guide/stacktypes.html | 51 +++++++----- website/guide/symbols.html | 58 ++++++++++++++ 19 files changed, 160 insertions(+), 156 deletions(-) create mode 100644 website/api/symbols-are-strings.html delete mode 100644 website/guide/internalproperties.html create mode 100644 website/guide/symbols.html diff --git a/website/api/duk_del_prop.yaml b/website/api/duk_del_prop.yaml index b9349f58..d6ab7369 100644 --- a/website/api/duk_del_prop.yaml +++ b/website/api/duk_del_prop.yaml @@ -33,9 +33,10 @@ summary: |

If the target is a Proxy object which implements the deleteProperty diff --git a/website/api/duk_enum.yaml b/website/api/duk_enum.yaml index 4115ff76..a3637132 100644 --- a/website/api/duk_enum.yaml +++ b/website/api/duk_enum.yaml @@ -19,9 +19,20 @@ summary: | properties are enumerated - DUK_ENUM_INCLUDE_INTERNAL - Enumerate also internal properties, by default internal properties - are not enumerated + DUK_ENUM_INCLUDE_HIDDEN + Enumerate also hidden Symbols, by default hidden Symbols are not + enumerated. Use together with DUK_ENUM_INCLUDE_SYMBOLS. + In Duktape 1.x this flag was called DUK_ENUM_INCLUDE_INTERNAL. + + + DUK_ENUM_INCLUDE_SYMBOLS + Include Symbols in the enumeration result. Hidden Symbols are not + included unless DUK_ENUM_INCLUDE_HIDDEN is specified. + + + DUK_ENUM_EXCLUDE_STRINGS + Exclude strings from the enumeration result. By default strings are + included. DUK_ENUM_OWN_PROPERTIES_ONLY @@ -39,6 +50,10 @@ summary: | enumeration result rather than per inheritance level, this has the effect of sorting array indices (even when inherited) + + DUK_ENUM_NO_PROXY_BEHAVIOR + Enumerate a Proxy object itself without invoking Proxy behaviors. +

Without any flags the enumeration behaves like for-in: diff --git a/website/api/duk_get_prop.yaml b/website/api/duk_get_prop.yaml index 2c0ec53b..050ad849 100644 --- a/website/api/duk_get_prop.yaml +++ b/website/api/duk_get_prop.yaml @@ -34,9 +34,10 @@ summary: |

  • The target value is automatically coerced to an object. For instance, a string is converted to a String and you can access its "length" property.
  • -
  • The key argument is internally coerced to a string. There is - an internal fast path for arrays and numeric indices which avoids an - explicit string coercion, so use a numeric key when applicable.
  • +
  • The key argument is internally coerced using ToPropertyKey() + coercion which results in a string or a Symbol. There is an internal + fast path for arrays and numeric indices which avoids an explicit string + coercion, so use a numeric key when applicable.
  • If the target is a Proxy object which implements the get trap, diff --git a/website/api/duk_get_string.yaml b/website/api/duk_get_string.yaml index 75ca9723..499e6b48 100644 --- a/website/api/duk_get_string.yaml +++ b/website/api/duk_get_string.yaml @@ -21,6 +21,8 @@ summary: | this differs from how buffer data pointers are handled (for technical reasons). +

    + example: | const char *buf; diff --git a/website/api/duk_get_type.yaml b/website/api/duk_get_type.yaml index 0220c2f4..8d3cc4bb 100644 --- a/website/api/duk_get_type.yaml +++ b/website/api/duk_get_type.yaml @@ -11,6 +11,8 @@ summary: | DUK_TYPE_xxx or DUK_TYPE_NONE if idx is invalid.

    +
    + example: | if (duk_get_type(ctx, -3) == DUK_TYPE_NUMBER) { printf("value is a number\n"); diff --git a/website/api/duk_get_type_mask.yaml b/website/api/duk_get_type_mask.yaml index 69ea7e2e..32db2d3e 100644 --- a/website/api/duk_get_type_mask.yaml +++ b/website/api/duk_get_type_mask.yaml @@ -15,6 +15,8 @@ summary: | (the duk_check_type_mask() call is even more convenient for this purpose).

    +
    + example: | if (duk_get_type_mask(ctx, -3) & (DUK_TYPE_MASK_STRING | DUK_TYPE_MASK_NUMBER)) { diff --git a/website/api/duk_has_prop.yaml b/website/api/duk_has_prop.yaml index 5722ea02..6f6e2689 100644 --- a/website/api/duk_has_prop.yaml +++ b/website/api/duk_has_prop.yaml @@ -27,9 +27,10 @@ summary: |
  • The target value is automatically coerced to an object. For instance, a string is converted to a String and you can check for its "length" property.
  • -
  • The key argument is internally coerced to a string. There is - an internal fast path for arrays and numeric indices which avoids an - explicit string coercion, so use a numeric key when applicable.
  • +
  • The key argument is internally coerced using ToPropertyKey() + coercion which results in a string or a Symbol. There is an internal + fast path for arrays and numeric indices which avoids an explicit string + coercion, so use a numeric key when applicable.
  • If the target is a Proxy object which implements the has trap, diff --git a/website/api/duk_is_string.yaml b/website/api/duk_is_string.yaml index 5ffe2af5..e681041a 100644 --- a/website/api/duk_is_string.yaml +++ b/website/api/duk_is_string.yaml @@ -10,6 +10,8 @@ summary: |

    Returns 1 if value at idx is a string, otherwise returns 0. If idx is invalid, also returns 0.

    +
    + example: | if (duk_is_string(ctx, -3)) { /* ... */ diff --git a/website/api/duk_push_string.yaml b/website/api/duk_push_string.yaml index 3a9a6901..9f55d190 100644 --- a/website/api/duk_push_string.yaml +++ b/website/api/duk_push_string.yaml @@ -17,7 +17,14 @@ summary: | to the stack and NULL is returned. This behavior differs from duk_push_lstring on purpose.

    -

    C code should normally only push valid CESU-8 strings to the stack.

    +
    + C code should normally only push valid CESU-8 strings to the stack. + Some invalid CESU-8/UTF-8 byte sequences are reserved for special + uses such as representing Symbol values. When you push such an invalid + byte sequence, the value on the value stack will behave like a string for + C code but will appear as a Symbol for Ecmascript code. + See Symbols for more discussion. +

    If input string might contain internal NUL characters, use duk_push_lstring() instead.

    diff --git a/website/api/duk_put_prop.yaml b/website/api/duk_put_prop.yaml index 452aef40..72460720 100644 --- a/website/api/duk_put_prop.yaml +++ b/website/api/duk_put_prop.yaml @@ -35,9 +35,10 @@ summary: | transitory objects (see PutValue (V, W), step 7 of the special [[Put]] variant). -
  • The key argument is internally coerced to a string. There is - an internal fast path for arrays and numeric indices which avoids an - explicit string coercion, so use a numeric key when applicable.
  • +
  • The key argument is internally coerced using ToPropertyKey() + coercion which results in a string or a Symbol. There is an internal + fast path for arrays and numeric indices which avoids an explicit string + coercion, so use a numeric key when applicable.
  • If the target is a Proxy object which implements the set trap, diff --git a/website/api/duk_require_string.yaml b/website/api/duk_require_string.yaml index 4a4b8cc8..eabcca44 100644 --- a/website/api/duk_require_string.yaml +++ b/website/api/duk_require_string.yaml @@ -11,6 +11,8 @@ summary: | but throws an error if the value at idx is not a string or if the index is invalid.

    +
    + example: | const char *buf; diff --git a/website/api/duk_to_string.yaml b/website/api/duk_to_string.yaml index bc741d67..06bfdafc 100644 --- a/website/api/duk_to_string.yaml +++ b/website/api/duk_to_string.yaml @@ -14,6 +14,10 @@ summary: |
    +
    + ToString() coercion for a Symbol value causes a TypeError. +
    +
    In Duktape 2.x plain buffers mimic ArrayBuffer objects and will usually ToString() coerce to "[object ArrayBuffer]". To convert buffer or buffer diff --git a/website/api/symbols-are-strings.html b/website/api/symbols-are-strings.html new file mode 100644 index 00000000..50a17d79 --- /dev/null +++ b/website/api/symbols-are-strings.html @@ -0,0 +1,7 @@ +
    +Symbol values are visible in the C API as strings, e.g. duk_is_string() +is true (this behavior is similar to Duktape 1.x internal strings). Symbols are +still an experimental feature. For now, you can distinguish Symbols from ordinary +strings by looking at their initial byte, see +symbols.rst. +
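As a rough illustration of the initial-byte check mentioned in the note above, the sketch below tests for the 0xFF prefix that the guide uses in its hidden Symbol example. The helper name `looks_like_hidden_symbol` is hypothetical and not part of the Duktape API, and the sketch assumes only that one prefix. The complete and current set of Symbol formats is defined in symbols.rst and may change between versions.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the Duktape API.  Returns 1 if the
 * byte string starts with 0xFF, the prefix used in the guide's hidden
 * Symbol example, and 0 otherwise.  Other Symbol prefixes are
 * intentionally not covered here; see symbols.rst for the real formats.
 */
static int looks_like_hidden_symbol(const char *data, size_t len) {
    if (data == NULL || len == 0) {
        return 0;
    }
    /* Cast through unsigned char because plain char may be signed. */
    return ((unsigned char) data[0]) == 0xffU;
}
```

In an application the pointer and length would typically come from a call such as duk_get_lstring(), which returns the string data together with its byte length.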
    diff --git a/website/buildsite.py b/website/buildsite.py index f599b85c..ae58083b 100644 --- a/website/buildsite.py +++ b/website/buildsite.py @@ -955,7 +955,7 @@ def generateGuide(): navlinks.append(['#finalization', 'Finalization']) navlinks.append(['#coroutines', 'Coroutines']) navlinks.append(['#virtualproperties', 'Virtual properties']) - navlinks.append(['#internalproperties', 'Internal properties']) + navlinks.append(['#symbols', 'Symbols']) navlinks.append(['#bytecodedumpload', 'Bytecode dump/load']) navlinks.append(['#threading', 'Threading']) navlinks.append(['#sandboxing', 'Sandboxing']) @@ -1006,7 +1006,7 @@ def generateGuide(): res += processRawDoc('guide/finalization.html') res += processRawDoc('guide/coroutines.html') res += processRawDoc('guide/virtualproperties.html') - res += processRawDoc('guide/internalproperties.html') + res += processRawDoc('guide/symbols.html') res += processRawDoc('guide/bytecodedumpload.html') res += processRawDoc('guide/threading.html') res += processRawDoc('guide/sandboxing.html') diff --git a/website/guide/custombehavior.html b/website/guide/custombehavior.html index db5d88eb..343ecdf9 100644 --- a/website/guide/custombehavior.html +++ b/website/guide/custombehavior.html @@ -9,13 +9,13 @@ other relevant specifications.

    access to Duktape specific features. Also the buffer, pointer, and lightfunc types are custom.

    -

    Internal properties

    +

    Hidden Symbols

    -

    Objects may have internal properties which -are essentially hidden from normal code: they won't be enumerated or returned -even by e.g. Object.getOwnPropertyNames(). Ordinary Ecmascript -code cannot refer to such properties because the property keys intentionally -use invalid UTF-8 (0xFF prefix byte).

    +

    Objects may have properties with hidden Symbol keys. +These are similar to ES2015 Symbols but won't be enumerated or returned from even +Object.getOwnPropertySymbols(). Ordinary Ecmascript code cannot +refer to such properties because the keys intentionally use an invalid (extended) +UTF-8 representation.

    "use duk notail" directive

    diff --git a/website/guide/internalproperties.html b/website/guide/internalproperties.html deleted file mode 100644 index a7a9b2cf..00000000 --- a/website/guide/internalproperties.html +++ /dev/null @@ -1,110 +0,0 @@ -

    Internal properties

    - -

    Duktape supports non-standard internal properties which are -essentially hidden from user code. They can only be accessed by a -direct property read/write, and are never enumerated, serialized by -JSON.stringify() or returned from built-in functions such -as Object.getOwnPropertyNames().

    - -

    Duktape uses internal properties for various implementation specific -purposes, such as storing an object's finalizer reference, the internal -value held by Number and Date, etc. User code -can also use internal properties for its own purposes, e.g. to -store "hidden state" in objects, as long as the property names never -conflict with current or future Duktape internal keys (this is ensured -by the naming convention described below). User code should never try -to access Duktape's internal properties: the set of internal properties -used can change arbitrarily between versions.

    - -

    Internal properties are distinguished from other properties by the -property key: if the byte representation of a property key begins with -a 0xFF byte Duktape automatically treats the property as an -internal property. Such a string is referred to as an internal string. -The initial byte makes the key invalid UTF-8 (even invalid extended UTF-8), -which ensures that (1) internal properties never conflict with normal Unicode -property names and that (2) ordinary Ecmascript code cannot accidentally access -them. The initial prefix byte is often represented by an underscore in -documentation for readability, e.g. _Value is used instead -of \xFFValue.

    - -

    The following naming convention is used. The convention ensures that -Duktape and user internal properties never conflict:

    - - - - - - - - - - - - - - - - - - - - - - - - - -
    TypeExample (C)BytesDescription
    Duktape"\xFF" "Value"ff 56 61 6c 75 65First character is always uppercase, followed by [a-z0-9_]*.
    User"\xFF" "myprop"ff 6d 79 70 72 6f 70First character must not be uppercase to avoid conflict with -current or future Duktape keys.
    User"\xFF\xFF" <arbitrary>ff ff <arbitrary>Double 0xFF prefix followed by arbitrary data.
    - -

    In some cases the internal key needed by user code is not static, e.g. -it can be dynamically generated by serializing a pointer or perhaps the -bytes are from an external source. In this case it is safest to use -two 0xFF prefix bytes as the example above shows.

    - -
    -Note that the 0xFF prefix cannot be expressed as a valid -Ecmascript string. For example, the internal string \xFFxyz -would appear as the bytes ff 78 79 7a in memory, while the -Ecmascript string "\u00ffxyz" would be represented as the -CESU-8 bytes c3 bf 78 79 7a in memory. -
    - -

    Creating an internal string is easy from C code:

    -
    -/* Create an internal string, which can then be used to read/write internal
    - * properties, and can be passed on to Ecmascript code like any other string.
    - * Terminating a string literal after a hex escape is safest to avoid some
    - * ambiguous cases like "\xffab".
    - */
    -duk_push_string(ctx, "\xff" "myprop");
    -
    - -

    For more discussion on C string hex escaping, see -c_hex_esc.c.

    - -

    Internal strings cannot be created from Ecmascript code using the default -built-ins alone. However, application code can easily add such a binding -using the C API which must be considered in sandboxing.

    - -

    There's no special access control for internal properties: if user code has -access to the property name (string), it can read/write the property value. -The default Ecmascript built-ins don't provide a way of creating an internal -string: buffer-to-string coercions always involve an encoding such as UTF-8 -which will reject or replace invalid byte sequences. However, C code can -easily create internal strings. When sandboxing, ensure that custom C bindings -don't accidentally provide a mechanism to create internal strings by e.g. -converting a buffer as-is to a string.

    - -

    As a concrete example the internal value of a Date instance -can be accessed as follows:

    -
    -// Print the internal timestamp of a Date instance.  Assumes a hypothetical
    -// rawBufferToString() custom C binding which takes an input buffer and pushes
    -// the bytes as-is as a string using duk_push_lstring(), thus creating an
    -// internal string.
    -
    -var key = rawBufferToString(Duktape.dec('hex', 'ff56616c7565'));  // \xFFValue
    -var dt = new Date(123456);
    -print('internal value is:', dt[key]);  // prints 123456
    -
    diff --git a/website/guide/intro.html b/website/guide/intro.html index 869c52f8..b6f23adf 100644 --- a/website/guide/intro.html +++ b/website/guide/intro.html @@ -177,7 +177,7 @@ wrappers are discussed in detail.

    Finalization, Coroutines, Virtual properties, -Internal properties, +Symbols, Bytecode dump/load, Threading, Sandboxing. diff --git a/website/guide/stacktypes.html b/website/guide/stacktypes.html index 190f3ab8..013305f3 100644 --- a/website/guide/stacktypes.html +++ b/website/guide/stacktypes.html @@ -10,7 +10,7 @@ nullDUK_TYPE_NULLDUK_TYPE_MASK_NULLnull booleanDUK_TYPE_BOOLEANDUK_TYPE_MASK_BOOLEANtrue and false numberDUK_TYPE_NUMBERDUK_TYPE_MASK_NUMBERIEEE double -stringDUK_TYPE_STRINGDUK_TYPE_MASK_STRINGimmutable (plain) string +stringDUK_TYPE_STRINGDUK_TYPE_MASK_STRINGimmutable (plain) string or (plain) Symbol objectDUK_TYPE_OBJECTDUK_TYPE_MASK_OBJECTobject with properties bufferDUK_TYPE_BUFFERDUK_TYPE_MASK_BUFFERmutable (plain) byte buffer, fixed/dynamic/external; mimics an ArrayBuffer pointerDUK_TYPE_POINTERDUK_TYPE_MASK_POINTERopaque pointer (void *) @@ -172,17 +172,18 @@ come out. Don't rely on NaNs preserving their exact form.

    String

    -

    The string type is an arbitrary byte sequence of a certain length which -may contain internal NUL (0x00) values. Strings are always automatically NUL -terminated for C coding convenience. The NUL terminator is not counted as part -of the string length. For instance, the string "foo" has byte length 3 -and is stored in memory as { 'f', 'o', 'o', '\0' }. Because of the -guaranteed NUL termination, strings can always be pointed to using a simple -const char * as long as internal NULs are not an issue; if they are, -the explicit byte length of the string can be queried with the API. Calling code -can refer directly to the string data held by Duktape. Such string data -pointers are valid (and stable) for as long as a string is reachable in the -Duktape heap.

    +

    The string stack type is used to represent both plain strings and +plain Symbols (introduced in ES2015). A string is an arbitrary byte sequence +of a certain length which may contain internal NUL (0x00) values. Strings are +always automatically NUL terminated for C coding convenience. The NUL terminator +is not counted as part of the string length. For instance, the string +"foo" has byte length 3 and is stored in memory as +{ 'f', 'o', 'o', '\0' }. Because of the guaranteed NUL termination, +strings can always be pointed to using a simple const char * as long +as internal NULs are not an issue for the application; if they are, the explicit +byte length of the string can be queried with the API. Calling code can refer +directly to the string data held by Duktape. Such string data pointers are valid +(and stable) for as long as a string is reachable in the Duktape heap.

    Strings are interned for efficiency: only a single copy of a certain string ever exists at a time. @@ -212,13 +213,7 @@ characters as-is which is convenient for C code. For example:

    can be represented with UTF-8, and codepoints above that up to full 32 bits can be represented with extended UTF-8. -Non-standard strings are used for storing internal object properties; using a -non-standard string ensures that such properties never conflict with properties -accessible using standard Ecmascript strings. Non-standard strings can be given -to Ecmascript built-in functions, but since behavior may not be exactly -specified, results may vary.

    - -

    The extended UTF-8 encoding used by Duktape is described in the table below. +The extended UTF-8 encoding used by Duktape is described in the table below. The leading byte is shown in binary (with "x" marking data bits) while continuation bytes are marked with "C" (indicating the bit sequence 10xxxxxx):

    @@ -241,8 +236,22 @@ continuation bytes are marked with "C" (indicating the bit sequence 10xxxxxx):0xFE which conflicts with Unicode byte order marker encoding. This is not a practical concern in Duktape's internal use.

    -

    The leading 0xFF byte never appears in Duktape's extended UTF-8 -encoding, and is used to implement internal properties.

    +

    Finally, invalid extended UTF-8 byte sequences are used for special purposes +such as representing Symbol values. Invalid extended UTF-8/CESU-8 byte sequences +never conflict with standard Ecmascript strings (which are CESU-8) and will remain +cleanly separated within object property tables. For more information see +Symbols and +symbols.rst.

    + +

    Strings with invalid extended UTF-8 sequences can be pushed on the value stack +from C code and also passed to Ecmascript functions, with two caveats:

    +
      +
    • If the invalid byte sequence matches the internal format used to represent + Symbols, the value will appear as a Symbol rather than a string for Ecmascript + code. For example, typeof val will be symbol.
    • +
    • Behavior of string operations on invalid byte sequences is not well defined + and results may vary, and may change even in minor Duktape version updates.
    • +

    Object

    diff --git a/website/guide/symbols.html b/website/guide/symbols.html new file mode 100644 index 00000000..aad4b35d --- /dev/null +++ b/website/guide/symbols.html @@ -0,0 +1,58 @@ + +

    Symbols

    + +

    Duktape supports ES2015 Symbols and also provides a Duktape specific +hidden Symbol variant similar to internal strings in Duktape 1.x. +Hidden Symbols differ from ES2015 Symbols in that they're hidden from +ordinary Ecmascript code: they can't be created from Ecmascript code, +won't be enumerated or JSON-serialized, and won't be returned even from +Object.getOwnPropertySymbols(). Properties with hidden Symbol +keys can only be accessed by a direct property read/write when holding a +reference to a hidden Symbol.

    + +

    Duktape uses hidden Symbols for various implementation specific purposes, +such as storing an object's finalizer reference. User code can also use hidden +Symbols for its own purposes, e.g. to store hidden state in objects. User code +should never try to access Duktape's hidden Symbol keyed properties: the set of +such properties can change arbitrarily between versions.

    + +

    Symbols of all kinds are represented internally using byte sequences which +are invalid UTF-8; see +symbols.rst +for the current formats in use. When C code pushes a string using e.g. +duk_push_string() and the byte sequence matches an internal +Symbol format, the string value is automatically interpreted as a Symbol.

    + +
    +Note that the internal UTF-8 byte sequences cannot be created from Ecmascript +code as a valid Ecmascript string. For example, a hidden Symbol might be +represented using \xFFxyz, i.e. the byte sequence +ff 78 79 7a, while the Ecmascript string "\u00ffxyz" +would be represented as the CESU-8 bytes c3 bf 78 79 7a in memory. +
    + +
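The difference between the two byte sequences in the note above can be checked with a small hex dump. This is a minimal sketch in plain C. The helper name `bytes_to_hex` is an assumption made for illustration, and the two inputs are the byte sequences quoted in the note.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write len bytes of data into out as lowercase hex
 * pairs separated by single spaces.  out must have room for at least
 * 3 * len bytes (or 1 byte when len is 0), including the terminating NUL.
 */
static void bytes_to_hex(const char *data, size_t len, char *out) {
    size_t i;
    char *p = out;

    *p = '\0';
    for (i = 0; i < len; i++) {
        if (i > 0) {
            *p++ = ' ';
        }
        /* Cast through unsigned char so that bytes >= 0x80 are not
         * sign-extended when plain char is signed.
         */
        sprintf(p, "%02x", (unsigned int) (unsigned char) data[i]);
        p += 2;
    }
}
```

Calling the helper on `"\xff" "xyz"` (4 bytes) and on `"\xc3\xbf" "xyz"` (5 bytes) reproduces the two sequences from the note, which makes it easy to see that the CESU-8 encoding of U+00FF never starts with a raw 0xFF byte.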

    Creating a Symbol is straightforward from C code:

    +
    +/* Create a hidden Symbol which can then be used to read/write properties.
    + * The Symbol can be passed on to Ecmascript code like any other string or
    + * Symbol.  Terminating a string literal after a hex escape is safest to
    + * avoid some ambiguous cases like "\xffab".
    + */
    +duk_push_string(ctx, "\xff" "mySymbol");
    +
    + +

    For more discussion on C string hex escaping, see +c_hex_esc.c.
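The reason for splitting the literal can be shown without Duktape. In C a hex escape consumes every hex digit that follows it, so `"\xffab"` would be read as one escape with the value 0xffab instead of the byte 0xff followed by `a` and `b`. The sketch below uses a hypothetical key `"\xff" "ab"` chosen only to demonstrate this; the names `hidden_key` and `hidden_key_length` are illustrative.

```c
#include <stddef.h>

/* The hex escape ends together with the first literal, so 'a' and 'b'
 * in the second literal remain ordinary characters.  Adjacent string
 * literals are concatenated by the compiler into one string.
 */
static const char hidden_key[] = "\xff" "ab";

/* Length of the key in bytes, excluding the automatic NUL terminator. */
static size_t hidden_key_length(void) {
    return sizeof(hidden_key) - 1;
}
```

The resulting array holds three data bytes (0xff, 'a', 'b') plus the NUL terminator, which is the layout expected when the literal is passed to duk_push_string().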

    + +

    Hidden Symbols cannot be created from Ecmascript code using the default +built-ins alone. Standard ES2015 Symbols can be created using the +Symbol built-in, e.g. as Symbol.for('foo'). +When sandboxing, ensure that application C bindings don't accidentally provide +a mechanism to create hidden Symbols by e.g. converting an input buffer as-is +to a string without applying an encoding.
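One way to follow the sandboxing advice above is to check untrusted bytes before pushing them as a string. The following is a minimal sketch under stated assumptions, not Duktape's own validation logic: the function name `bytes_look_like_utf8` is hypothetical, and the check is structural only (lead bytes and continuation bytes). It does not detect every overlong form or unpaired surrogate, and it deliberately rejects the longer extended UTF-8 forms, which makes it stricter than what Duktape itself accepts.

```c
#include <stddef.h>

/* Hypothetical helper: return 1 if buf[0..len) consists only of
 * structurally well-formed 1 to 4 byte UTF-8/CESU-8 sequences, and 0
 * otherwise.  Any input starting with a byte that cannot begin such a
 * sequence (for example 0xFF or a bare continuation byte) is rejected.
 */
static int bytes_look_like_utf8(const unsigned char *buf, size_t len) {
    size_t i = 0;

    while (i < len) {
        unsigned char b = buf[i];
        size_t ncont;  /* number of continuation bytes expected */
        size_t k;

        if (b < 0x80) {
            ncont = 0;
        } else if (b >= 0xc2 && b <= 0xdf) {
            ncont = 1;
        } else if (b >= 0xe0 && b <= 0xef) {
            ncont = 2;
        } else if (b >= 0xf0 && b <= 0xf4) {
            ncont = 3;
        } else {
            return 0;  /* 0x80..0xC1 and 0xF5..0xFF cannot start a sequence */
        }

        if (len - i - 1 < ncont) {
            return 0;  /* truncated sequence */
        }
        for (k = 1; k <= ncont; k++) {
            if ((buf[i + k] & 0xc0) != 0x80) {
                return 0;  /* not a 10xxxxxx continuation byte */
            }
        }
        i += ncont + 1;
    }
    return 1;
}
```

A native binding could call this check on an input buffer and refuse the conversion (or throw an error) when it returns 0, instead of passing the bytes as-is to duk_push_lstring().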

    + +

    There's currently no special access control for properties with hidden +Symbol keys: if user code has access to the Symbol, it can read/write the +property value. This will most likely change in future major versions so +that Ecmascript code cannot access a property with a hidden Symbol key, +even when holding a reference to the hidden Symbol value.

    From 60bd97e216e11ad0aaf27d706f8bb6f5e2bfd55b Mon Sep 17 00:00:00 2001 From: Sami Vaarala Date: Tue, 20 Dec 2016 22:39:11 +0200 Subject: [PATCH 2/2] Sandboxing doc updates for symbols --- doc/sandboxing.rst | 59 ++++++++++++++++++++++++++-------------------- 1 file changed, 33 insertions(+), 26 deletions(-) diff --git a/doc/sandboxing.rst b/doc/sandboxing.rst index 816b279b..eb48ade4 100644 --- a/doc/sandboxing.rst +++ b/doc/sandboxing.rst @@ -25,6 +25,11 @@ carefully written with these sandboxing goals in mind. This document describes best practices for Duktape sandboxing. +There's a YAML config file with some useful default options for sandboxing, +and comments on what options you might consider: + +* ``config/examples/security_sensitive.yaml`` + .. note:: This document described the current status of sandboxing features which is not yet a complete solution. @@ -88,12 +93,12 @@ Verbose error messages may cause sandboxing security issues: * When ``DUK_USE_PARANOID_ERRORS`` is not set, offending object/key is summarized in an error message of some rejected property operations. If object keys contain potentially sensitive information, you should - enable this option. + enable this option. Enable ``DUK_USE_PARANOID_ERRORS``. * When stack traces are enabled an attacker may gain useful information from the stack traces. Further, access to the internal ``_Tracedata`` property provides access to call chain functions even when references to them are not - available directly. + available directly. Disable ``DUK_USE_TRACEBACKS``. Replace the global object ------------------------- @@ -124,30 +129,32 @@ Risky bindings: finalizers are a sandboxing risk. It's also possible to override or unset a finalizer which the sandbox relies on. -* Since Duktape 2.x buffer bindings no longer provide a way create "internal" - strings which allow access to internal properties. See separate section on - internal properties. 
 +* Since Duktape 2.x buffer bindings no longer provide a way to create hidden + Symbols (called "internal strings" in Duktape 1.x) which allow access to + internal properties. See separate section on internal properties. You should also: -* Remove the ``require`` module loading function in the global object. - If you need module loading in the sandbox, it's better to write a specific, +* Remove the ``require`` module loading function in the global object + (since Duktape 2.x it's no longer present by default). If you need + module loading in the sandbox, it's better to write a specific, constrained module loader for that environment. Restrict access to internal properties -------------------------------------- -Internal properties are intended to be used by Duktape and user C code -to store "hidden properties" in objects. The mechanism currently relies on -using strings whose internal representation contains invalid UTF-8/CESU-8 data, -in concrete terms, a 0xFF prefix. These are called "internal strings". Since +Internal properties are used by Duktape and user C code to store "hidden +properties" in objects. The mechanism currently relies on "hidden Symbols" +(called "internal keys" or "internal strings" in Duktape 1.x). These are +strings whose internal representation contains invalid UTF-8/CESU-8 data +(see ``doc/symbols.rst`` for description of the current formats). Because all standard Ecmascript strings are represented as CESU-8, such strings cannot normally be created by Ecmascript code. The properties are also never -enumerated or otherwise exposed to Ecmascript code, so that the only way to -access them from Ecmascript code is to have access to an "internal string" -acting as the property key. +enumerated or otherwise exposed to Ecmascript code (not even by +``Object.getOwnPropertySymbols()``) so that the only way to access them from +Ecmascript code is to have access to a hidden Symbol acting as the property key. 
-C code can create internal keys very easily, which can provide a way to access +C code can create hidden Symbols very easily, which can provide a way to access internal properties. For example:: // Assume an application native binding returns an internal key pushed @@ -165,18 +172,18 @@ be modified, concrete security issues may arise. For instance, if an internal property stores a raw pointer to a native handle (such as a ``FILE *``), changing its value can lead to a potentially exploitable segfault. -Since Duktape 2.x Ecmascript code cannot create internal keys using standard -Ecmascript code and the built-in bindings alone. To prevent access to internal -keys, ensure that no native bindings provided by the sandboxing environment +Since Duktape 2.x Ecmascript code cannot create hidden Symbols using standard +Ecmascript code and the built-in bindings alone. To prevent access to hidden +Symbols, ensure that no native bindings provided by the sandboxing environment accidentally return such strings. The easiest way to ensure this is to make sure all strings pushed on the value stack are properly CESU-8 encoded. It's also good practice to ensure that sandboxed code has minimal access to -objects with potentially dangerous keys like raw pointers. +objects with potentially dangerous properties like raw pointers. -.. note:: There's a future work issue, potentially included in Duktape 2.x, +.. note:: There's a future work issue, potentially included in Duktape 3.x, for preventing access to internal properties from Ecmascript code - even when using the correct internal key. + even when using the correct hidden Symbol as a lookup key. Restrict access to function instances ------------------------------------- @@ -234,9 +241,9 @@ string methods with a plain base value:: print("foo".toUpperCase()); -Duktape 1.0 will use the original built-in prototype functions in these -inheritance situations. 
 There is currently no way to replace these built-ins -so that the replacements would be used for instead (see +Duktape uses the original built-in prototype functions in these inheritance +situations. There is currently no way to replace these built-ins so that the +replacements would be used instead (see ``test-dev-sandbox-prototype-limitation.js``). As a result, sandboxed code will always have access to the built-in prototype @@ -261,7 +268,7 @@ objects which participate in implicit inheritance: through explicit construction (if constructors visible) or implicitly through internal errors, e.g. ``/foo\123/`` which throws a SyntaxError -* ``ArrayBuffer.prototype``: through buffer values (if available); since +* ``Uint8Array.prototype``: through buffer values (if available); since there is no buffer literal, user cannot construct buffer values directly * ``Duktape.Pointer.prototype`` through pointer values (if available); since @@ -367,7 +374,7 @@ vulnerabilities. To avoid such issues: must match; patch version may vary as bytecode format doesn't change in patch versions. -* Ensure integrity of bytecode being loaded e.g. by checksumming. +* Ensure integrity of bytecode being loaded e.g. by checksumming or signing. * If bytecode is transported over the network or other unsafe media, use cryptographic means (keyed hashing, signatures, or similar) to