From 3e08c58b936b9393afe80e14866149f074a0afd8 Mon Sep 17 00:00:00 2001
From: Sami Vaarala If the target is a Proxy object which implements the Without any flags the enumeration behaves like If the target is a Proxy object which implements the
key
argument is internally coerced to a string. There is
- an internal fast path for arrays and numeric indices which avoids an
- explicit string coercion, so use a numeric key
when applicable.key
argument is internally coerced using ToPropertyKey()
+ coercion which results in a string or a Symbol. There is an internal
+ fast path for arrays and numeric indices which avoids an explicit string
+ coercion, so use a numeric key
when applicable.deleteProperty
diff --git a/website/api/duk_enum.yaml b/website/api/duk_enum.yaml
index 4115ff76..a3637132 100644
--- a/website/api/duk_enum.yaml
+++ b/website/api/duk_enum.yaml
@@ -19,9 +19,20 @@ summary: |
properties are enumerated
-
+ DUK_ENUM_INCLUDE_INTERNAL
- Enumerate also internal properties, by default internal properties
- are not enumerated
+ DUK_ENUM_INCLUDE_HIDDEN
+ Enumerate also hidden Symbols, by default hidden Symbols are not
+ enumerated. Use together with
+ DUK_ENUM_INCLUDE_SYMBOLS
.
+ In Duktape 1.x this flag was called DUK_ENUM_INCLUDE_INTERNAL
.
+
+ DUK_ENUM_INCLUDE_SYMBOLS
+ Include Symbols in the enumeration result. Hidden Symbols are not
+ included unless
+ DUK_ENUM_INCLUDE_HIDDEN
is specified.
+
DUK_ENUM_EXCLUDE_STRINGS
+ Exclude strings from the enumeration result. By default strings are
+ included.
+ DUK_ENUM_OWN_PROPERTIES_ONLY
@@ -39,6 +50,10 @@ summary: |
enumeration result rather than per inheritance level, this has the
effect of sorting array indices (even when inherited)
+
DUK_ENUM_NO_PROXY_BEHAVIOR
+ Enumerate a Proxy object itself without invoking Proxy behaviors.
+ for-in
:
diff --git a/website/api/duk_get_prop.yaml b/website/api/duk_get_prop.yaml
index 2c0ec53b..050ad849 100644
--- a/website/api/duk_get_prop.yaml
+++ b/website/api/duk_get_prop.yaml
@@ -34,9 +34,10 @@ summary: |
String
and you can access its
"length"
property.key
argument is internally coerced to a string. There is
- an internal fast path for arrays and numeric indices which avoids an
- explicit string coercion, so use a numeric key
when applicable.key
argument is internally coerced using ToPropertyKey()
+ coercion which results in a string or a Symbol. There is an internal
+ fast path for arrays and numeric indices which avoids an explicit string
+ coercion, so use a numeric key
when applicable.get
trap,
diff --git a/website/api/duk_get_string.yaml b/website/api/duk_get_string.yaml
index 75ca9723..499e6b48 100644
--- a/website/api/duk_get_string.yaml
+++ b/website/api/duk_get_string.yaml
@@ -21,6 +21,8 @@ summary: |
this differs from how buffer data pointers are handled (for technical reasons).
+ DUK_TYPE_xxx
or DUK_TYPE_NONE
if idx
is invalid.
duk_check_type_mask()
call is
even more convenient for this purpose).
+
+
example: |
if (duk_get_type_mask(ctx, -3) & (DUK_TYPE_MASK_STRING |
DUK_TYPE_MASK_NUMBER)) {
diff --git a/website/api/duk_has_prop.yaml b/website/api/duk_has_prop.yaml
index 5722ea02..6f6e2689 100644
--- a/website/api/duk_has_prop.yaml
+++ b/website/api/duk_has_prop.yaml
@@ -27,9 +27,10 @@ summary: |
String
and you can check for its
"length"
property.key
argument is internally coerced to a string. There is
- an internal fast path for arrays and numeric indices which avoids an
- explicit string coercion, so use a numeric key
when applicable.key
argument is internally coerced using ToPropertyKey()
+ coercion which results in a string or a Symbol. There is an internal
+ fast path for arrays and numeric indices which avoids an explicit string
+ coercion, so use a numeric key
when applicable.If the target is a Proxy object which implements the has
trap,
diff --git a/website/api/duk_is_string.yaml b/website/api/duk_is_string.yaml
index 5ffe2af5..e681041a 100644
--- a/website/api/duk_is_string.yaml
+++ b/website/api/duk_is_string.yaml
@@ -10,6 +10,8 @@ summary: |
Returns 1 if value at idx
is a string, otherwise
returns 0. If idx
is invalid, also returns 0.
NULL
is returned. This behavior differs from
duk_push_lstring
on purpose.
- C code should normally only push valid CESU-8 strings to the stack.
+Symbol
for Ecmascript code.
+ See Symbols for more discussion.
+ If input string might contain internal NUL characters, use
duk_push_lstring()
instead.
key
argument is internally coerced to a string. There is
- an internal fast path for arrays and numeric indices which avoids an
- explicit string coercion, so use a numeric key
when applicable.key
argument is internally coerced using ToPropertyKey()
+ coercion which results in a string or a Symbol. There is an internal
+ fast path for arrays and numeric indices which avoids an explicit string
+ coercion, so use a numeric key
when applicable.If the target is a Proxy object which implements the set
trap,
diff --git a/website/api/duk_require_string.yaml b/website/api/duk_require_string.yaml
index 4a4b8cc8..eabcca44 100644
--- a/website/api/duk_require_string.yaml
+++ b/website/api/duk_require_string.yaml
@@ -11,6 +11,8 @@ summary: |
but throws an error if the value at idx
is not a string
or if the index is invalid.
duk_is_string()
+is true (this behavior is similar to Duktape 1.x internal strings). Symbols are
+still an experimental feature. For now, you can distinguish Symbols from ordinary
+strings by looking at their initial byte, see
+symbols.rst.
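The initial-byte check mentioned above can be sketched in plain C. This is a minimal illustration only: `has_hidden_symbol_prefix()` is a hypothetical helper, not a Duktape API, and it tests only for the `0xFF` prefix that this patch uses for hidden Symbols. Other Symbol formats are described in symbols.rst and are not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Duktape API): returns 1 if the byte
 * string begins with the 0xFF prefix byte used for hidden Symbols in this
 * patch, and 0 otherwise.  An explicit length is taken because Duktape
 * strings may contain internal NUL bytes.
 */
static int has_hidden_symbol_prefix(const char *data, size_t len) {
    assert(data != NULL || len == 0);
    /* Cast to unsigned char so the comparison works when char is signed. */
    return len > 0 && (unsigned char) data[0] == 0xffU;
}
```

In an application the pointer and length would typically come from a call such as `duk_get_lstring()`; that wiring is left out so the sketch stays self-contained.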
+Objects may have internal properties which
-are essentially hidden from normal code: they won't be enumerated or returned
-even by e.g. Object.getOwnPropertyNames()
. Ordinary Ecmascript
-code cannot refer to such properties because the property keys intentionally
-use invalid UTF-8 (0xFF
prefix byte).
Objects may have properties with hidden Symbol keys.
+These are similar to ES2015 Symbols but won't be enumerated or returned from even
+Object.getOwnPropertySymbols()
. Ordinary Ecmascript code cannot
+refer to such properties because the keys intentionally use an invalid (extended)
+UTF-8 representation.
Duktape supports non-standard internal properties which are
-essentially hidden from user code. They can only be accessed by a
-direct property read/write, and are never enumerated, serialized by
-JSON.stringify()
or returned from built-in functions such
-as Object.getOwnPropertyNames()
.
Duktape uses internal properties for various implementation specific
-purposes, such as storing an object's finalizer reference, the internal
-value held by Number
and Date
, etc. User code
-can also use internal properties for its own purposes, e.g. to
-store "hidden state" in objects, as long as the property names never
-conflict with current or future Duktape internal keys (this is ensured
-by the naming convention described below). User code should never try
-to access Duktape's internal properties: the set of internal properties
-used can change arbitrarily between versions.
Internal properties are distinguished from other properties by the
-property key: if the byte representation of a property key begins with
-a 0xFF
byte Duktape automatically treats the property as an
-internal property. Such a string is referred to as an internal string.
-The initial byte makes the key invalid UTF-8 (even invalid extended UTF-8),
-which ensures that (1) internal properties never conflict with normal Unicode
-property names and that (2) ordinary Ecmascript code cannot accidentally access
-them. The initial prefix byte is often represented by an underscore in
-documentation for readability, e.g. _Value
is used instead
-of \xFFValue
.
The following naming convention is used. The convention ensures that
-Duktape and user internal properties never conflict:
-
-Type | Example (C) | Bytes | Description
----|---|---|---
-Duktape | "\xFF" "Value" | ff 56 61 6c 75 65 | First character is always uppercase, followed by [a-z0-9_]*.
-User | "\xFF" "myprop" | ff 6d 79 70 72 6f 70 | First character must not be uppercase to avoid conflict with current or future Duktape keys.
-User | "\xFF\xFF" <arbitrary> | ff ff <arbitrary> | Double 0xFF prefix followed by arbitrary data.
-
In some cases the internal key needed by user code is not static, e.g.
-it can be dynamically generated by serializing a pointer or perhaps the
-bytes are from an external source. In this case it is safest to use
-two 0xFF
prefix bytes as the example above shows.
0xFF
prefix cannot be expressed as a valid
-Ecmascript string. For example, the internal string \xFFxyz
-would appear as the bytes ff 78 79 7a
in memory, while the
-Ecmascript string "\u00ffxyz"
would be represented as the
-CESU-8 bytes c3 bf 78 79 7a
in memory.
-Creating an internal string is easy from C code:
-
-/* Create an internal string, which can then be used to read/write internal
- * properties, and can be passed on to Ecmascript code like any other string.
- * Terminating a string literal after a hex escape is safest to avoid some
- * ambiguous cases like "\xffab".
- */
-duk_push_string(ctx, "\xff" "myprop");
-
-
-
For more discussion on C string hex escaping, see -c_hex_esc.c.
-
-Internal strings cannot be created from Ecmascript code using the default
-built-ins alone. However, application code can easily add such a binding
-using the C API which must be considered in sandboxing.
-
-There's no special access control for internal properties: if user code has
-access to the property name (string), it can read/write the property value.
-The default Ecmascript built-ins don't provide a way of creating an internal
-string: buffer-to-string coercions always involve an encoding such as UTF-8
-which will reject or replace invalid byte sequences. However, C code can
-easily create internal strings. When sandboxing, ensure that custom C bindings
-don't accidentally provide a mechanism to create internal strings by e.g.
-converting a buffer as-is to a string.
- -As a concrete example the internal value of a Date
instance
-can be accessed as follows:
-// Print the internal timestamp of a Date instance. Assumes a hypothetical
-// rawBufferToString() custom C binding which takes an input buffer and pushes
-// the bytes as-is as a string using duk_push_lstring(), thus creating an
-// internal string.
-
-var key = rawBufferToString(Duktape.dec('hex', 'ff56616c7565')); // \xFFValue
-var dt = new Date(123456);
-print('internal value is:', dt[key]); // prints 123456
-
diff --git a/website/guide/intro.html b/website/guide/intro.html
index 869c52f8..b6f23adf 100644
--- a/website/guide/intro.html
+++ b/website/guide/intro.html
@@ -177,7 +177,7 @@ wrappers are discussed in detail.
 Finalization,
 Coroutines,
 Virtual properties,
-Internal properties,
+Symbols,
 Bytecode dump/load,
 Threading,
 Sandboxing.
diff --git a/website/guide/stacktypes.html b/website/guide/stacktypes.html
index 190f3ab8..013305f3 100644
--- a/website/guide/stacktypes.html
+++ b/website/guide/stacktypes.html
@@ -10,7 +10,7 @@
null
true
and false
The string type is an arbitrary byte sequence of a certain length which
-may contain internal NUL (0x00) values. Strings are always automatically NUL
-terminated for C coding convenience. The NUL terminator is not counted as part
-of the string length. For instance, the string "foo"
has byte length 3
-and is stored in memory as { 'f', 'o', 'o', '\0' }
. Because of the
-guaranteed NUL termination, strings can always be pointed to using a simple
-const char *
as long as internal NULs are not an issue; if they are,
-the explicit byte length of the string can be queried with the API. Calling code
-can refer directly to the string data held by Duktape. Such string data
-pointers are valid (and stable) for as long as a string is reachable in the
-Duktape heap.
The string stack type is used to represent both plain strings and
+plain Symbols (introduced in ES2015). A string is an arbitrary byte sequence
+of a certain length which may contain internal NUL (0x00) values. Strings are
+always automatically NUL terminated for C coding convenience. The NUL terminator
+is not counted as part of the string length. For instance, the string
+"foo"
has byte length 3 and is stored in memory as
+{ 'f', 'o', 'o', '\0' }
. Because of the guaranteed NUL termination,
+strings can always be pointed to using a simple const char *
as long
+as internal NULs are not an issue for the application; if they are, the explicit
+byte length of the string can be queried with the API. Calling code can refer
+directly to the string data held by Duktape. Such string data pointers are valid
+(and stable) for as long as a string is reachable in the Duktape heap.
Strings are interned for efficiency: only a single copy of a certain string
ever exists at a time.
@@ -212,13 +213,7 @@ characters as-is which is convenient for C code. For example:
 can be represented with UTF-8, and codepoints above that up to full 32 bits
 can be represented with extended UTF-8.
 
-Non-standard strings are used for storing internal object properties; using a
-non-standard string ensures that such properties never conflict with properties
-accessible using standard Ecmascript strings. Non-standard strings can be given
-to Ecmascript built-in functions, but since behavior may not be exactly
-specified, results may vary.
-
-The extended UTF-8 encoding used by Duktape is described in the table below.
+The extended UTF-8 encoding used by Duktape is described in the table below.
 The leading byte is shown in binary (with "x" marking data bits) while
 continuation bytes are marked with "C" (indicating the bit sequence 10xxxxxx):
@@ -241,8 +236,22 @@ continuation bytes are marked with "C" (indicating the bit sequence 10xxxxxx):
 the leading byte will be 0xFE
which conflicts with Unicode byte order
marker encoding. This is not a practical concern in Duktape's internal use.
-The leading 0xFF
byte never appears in Duktape's extended UTF-8
-encoding, and is used to implement internal properties.
Finally, invalid extended UTF-8 byte sequences are used for special purposes
+such as representing Symbol values. Invalid extended UTF-8/CESU-8 byte sequences
+never conflict with standard Ecmascript strings (which are CESU-8) and will remain
+cleanly separated within object property tables. For more information see
+Symbols and
+symbols.rst.
+ +Strings with invalid extended UTF-8 sequences can be pushed on the value stack +from C code and also passed to Ecmascript functions, with two caveats:
+typeof val
will be symbol
.Duktape supports ES2015 Symbols and also provides a Duktape specific
+hidden Symbol variant similar to internal strings in Duktape 1.x.
+Hidden Symbols differ from ES2015 Symbols in that they're hidden from
+ordinary Ecmascript code: they can't be created from Ecmascript code,
+won't be enumerated or JSON-serialized, and won't be returned even from
+Object.getOwnPropertyNames()
. Properties with hidden Symbol
+keys can only be accessed by a direct property read/write when holding a
+reference to a hidden Symbol.
Duktape uses hidden Symbols for various implementation specific purposes,
+such as storing an object's finalizer reference. User code can also use hidden
+Symbols for its own purposes, e.g. to store hidden state in objects. User code
+should never try to access Duktape's hidden Symbol keyed properties: the set of
+such properties can change arbitrarily between versions.
+ +Symbols of all kinds are represented internally using byte sequences which
+are invalid UTF-8; see
+symbols.rst
+for the current formats in use. When C code pushes a string using e.g.
+duk_push_string()
and the byte sequence matches an internal
+Symbol format, the string value is automatically interpreted as a Symbol.
\xFFxyz
, i.e. the byte sequence
+ff 78 79 7a
, while the Ecmascript string "\u00ffxyz"
+would be represented as the CESU-8 bytes c3 bf 78 79 7a
in memory.
+Creating a Symbol is straightforward from C code:
+
+/* Create a hidden Symbol which can then be used to read/write properties.
+ * The Symbol can be passed on to Ecmascript code like any other string or
+ * Symbol. Terminating a string literal after a hex escape is safest to
+ * avoid some ambiguous cases like "\xffab".
+ */
+duk_push_string(ctx, "\xff" "mySymbol");
+
+
+
For more discussion on C string hex escaping, see +c_hex_esc.c.
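The hex escape point can be illustrated without any Duktape calls. In C a `\x` escape consumes every hex digit that follows it, so `"\xffab"` is read as one out-of-range escape rather than the byte `ff` followed by `ab`. The sketch below, with the hypothetical names `hidden_key` and `hidden_key_length()`, ends the escape by closing the literal and relies on adjacent string literal concatenation.

```c
#include <assert.h>
#include <stddef.h>

/* Two adjacent literals are concatenated by the compiler.  Closing the first
 * literal right after the escape guarantees that "\xff" is exactly one byte,
 * whatever characters follow in the second literal.
 */
static const char hidden_key[] = "\xff" "mySymbol";

/* Hypothetical helper: byte length of the key, excluding the NUL terminator. */
static size_t hidden_key_length(void) {
    /* Sanity check: the first byte must be the 0xFF prefix. */
    assert((unsigned char) hidden_key[0] == 0xffU);
    return sizeof(hidden_key) - 1;
}
```

Here the key is 9 bytes long: the prefix byte plus the 8 characters of `mySymbol`. Splitting the literal is not strictly required when the next character is not a hex digit, but doing it consistently avoids having to think about each case.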
+ +Hidden Symbols cannot be created from Ecmascript code using the default
+built-ins alone. Standard ES2015 Symbols can be created using the
+Symbol
built-in, e.g. as Symbol.for('foo')
.
+When sandboxing, ensure that application C bindings don't accidentally provide
+a mechanism to create hidden Symbols by e.g. converting an input buffer as-is
+to a string without applying an encoding.
There's currently no special access control for properties with hidden
+Symbol keys: if user code has access to the Symbol, it can read/write the
+property value. This will most likely change in future major versions so
+that Ecmascript code cannot access a property with a hidden Symbol key,
+even when holding a reference to the hidden Symbol value.
From 60bd97e216e11ad0aaf27d706f8bb6f5e2bfd55b Mon Sep 17 00:00:00 2001 From: Sami Vaarala