You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

506 lines
21 KiB

======================
duk_hobject algorithms
======================
Overview
========
Purpose
-------
This document and referenced sub-documents discuss, in detail, the internal
algorithms for dealing with objects, in particular for object property access.
These algorithms are based on the algorithm descriptions in the E5
specification, which have been refined towards the practical implementation
needs e.g. by combining multiple algorithms, inlining calls, and inlining
"exotic behaviors" (term borrowed from ES6).
The intent is to describe versions of the conceptual algorithms most suited
for implementation, without actually going into implementation level details.
Only complicated core algorithms and built-in methods are covered.
One important question is to identify the *exposed interface* of operations
invoked from concrete object-related expressions of Ecmascript code. These
primitives are also in almost 1:1 relationship with the internal bytecode
operations.
Call and constructor related algorithms (``[[Call]]`` and ``[[Construct]]``)
are covered in ``execution.rst``.
.. note:: This document is not completely up-to-date: some special behaviors
(like buffer and proxy objects) have been added on top of the algorithms
described here.
Related sections of E5 specification
------------------------------------
For raw property algorithms:
* E5 Section 8.12: the default algorithms
* E5 Section 8.6.2: one paragraph lists exotic behaviors (page 33, PDF page 43)
* E5 Section 10.6: arguments object
* E5 Section 15.5.5.2: String object
* E5 Section 15.3.4.5.3, 15.3.5.3, 15.3.5.4: Function object
* E5 Section 15.4.5.1: Array object
For other algorithms:
* E5 Section 8.10.4: FromPropertyDescriptor
* E5 Section 8.10.5: ToPropertyDescriptor
* E5 Section 8.7.1: GetValue
* E5 Section 8.7.2: PutValue
* E5 Section 11.2.1: property accessor expression
* E5 Section 11.13: assignment operators (note that several other places
also "evaluate" property accessor expressions)
* E5 Section 11.4.1: ``delete`` operator
* E5 Section 11.8.6: ``instanceof`` operator
* E5 Section 11.8.7: ``in`` operator
* E5 Section 15.2.3.3: ``Object.getOwnPropertyDescriptor()``
* E5 Section 15.2.3.6: ``Object.defineProperty()``
* E5 Section 15.2.3.7: ``Object.defineProperties()``
Algorithm overview
------------------
Ecmascript object property access behavior is described by internal property
handling algorithms. The default algorithms are described in E5 Section 8.12.
There are some objects with exotic behaviors; these have variants of the
default property handling algorithms. See ``hobject-design.rst`` for a
detailed discussion of exotic behaviors.
The "raw" property algorithms are:
* The default ``[[GetOwnProperty]](P)`` algorithm (E5 Section 8.12.1)
* Modified behavior for: String object (E5 Section 15.5.5.2)
* Modified behavior for: Arguments object created for non-strict functions (E5 Section 10.6)
* The default ``[[GetProperty]](P)`` algorithm (E5 Section 8.12.1)
* The default ``[[Get]](P)`` algorithm (E5 Section 8.12.1)
* Modified behavior for: Arguments object created for non-strict functions (E5 Section 10.6)
* Modified behavior for: Function object (E5 Section 15.3.5.4)
* Note that exotic behaviors of ``[[Get]]`` are '''not''' visible through property
descriptors at all.
* The default ``[[CanPut]](P)`` algorithm (E5 Section 8.12.1)
* The default ``[[Put]](P,V,Throw)`` algorithm (E5 Section 8.12.1)
* The default ``[[HasProperty]](P)`` algorithm (E5 Section 8.12.1)
* Modified behavior for: Function object (E5 Section 15.3.5.3)
* Modified behavior for: bound Function objects (E5 Section 15.3.4.5.3)
* The default ``[[Delete]](P,Throw)`` algorithm (E5 Section 8.12.1)
* Modified behavior for: Arguments object created for non-strict functions (E5 Section 10.6)
* The default ``[[DefineOwnProperty]](P,Desc,Throw)`` algorithm (E5 Section 8.12.1)
* Modified behavior for: Array object (E5 Section 15.4.5.1)
* Modified behavior for: Arguments object created for non-strict functions (E5 Section 10.6)
The following figure illustrates the caller-callee relationship between the
default property algorithms (the figure would be somewhat different for
exotic behavior variants)::
[[Put]]
|
|
.----+---------+
| | |
| v |
[[HasProperty]] [[Get]] | [[CanPut]] |
| | | | | |
`--------. | | | | |
| | | | | .---'
| | | | | |
v v v v | |
[[Delete]] [[GetProperty]] | | [[DefineOwnProperty]]
| | | | |
| | .-' | |
| | | .--' |
`---------. | | | .-----------'
| | | | |
v v v v v
[[GetOwnProperty]]
However, these algorithms are only the "raw" property handling algorithms.
Actual property access operations in Ecmascript code are "wrapped" by e.g.:
* The evaluation rules for the specific expression (e.g. property read).
These usually contain type checks, some coercions, etc.
* ``GetValue()`` and ``PutValue()`` (E5 Section 8.7) are used for property
property read/write operations in Ecmascript code. The algorithms are
wrappers for ``[[Get]]`` and ``[[Put]]`` which allow the base reference
to be a non-object, coercing it to a (temporary) object first. This allows
expressions like::
print("foo".length);
Important questions
-------------------
From an implementation perspective there are many questions which don't have
an easy answer in the E5 specification, e.g.:
* How are these internal algorithms visible to user code? This exposed
interface places hard requirements on viable implementation approaches,
whereas internal behavior can be implemented in several ways.
* What do the internal algorithms look like after you "inline" the calls
used in the specification (which often obscure the true semantics)?
* What do the internal algorithms look like if exotic behaviors are
"inlined" into one algorithm which supports both the default and all the
exotic behaviors?
* What do the internal algorithms look like once we add "fast paths" for
array index access (where the fast path avoids string interning if
possible)?
* What do the internal algorithms look like once we consider the internal
``duk_hobject`` representation (e.g. separation between entry and array
parts)?
The purpose of this document is to provide answers to these questions, and
act as a basis for implementing the rather tricky required behavior
accurately. The internal algorithms are discussed, inlined, and reformulated
to a more useful form. The sections are organized on the basis of practical
implementations needs, i.e. the context where internal algorithms are actually
needed.
What's not covered?
-------------------
This document does not go into full implementation detail in the algorithms.
Algorithms remain at a conceptual level.
In particular, the following are not covered:
* Relatively simple algorithms or built-in methods are not covered.
For instance, ``Object.defineProperty()`` is covered but
``Object.seal()`` is not, because ``Object.seal()`` is simple enough
to be implemented (and later verified) directly.
* Reference counts and reachability for garbage collection. These are
critical and sometimes difficult to implement correctly.
* Internal errors such as out-of-memory which may happen at any point
but are not mentioned in the algorithms. Where appropriate, the steps
in abstract algorithms *are* adjusted to minimize inconsistencies if
an internal error occurs.
* ``duk_hobject`` entry and array part separation, which affects all
operations dealing with properties.
* Error ``message`` strings for particular kinds of error. The E5
specification only mandates error type (its class) but never mandates
any texts.
* Concrete code structure or ordering; actual implementation may have a
slightly different structure.
Exposed interface
=================
What is an exposed interface?
-----------------------------
The relevant *exposed interface* is the set of object related operations
which can be invoked from Ecmascript code, e.g.::
// property write
o.foo = "bar";
// property read
print(o.foo);
// property deletion
delete o.foo;
// property existence check
print('foo' in o);
// object class membership test
print(x instanceof Array);
It also covers intricate built-in methods, such as::
var t = Object.getOwnPropertyDescriptor(o, 'foo');
Object.defineOwnProperty(o, 'foo', { value: 'bar' });
Object.defineProperties(o, {
foo: { value: 'bar' },
bar: { value: 'quux' }
});
These exposed primitives are discussed in this section.
Contexts and property algorithms used
-------------------------------------
The following table lists all contexts where property algorithms are
invoked from user code. (All ``Object`` built-in methods are listed
for completeness although not all of them invoke property algorithms.)
+---------------------------------------+---------------------------------+
| Context | Related algorithms / notes |
+=======================================+=================================+
| Property read | Property accessor reference, |
| | ``GetValue()`` for the |
| | reference, ``[[Get]]``. If |
| | base reference is not an object,|
| | ``[[GetProperty]]``, |
| | ``[[Call]]``. |
+---------------------------------------+---------------------------------+
| Property write | Property accessor reference, |
| | ``PutValue()`` for the |
| | reference, ``[[Put]]``. If |
| | base reference is not an object,|
| | ``[[CanPut]]``, |
| | ``[[GetOwnProperty]]``, |
| | ``[[GetProperty]]``, |
| | ``[[Call]]``. |
+---------------------------------------+---------------------------------+
| Property ``delete`` | Property accessor reference, |
| | ``[[Delete]]``. |
+---------------------------------------+---------------------------------+
| ``in`` | ``[[HasProperty]]``. |
+---------------------------------------+---------------------------------+
| ``instanceof`` | ``[[HasInstance]]``. |
+---------------------------------------+---------------------------------+
| "for-in" enumeration | Enumeration order guarantees, |
| | see ``hobject-design.rst``. |
+---------------------------------------+---------------------------------+
| ``Object.getPrototypeOf()`` | Just returns internal prototype.|
+---------------------------------------+---------------------------------+
| ``Object.getOwnPropertyDescriptor()`` | ``[[GetOwnProperty]]``, |
| | ``FromPropertyDescriptor()`` |
| | for a fully populated property |
| | descriptor. |
+---------------------------------------+---------------------------------+
| ``Object.getOwnPropertyNames()`` | Creates a result array, uses |
| | ``[[DefineOwnProperty]]`` |
| | internally. |
+---------------------------------------+---------------------------------+
| ``Object.create()`` | No direct use of property |
| | algorithms but conceptually |
| | calls |
| | ``Object.defineProperties()`` |
| | internally. |
+---------------------------------------+---------------------------------+
| ``Object.defineProperty()`` | ``ToPropertyDescriptor()``, |
| | ``[[DefineOwnProperty]]`` with |
| | an arbitrary descriptor. |
+---------------------------------------+---------------------------------+
| ``Object.defineProperties()`` | ``ToPropertyDescriptor()``, |
| | ``[[DefineOwnProperty[]`` with |
| | an arbitrary descriptor. |
+---------------------------------------+---------------------------------+
| ``Object.seal()`` | ``[[GetOwnProperty]]``, |
| | ``[[DefineOwnProperty]]``; |
| | sets ``[[Extensible]]`` to |
| | false. |
+---------------------------------------+---------------------------------+
| ``Object.freeze()`` | ``[[GetOwnProperty]]``, |
| | ``[[DefineOwnProperty]]``, |
| | sets ``[[Extensible]]`` to |
| | false. |
+---------------------------------------+---------------------------------+
| ``Object.preventExtensions()`` | Sets ``[[Extensible]]`` to |
| | false. |
+---------------------------------------+---------------------------------+
| ``Object.isSealed()`` | ``[[GetOwnProperty]]``, reads |
| | ``[[Extensible]]``. |
+---------------------------------------+---------------------------------+
| ``Object.isFrozen()`` | ``[[GetOwnProperty]]``, reads |
| | ``[[Extensible]]``. |
+---------------------------------------+---------------------------------+
| ``Object.isExtensible()`` | Reads ``[[Extensible]]``. |
+---------------------------------------+---------------------------------+
| ``Object.keys()`` | Key order must match "for-in" |
| | enumeration order. |
+---------------------------------------+---------------------------------+
| ``Object.prototype.hasOwnProperty()`` | ``[[GetOwnProperty]]`` |
| | (does *not* use |
| | ``[[HasProperty]]``) |
+---------------------------------------+---------------------------------+
Central exposed primitives
--------------------------
The central exposed primitives are as follows. Some have been given an
internal name which corresponds to the bytecode instruction:
* GETPROP: property read expression: coercion wrapping, ``GetValue()``,
``[[Get]]``, and a special ``[[Get]]`` variant if base is primitive
* PUTPROP: property write expression: coercion wrapping, ``PutValue()``,
``[[Put]]``, and a special ``[[Put]]`` variant if base is primitive
* DELPROP: ``delete`` operator: coercion wrapping, ``[[Delete]]``
* HASPROP: ``in`` operator: type check wrapping, ``[[HasProperty]]``
* INSTOF: ``instanceof`` operator: type check wrapping, ``[[HasInstance]]``
+ Not a property related primitive directly, but tied to the
prototype chain
* ``Object.getOwnPropertyDescriptor()``
+ But not ``[[GetOwnProperty]]`` or ``[[GetProperty]]`` directly
+ Not "fast path" so implementation should be compact
+ Only throwing variant (``Throw`` is ``true``)
* ``Object.defineProperty()`` and ``Object.defineProperties()``
+ But not ``[[DefineOwnProperty]]`` directly
+ Not "fast path" so implementation should be compact
+ Only throwing variant (``Throw`` is ``true``)
These are used to implement the basic property related run-time operations
and some difficult built-in functions. They are also used to implement
the C API and are also basic bytecode operations.
The remaining primitives (like ``Object.seal()`` etc) are trivial in
comparison, and are not analyzed in this document.
Notes
-----
* ``Object.getOwnPropertyDescriptor()``, ``Object.defineProperty()``,
and ``Object.defineProperties()`` are the only exposed interfaces where
property descriptors are explicitly exposed to user code, and also the
only places where property descriptors are converted between internal and
external forms. All other exposed interfaces deal with property
descriptors and attributes internally only. These methods always set the
``Throw`` flag to ``true``, so the exposed implementation only needs to
have a "throwing" variant.
* Property read and write handle a non-object base object with a specific
variant for the basic ``[[Get]]`` and ``[[Put]]``, defined in E5 Sections
8.7.1 and 8.7.2 (``GetValue()`` and ``PutValue()``). Property delete uses
a normal ``ToObject()`` coercion and then calls ``[[Delete]]`` normally.
Property existence check (``in``) does a type check and throws an error
if an argument is not already an object. So, coercion behavior is a bit
different in each context.
* Property References are established when parsing property access
expressions, E5 Section 11.2.1.
* Property References are used as right-hand-side values and read using
``GetValue()`` from various places:
+ Array initializer
+ Object initializer
+ Grouping operator (parentheses)
+ Property accessor (e.g. ``x['foo']['bar']``)
+ ``new`` operator
+ Function calls
+ Postfix and prefix increment/decrement
+ ``void`` operator
+ Unary operators (plus, minus, bitwise and logical NOT)
+ Binary operators (additive and multiplicative expressions, bitwise,
logical, and comparison operations)
+ ``instanceof`` operator
+ ``in`` operator
+ Conditional operator (``?:``)
+ Simple and compound assignment (right hand side)
+ Comma operator (``,``)
+ Variable declaration initializer
+ ``if``, ``do-while``, ``while``, ``for``, ``for-in``, ``with`` statements
+ ``throw`` statement
* Property references are used as left-hand-side values and written using
``PutValue()`` from various places:
+ Postfix and prefix increment/decrement
+ Simple and compound assignment
+ Variable declaration initializer
+ ``for`` and ``for-in`` statements
Detailed algorithms
===================
========================================== =================================================
Document Description
========================================== =================================================
``hobject-alg-preliminaries`` preliminary algorithm work
``hobject-alg-exoticbehaviors`` standard algorithms with exotic behaviors inlined
``hobject-alg-getprop`` property read
``hobject-alg-putprop`` property write
``hobject-alg-delprop`` property deletion
``hobject-alg-hasprop`` property existence check
``hobject-alg-instof`` ``instanceof`` operator
``hobject-alg-getownpropertydescriptor`` ``Object.getOwnPropertyDescriptor()``
``hobject-alg-defineproperty`` ``Object.defineProperty()``
``hobject-alg-defineproperties`` ``Object.defineProperties()``
========================================== =================================================
Future work
===========
* Add ES6 Proxy object or a Lua metatable-like mechanism and integrate it
into the Ecmascript algorithms in a natural way (``[[Get]]``, ``[[GetOwnProperty]]``,
``[[HasProperty]]``, and ``[[DefineOwnProperty]]`` most likely).
* Integrate other ES6 features into the basic object representation, with
possible some impact on these algorithms.
* Array fast path improvements for both reading of non-existent elements
and writing elements in general.
* Review the algorithms for robustness against internal errors such as
out of memory. The order of operations should be chosen to minimize
any inconsistency in state if an internal error occurs.