Some places don't need the "fast path" macro tointegerns, either
because speed is not essential (lcode.c) or because the value is not
supposed to be an integer already (luaV_equalobj and luaG_tointerror).
Moreover, luaV_equalobj should always use F2Ieq, even if Lua is
compiled to "round to floor".
More uses of macros 'likely'/'unlikely' (renamed to
'l_likely'/'l_unlikely'), both in range (extended to the
libraries) and in scope (extended to hooks, stack growth).
To-be-closed variables are linked in their own list, embedded into the
stack elements. (Due to alignment, this information does not change
the size of the stack elements in most architectures.) This new list
does not produce garbage and avoids memory errors when creating tbc
variables.
Initial implementation to allow yields inside '__close' metamethods.
This current version still does not allow a '__close' metamethod
to yield when called due to an error. '__close' metamethods from
C functions also are not allowed to yield.
The stack size is derived from 'stack_last', when needed. Moreover,
the handling of stack sizes is more consistent, always excluding the
extra space except when allocating/deallocating the array.
The previous stackless implementations marked all 'luaV_execute'
invocations as fresh. However, re-entering 'luaV_execute' when
resuming a coroutine should not be a fresh invocation. (It works
because 'unroll' called 'luaV_execute' for each call entry, but
it was slower than letting 'luaV_execute' finish all non-fresh
invocations.)
A "with stack" implementation gains too little in performance to be
worth all the noise from C-stack overflows.
This commit is almost a sketch, to test performance. There are several
pending stuff:
- review control of C-stack overflow and error messages;
- what to do with setcstacklimit;
- review comments;
- review unroll of Lua calls.
The field 'L->oldpc' is not always updated when control returns to a
function; an invalid value can seg. fault when computing 'changedline'.
(One example is an error in a finalizer; control can return to
'luaV_execute' without executing 'luaD_poscall'.) Instead of trying to
fix all possible corner cases, it seems safer to be resilient to invalid
values for 'oldpc'. Valid but wrong values at most cause an extra call
to a line hook.
Errors in finalizers need a valid 'pc' to produce an error message,
even if the error is not propagated. Therefore, calls to the GC (which
may call finalizers) inside luaV_execute must save the 'pc'.
Macro 'checkstackGC' was doing a GC step after resizing the stack;
the GC could shrink the stack and undo the resize. Moreover, macro
'checkstackp' also does a GC step, which could remove the preallocated
CallInfo when calling a function. (Its name has been changed to
'checkstackGCp' to emphasize that it calls the GC.)
Instead of an explicit value (field 'b'), true and false use different
tag variants. This avoids reading an extra field and results in more
direct code. (Most code that uses booleans needs to distinguish between
true and false anyway.)
The difference in performance between immediate operands and K operands
does not seem to justify all those extra opcodes. We only keep OP_ADDI,
due to its ubiquity and because the difference is a little more relevant.
(Later, OP_SUBI will be implemented by OP_ADDI, negating the constant.)
In arithmetic/bitwise operators, the call to metamethods is made
in a separate opcode following the main one. (The main
opcode skips this next one when the operation succeeds.) This
change reduces slightly the size of the binary and the complexity
of the arithmetic/bitwise opcodes. It also simplfies the treatment
of errors and yeld/resume in these operations, as there are much
fewer cases to consider. (Only OP_MMBIN/OP_MMBINI/OP_MMBINK,
instead of all variants of all arithmetic/bitwise operators.)
The family of opcodes OP_ADDK (arithmetic operators with K constant)
were not being handled in 'luaV_finishOp', which completes their
task after an yield.
When initializing a to-be-closed variable, check whether it has a
'__close' metamethod (or is a false value) and raise an error if
if it hasn't. This produces more accurate error messages. (The
check before closing still need to be done: in the C API, the value
is not constant; and the object may lose its '__close' metamethod
during the block.)
Instead of updating 'L->top' in every place that may call a
metamethod, the metamethod functions themselves (luaT_trybinTM and
luaT_callorderTM) correct the top. (When calling metamethods from
the C API, however, the callers must preserve 'L->top'.)
- OP_NEWTABLE can use 'ra + 1' to set top (instead of ci->top);
- OP_CLOSE doesn't need to set top ('Protect' already does that);
- OP_TFORCALL must use 'ProtectNT', to preserve the top already set.
(That was a small bug, because iterators could be called with
extra parameters besides the state and the control variable.)
- Comments and an extra test for the bug in previous item.
Open upvalues are kept alive together with their corresponding
stack. This change makes a simpler and safer fix to the issue in
commit 440a5ee78c, about upvalues in the list of open upvalues
being collected while others are being created. (That previous fix
may not be correct.)
When creating an upvalue, an emergency collection can collect the
previous upvalue where the new one would be linked. The following
code can trigger the bug, using valgrind on Lua compiled with the
-DHARDMEMTESTS option:
local x; local y
(function () return y end)();
(function () return x end)()
This commit brings a new implementation for HARDMEMTESTS, which forces
an emergency GC whenever possible. It also fixes some issues detected
with this option:
- A small bug in lvm.c: a closure could be collected by an emergency
GC while being initialized.
- Some tests: a memory address can be immediatly reused after a GC;
for instance, two consecutive '{}' expressions can return exactly the
same address, if the first one is not anchored.
OP_RETURN must update trap before updating stack. (Bug detected with
-DHARDSTACKTESTS). Also, in 'luaF_close', do not create a variable
with 'uplevel(uv)', as the stack may change and invalidate this
value. (This is not a bug, but could become one if 'upl' was used
again.)
Opcodes OP_NEWTABLE and OP_SETLIST use the same representation to
store the size of the array part of a table. This new representation
can go up to 2^33 (8 + 25 bits).
OP_NEWTABLE is followed by an OP_EXTRAARG, so that it can keep
the exact size of the array part of the table to be created.
(Functions 'luaO_int2fb'/'luaO_fb2int' were removed.)
In the generic for loop, it is simpler for OP_TFORLOOP to use the
same 'ra' as OP_TFORCALL. Moreover, the internal names of the loop
temporaries "(for ...)" don't need to leak internal details (even
because the numerical for loop doesn't have a fixed role for each of
its temporaries).
Ensure that operation macros, such as 'luai_numdiv' and 'luai_numidiv',
operate only on variables, or at most at 's2v(ra)'. ('s2v' is a nop, a
cast from pointer to pointer.)