mirror of https://github.com/svaarala/duktape.git
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
641 lines
20 KiB
641 lines
20 KiB
===========
|
|
Code issues
|
|
===========
|
|
|
|
This document covers C coding issues related to Duktape implementation
|
|
such as:
|
|
|
|
* Conventions
|
|
* Portability concerns
|
|
* Specific platforms and compilers
|
|
* Size and performance optimization issues
|
|
|
|
Conventions
|
|
===========
|
|
|
|
Indentantion, naming, etc
|
|
-------------------------
|
|
|
|
Indent with tab. On continuation lines indent with tab to shared indent
|
|
depth and then indent with spaces. For example, denoting tab indent with
|
|
colon and space indent with period::
|
|
|
|
::::::::snprintf(buf,
|
|
::::::::.........sizeof(buf),
|
|
::::::::........."%d",
|
|
::::::::.........123);
|
|
|
|
Names are lowercase, underscore separated::
|
|
|
|
void duk_func(void) {
|
|
/* ... */
|
|
}
|
|
|
|
Macros are uppercase, underscore separated::
|
|
|
|
#define DUK_MACRO(x) /* ... */
|
|
|
|
Macro names must not begin with an underscore. Macros which are of local
|
|
interest only can have a local name or have a double underscore after "DUK"::
|
|
|
|
/* 'foo' alternatives, not to be used directly */
|
|
#define DUK__FOO_ALT1 /* ... */
|
|
#define DUK__FOO_ALT2 /* ... */
|
|
|
|
/* select DUK_FOO provider */
|
|
#define DUK_FOO DUK_FOO_ALT2
|
|
|
|
There is only one space after a ``#define``, ``#ifdef``, etc, but there
|
|
may be multiple spaces between the a macro name and its definition. There
|
|
is no strict rule on the alignment of a macro value; successive definitions
|
|
usually keep values in the same column.
|
|
|
|
Comments are always traditional C comments, never ``//``::
|
|
|
|
/* always used traditional C comments */
|
|
|
|
Opening brace on the same line as the start of the construct, even
|
|
for functions::
|
|
|
|
void func(int x) {
|
|
if (x) {
|
|
/* ... */
|
|
} else {
|
|
/* ... */
|
|
}
|
|
}
|
|
|
|
The case-statements of a switch are at the same level as the switch
|
|
to reduce indent. If case clauses have their own blocks, this leads
|
|
to a confusing closing brace, so a comment for that may be in order::
|
|
|
|
switch (x) {
|
|
case A: {
|
|
/* ... */
|
|
break;
|
|
}
|
|
case B: {
|
|
/* ... */
|
|
break;
|
|
}
|
|
default: {
|
|
}
|
|
} /* switch */
|
|
|
|
Space after ``if``, ``switch``, etc::
|
|
|
|
if (x) { ... } /* correct */
|
|
if(x) { ... } /* incorrect */
|
|
|
|
switch (x) { ... } /* correct */
|
|
switch(x) { ... } /* incorrect */
|
|
|
|
Use of goto for error cleanup and shared error handling is not only
|
|
allowed but encouraged.
|
|
|
|
No naked statements in e.g. ``if-then-else``, always use a block.
|
|
This is more macro compatible. Example::
|
|
|
|
if (x) {
|
|
return 1; /* correct */
|
|
}
|
|
|
|
if (x)
|
|
return 1; /* incorrect */
|
|
|
|
Multi-statement macros should use a ``do-while(0)`` construct::
|
|
|
|
#define FROBNICATE(x,y) do { \
|
|
x = x * x; \
|
|
y = y * y; \
|
|
} while (0)
|
|
|
|
Use parentheses when referring to macro arguments and the final macro
|
|
result to minimize error proneness::
|
|
|
|
#define MULTIPLY(a,b) ((a)*(b))
|
|
|
|
/* Now MULTIPLY(1+2,3) expands to ((1+2)*(3)) == 9, not
|
|
* 1+2*3 == 7. Parentheses are used around macro result for
|
|
* similar reasons.
|
|
*/
|
|
|
|
Variable declarations
|
|
---------------------
|
|
|
|
C variables should only be declared in the beginning of the block. Although
|
|
this is usually not a portability concern but some older still compilers
|
|
require it. In particular, MSVC (at least Visual Studio 2010 Express) seems
|
|
to require this.
|
|
|
|
Be careful especially of assertions, debug prints, and other macros::
|
|
|
|
int x, y;
|
|
DUK_UNREF(y);
|
|
int flags = 0; /* problem: DUK_UNREF() */
|
|
|
|
Include guards
|
|
--------------
|
|
|
|
There are several popular include guard conventions. Leading underscores
|
|
are reserved and should be avoided in user code. The current include guard
|
|
convention is::
|
|
|
|
/* duk_foo.h */
|
|
|
|
#ifndef DUK_FOO_H_INCLUDED
|
|
#define DUK_FOO_H_INCLUDED
|
|
|
|
...
|
|
|
|
#endif /* DUK_FOO_H_INCLUDED */
|
|
|
|
See:
|
|
|
|
* http://en.wikipedia.org/wiki/Include_guard
|
|
|
|
``#pragma once`` is not portable, and is not used.
|
|
|
|
FIXME, TODO, XXX, NOTE markers
|
|
------------------------------
|
|
|
|
The following markers are used inside comments:
|
|
|
|
FIXME:
|
|
Issue should be fixed before a stable release. Does not block
|
|
an intermediate release.
|
|
|
|
TODO:
|
|
Issue should be fixed but does not block a release (even a stable
|
|
one).
|
|
|
|
XXX:
|
|
Like TODO, but it may be unclear what the proper fix is.
|
|
|
|
NOTE:
|
|
Noteworthy issue important for e.g. maintenance, but no action needed.
|
|
|
|
The markers must appear verbatim and be followed by a colon without
|
|
any space in between. This is important so that the markers can be
|
|
grep'd. Example::
|
|
|
|
/* FIXME: foo should have a different type */
|
|
|
|
Unused variables
|
|
----------------
|
|
|
|
Suppressing unused variable warnings use the following macro::
|
|
|
|
DUK_UNREF(my_unused_var);
|
|
|
|
Internally, this currently uses the form::
|
|
|
|
(void) my_unused_var; /* suppress warning */
|
|
|
|
This seems to work with both GCC and Clang. The form::
|
|
|
|
my_unused_var = my_unused_var; /* suppress warning */
|
|
|
|
works with GCC but not with Clang.
|
|
|
|
Unreachable code and "noreturn" functions
|
|
-----------------------------------------
|
|
|
|
Noreturn functions must have a void return type and are declared as::
|
|
|
|
DUK_NORETURN(void myfunc(void));
|
|
|
|
The macro style is awkward but is not easy to implement in another way.
|
|
|
|
Unreachable points in code are declared as::
|
|
|
|
DUK_UNREACHABLE();
|
|
|
|
Likely/unlikely comparisons
|
|
---------------------------
|
|
|
|
Providing "branch hints" may provide benefits on some platforms but not on
|
|
others. ``DUK_LIKELY()`` and ``DUK_UNLIKELY()`` can always be used in code,
|
|
and will be defined as a no-op if using branch hints on the target platform
|
|
is not possible or useful.
|
|
|
|
``DUK_UNLIKELY()`` should be used at least for conditions which are almost
|
|
never true, like invalid API call arguments, string size overflows, etc::
|
|
|
|
if (DUK_UNLIKELY(ptr == NULL)) {
|
|
/* ... */
|
|
}
|
|
|
|
Similarly, ``DUK_LIKELY()`` should be used for conditions which are almost
|
|
always true::
|
|
|
|
if (DUK_LIKELY(ptr != NULL)) {
|
|
/* ... */
|
|
}
|
|
|
|
The argument to these macros must be an integer::
|
|
|
|
/* correct */
|
|
if (DUK_LIKELY(ptr != NULL)) {
|
|
/* ... */
|
|
}
|
|
|
|
/* incorrect */
|
|
if (DUK_LIKELY(ptr)) {
|
|
/* ... */
|
|
}
|
|
|
|
C++ compatibility
|
|
-----------------
|
|
|
|
The source code is meant to be C++ compatible so that you can both:
|
|
|
|
1. Compile Duktape with C but use it from C++.
|
|
|
|
2. Compile Duktape with C++ and use it from C++.
|
|
|
|
To achieve this:
|
|
|
|
* Avoid variable names conflicting with C++ keywords (``throw``,
|
|
``class``, ``this``, etc).
|
|
|
|
* Use explicit casts for all pointer conversions.
|
|
|
|
(Compiling Duktape as a C++ program doesn't currently work fully due to
|
|
feature detection issues.)
|
|
|
|
Portability concerns
|
|
====================
|
|
|
|
Strict aliasing rules
|
|
---------------------
|
|
|
|
Strict aliasing rules and prohibition of dereferencing type-punned pointers
|
|
are good for portability so the implementation should adhere to the common
|
|
rules, e.g. use a union to convert between types. Sometimes this is not
|
|
straightforward. For instance, the indirect realloc approach currently in
|
|
use needs a getter callback to avoid type-punning.
|
|
|
|
Current goal is to compile and work without warnings even with strict
|
|
aliasing rules enforced.
|
|
|
|
Numeric types
|
|
-------------
|
|
|
|
C data types, especially integer types, are a bit of a hassle: the best choice
|
|
of types depends on the platform and the compiler, and also the C specification
|
|
version. Types also affect e.g. printf() and scanf() format specifiers which
|
|
are, of course, potentially compiler specific. To remain portable, (almost)
|
|
all C types are wrapped behind a typedef. Because both Duktape internals and
|
|
the public ``duktape.h`` header need type wrappers, the current approach is that
|
|
``duktape.h`` performs whatever feature detection is necessary to define types
|
|
needed in the public API. ``duk_features.h`` then completes the process for the
|
|
remaining internal types.
|
|
|
|
Basic rules in implementation:
|
|
|
|
* ``duktape.h`` and ``duk_features.h`` perform all the detection needed and
|
|
provide typedefs for types used in the public API and inside Duktape.
|
|
|
|
* C99 types are **not** used directly, wrapper types are used instead. For
|
|
instance, use ``duk_uint32_t`` instead of ``uint32_t``. Wrapper types are
|
|
used because we don't want to rely on C99 types or define them if they are
|
|
missing.
|
|
|
|
* Only use ``duk_XXX_t`` typedefs for integer types unless there is a special
|
|
reason not to. For instance, if a platform API requires a specific type,
|
|
that type must of course be used (or casted to).
|
|
|
|
* Use ``duk_size_t`` for internal uses of ``size_t``. Coerce it explicitly
|
|
to ``size_t`` for library API calls.
|
|
|
|
* Use ``duk_double_t`` for IEEE double precision float. This is slight
|
|
paranoia but may be handy if e.g. built-in soft float library is introduced.
|
|
|
|
* The ``void`` type is used as is, cannot imagine a reason why it would need
|
|
to be reassigned for portability.
|
|
|
|
* Use ``duk_int_t`` as an ``int`` replacement; it behaves like an ``int`` but,
|
|
unlike ``int``, is guaranteed to be at least 32 bits wide. Similarly
|
|
``duk_uint_t`` should be used as an ``unsigned int`` replacement.
|
|
|
|
* The ``duk_small_int_t`` should be used in internal code e.g. for flags.
|
|
It is guaranteed to be 16 bits or more. Similarly ``duk_small_uint_t``.
|
|
It's also used for boolean values.
|
|
|
|
* ``duk_uint8_t`` should be used as a replacement for ``unsigned char`` and
|
|
often for ``char`` too. Since ``char`` may be signed, it is often a
|
|
problematic type when comparing ranges, indexing lookup tables, etc, so
|
|
a ``char`` or a ``signed char`` is often not the best type. Note that
|
|
proper string comparison of UTF-8 character strings, for instance, relies
|
|
on unsigned byte comparisons.
|
|
|
|
* Integer constants should generally use ``L`` or ``UL`` suffix, i.e.
|
|
makes them ``long int`` or ``unsigned long int``, and they are
|
|
guaranteed to be 32 bits or more. Without a suffix integer constants
|
|
may be only 16 bits. 64-bit constants need ``LL`` or ``ULL`` suffix.
|
|
Small constants (16 bits or less) don't need a suffix and are still
|
|
portable. This is convenient for codepoint constants and such.
|
|
|
|
* Integer constant sign should match the type the constant is related to.
|
|
For instance, ``duk_codepoint_t`` is a signed type, so a signed constant
|
|
should be used. This is more than a style issue: suppose signed codepoint
|
|
``cp`` had value ``-1``. The comparison ``(cp < 0x7fL)`` is true while
|
|
the comparison ``(cp < 0x7fUL)`` is false because of C coercion rules.
|
|
|
|
* **FIXME:** Format specifiers are under work.
|
|
|
|
* **FIXME:** normal vs. fast variables: use tight values in structs,
|
|
"fast" values as e.g. loop counters in fast paths (character / byte
|
|
iteration loops etc)
|
|
|
|
* **FIXME**: flags field type (storage vs. internal APIs)
|
|
|
|
* **FIXME**: avoid casting when unnecessary
|
|
|
|
* **FIXME**: type names, end in _t for internal types, no _t for public API
|
|
|
|
Random notes:
|
|
|
|
* The public API uses types at least for these (example type in parentheses):
|
|
|
|
- allocation size, entry count, etc (size_t)
|
|
- Unicode codepoint (int_fast32_t)
|
|
- value stack index (int_fast32_t, ssize_t)
|
|
- value stack count (uint_fast32_t, size_t)
|
|
- flag field (uint_fast32_t)
|
|
- boolean flag (int)
|
|
- Ecmascript array index (uint_fast32_t)
|
|
- Ecmascript number (double)
|
|
- Void and char pointers; easy, but sign of ``char`` varies:
|
|
explicit use of ``unsigned char`` is more portable but
|
|
more verbose
|
|
|
|
* A large amount of code needs an integer type which is fastest on the platform
|
|
but still guaranteed to be 32 bits or more. The ``int`` type is NOT a good
|
|
choice because it may be 16 bits even on platforms with a 32-bit type and
|
|
even 32-bit registers (e.g. PureC on M68K). The ``long`` type is also not a
|
|
good choice as it may be too wide (e.g. GCC on x86-64, int is 32 bits while
|
|
long is 64 bits). For this use, there are two typedefs: ``duk_int_t`` and
|
|
``duk_uint_t``. For small integers, like flags, there are typedefs
|
|
``duk_small_int_t`` and ``duk_small_uint_t`` which explicitly indicate that
|
|
a small integer (16 bits or more) suffices.
|
|
|
|
* Exact 32-bit types are needed in some cases e.g. for Ecmascript semantics.
|
|
Also, 64-bit arithmetic emulation (implemented on 32 bit types) relies on
|
|
exact unsigned overflows / underflows. The wrapped C99 types are used in
|
|
these cases.
|
|
|
|
Alignment
|
|
---------
|
|
|
|
Platforms vary in their alignment requirements:
|
|
|
|
* Some platforms cause an error ("bus error") when alignment requirements
|
|
are violated. Such platforms may have unaligned access instructions but
|
|
unaligned accesses may need to be flagged to the compiler.
|
|
|
|
* Some platforms have slower unaligned accesses but which behave externally
|
|
just like aligned accesses. "Slower" may mean that an interrupt / trap
|
|
handler is invoked, at a considerable penalty.
|
|
|
|
* Some platforms support aligned and unaligned accesses with more or less
|
|
the same performance.
|
|
|
|
Alignment level may also vary, e.g. platform may require 4-byte alignment
|
|
for both 32-bit integers and IEEE doubles, or it may require 4-byte alignment
|
|
for 32-bit integers but 8-byte alignment for doubles, etc.
|
|
|
|
**FIXME: alignment is now guaranteed to 4 bytes on platforms where unaligned
|
|
accesses are not allowed/preferable.**
|
|
|
|
64-bit arithmetic
|
|
-----------------
|
|
|
|
Some compilers on 32-bit platforms may have 64-bit arithmetic problems
|
|
(this seems to be the case with VBCC for example). There are also older
|
|
compiles with no 64-bit support at all.
|
|
|
|
Duktape must compile with only 32-bit operations if necessary, so
|
|
replacements are needed in the few places where 32 bits are not enough.
|
|
|
|
Integer overflows
|
|
-----------------
|
|
|
|
Signed integer overflows are undefined behavior:
|
|
|
|
* https://www.securecoding.cert.org/confluence/display/seccode/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow?showComments=false
|
|
|
|
At least unsigned overflow handling is important, as it is needed to make
|
|
"add with carry" etc convenient.
|
|
|
|
Detecting overflow in simple addition is straightforward when unsigned
|
|
integer type bit size is exact::
|
|
|
|
duk_uint32_t x, y, z;
|
|
/* ... */
|
|
z = x + y;
|
|
if (z < x) {
|
|
/* Overflow: (z < x) or equivalently (z < y) cannot be true unless
|
|
* overflow occurs. This relies on unsigned overflow behavior and
|
|
* an exact bit size for the type.
|
|
*/
|
|
}
|
|
|
|
Detecting overflow in multiplication is a bit trickier. This comes up
|
|
e.g. in array join/concat helper when it computes the combined size of
|
|
separators (separator_size times separator_count). The check is easy
|
|
if a larger type is available::
|
|
|
|
duk_uint32_t x, y, z;
|
|
duk_uint64_t t;
|
|
|
|
t = (duk_uint64_t) x * (duk_uint64_t) y;
|
|
if (t >= (duk_uint64_t) LIMIT) {
|
|
/* Overflow. */
|
|
}
|
|
z = (duk_uint32_t) t;
|
|
|
|
However, for portability a 64-bit type cannot (for instance) be assumed.
|
|
The following approach works without a larger temporary type, but is
|
|
conservative and may indicate overflow even when one wouldn't occur::
|
|
|
|
/*
|
|
* Basic idea:
|
|
*
|
|
* x * y > limit // limit is e.g. 2**32-1
|
|
* <=> x > limit / y // y != 0
|
|
* <=> y > limit / x // equivalent, x != 0
|
|
*
|
|
* When a truncating division is used on the right size, the result
|
|
* is no longer equivalent:
|
|
*
|
|
* x > floor(limit / y) <== x > limit / y // not ==>
|
|
*
|
|
* Limit must fit into the integer type.
|
|
*/
|
|
|
|
duk_uint32_t x, y, z;
|
|
|
|
if (y != 0 && x > (duk_uint32_t) 0xffffffffU / y) {
|
|
/* Probable overflow. */
|
|
}
|
|
z = x * y;
|
|
|
|
For 32-bit types the check is actually exact, see test in::
|
|
|
|
misc/c_overflow_test.py
|
|
|
|
String handling
|
|
---------------
|
|
|
|
snprintf buffer size
|
|
::::::::::::::::::::
|
|
|
|
NUL terminator behavior for snprintf() (and its friends) is inconsistent
|
|
across implementations. Some ensure a NUL terminator added when truncated
|
|
(unless of course the buffer size is zero) while others do not.
|
|
The most portable way seems to be to::
|
|
|
|
char buf[256];
|
|
snprintf(buf, sizeof(buf), "format", args);
|
|
buf[sizeof(buf) - 1] = (char) 0;
|
|
|
|
Using sizeof(buf) - 1 for size may cause a NUL terminator to appear at
|
|
the second to last character of buf in some implementations.
|
|
|
|
Examples of snprintf() calls which don't NUL terminate on truncation:
|
|
|
|
* Windows ``_snprintf()``: http://msdn.microsoft.com/en-us/library/2ts7cx93.aspx
|
|
|
|
s(n)printf %s and NULL value
|
|
::::::::::::::::::::::::::::
|
|
|
|
Giving a NULL argument to ``%s`` format string may cause a segfault in some
|
|
old compilers. Avoid NULL values for ``%s``.
|
|
|
|
Use of sprintf vs. snprintf
|
|
:::::::::::::::::::::::::::
|
|
|
|
Use snprintf instead of sprintf by default, even when legal output size is
|
|
known beforehand. There can always be bugs in the underlying standard library
|
|
implementation. Sometimes the output size is known to be limited because
|
|
input values are known to be constrained (e.g. year values are kept between
|
|
[-999999,999999]). However, if there is a bug, it's better to corrupt a
|
|
printed output value than to cause a memory error.
|
|
|
|
Other considerations
|
|
====================
|
|
|
|
Const qualifiers for tables
|
|
---------------------------
|
|
|
|
Using ``const`` for tables allows tables to compiled into the text section.
|
|
This is important on some embedded platforms where RAM is tight but there
|
|
is more space for code and fixed data.
|
|
|
|
Feature defines
|
|
===============
|
|
|
|
Almost all feature detection is concentrated into ``duk_features.h`` which
|
|
considers inputs from various sources:
|
|
|
|
* ``DUK_OPT_xxx`` defines, which allow a user to request a specific feature
|
|
or provide a specific value (such as traceback depth)
|
|
|
|
* Compiler and platform specific defines and features
|
|
|
|
As a result, ``duk_features.h`` defines ``DUK_USE_xxx`` macros which enable
|
|
and disable specific features and provide parameter values (such as traceback
|
|
depth). These are the **only** feature defines which should be used in
|
|
internal Duktape code.
|
|
|
|
The only exception so far is ``DUK_PANIC_HANDLER()`` in ``duk_error.h`` which
|
|
can be directly overridden by the user if necessary.
|
|
|
|
This basic approach is complicated a bit by the fact that ``duktape.h`` must
|
|
do some minimal platform feature detection to ensure that the public API uses
|
|
the correct types, etc. These are coordinated with ``duk_features.h``;
|
|
``duk_features.h`` either uses whatever ``duktape.h`` ended up using, or does
|
|
its own checking and ensures the two are consistent.
|
|
|
|
When adding specific hacks and workarounds which might not be of interest
|
|
to all users, add a ``DUK_OPT_xxx`` flag for them and translate it to a
|
|
``DUK_USE_xxx`` flag in ``duk_features.h``. If the ``DUK_OPT_xxx`` flag
|
|
is absent, the custom behavior MUST NOT be enabled.
|
|
|
|
Platforms and compilers
|
|
=======================
|
|
|
|
VBCC
|
|
----
|
|
|
|
Even in C99 mode VBCC 0.9b:
|
|
|
|
* Does not have ``inttypes.h``.
|
|
|
|
* Does not have ``fpclassify()`` and friends.
|
|
|
|
* Does not have ``NAN`` or ``INFINITY``.
|
|
|
|
* The expression ``0.0 / 0.0`` causes a warning and results in ``0.0``
|
|
instead of ``NaN`` as expected.
|
|
|
|
* The expression ``1.0 / 0.0`` causes a warning and results in ``0.0``
|
|
instead of infinity as expected.
|
|
|
|
The following program demonstrates the NaN issue::
|
|
|
|
#include <stdio.h>
|
|
|
|
void main(void) {
|
|
double z = 0.0;
|
|
double t;
|
|
volatile union {
|
|
double d;
|
|
unsigned char b[8];
|
|
} u;
|
|
int i;
|
|
|
|
/* this results in 0.0 */
|
|
t = 0.0 / 0.0;
|
|
printf("result: %lf\n", t);
|
|
|
|
/* this results in NaN */
|
|
t = z / z;
|
|
printf("result: %lf\n", t);
|
|
|
|
u.d = t;
|
|
for (i = 0; i < 8; i++) {
|
|
printf("%02x\n", u.b[i]);
|
|
}
|
|
}
|
|
|
|
To work with compiler optimization, the above approach needs to have the
|
|
``double`` values in ``volatile`` variables. Otherwise VBCC will end up
|
|
replacing the result with zero. So something like this is probably safest::
|
|
|
|
volatile double a = 0.0;
|
|
volatile double b = 0.0;
|
|
double t = a / b; /* -> NaN */
|
|
|
|
tcc
|
|
---
|
|
|
|
Tcc has trouble with negative zeroes. See ``misc/tcc_zerosign1.c``. For
|
|
instance:
|
|
|
|
* Assign d = 0.0
|
|
|
|
* Assign d = -d
|
|
|
|
* Now d should be a negative zero, but in tcc (with default options) it
|
|
has not changed sign: the memory dump verified this, signbit() returns
|
|
zero, etc.
|
|
|
|
This happens at least in tcc versions 0.9.25, 0.9.26.
|
|
|
|
clang
|
|
-----
|
|
|
|
Clang has some issues with union aliasing. See ``misc/clang_aliasing.c``.
|
|
|