
add some internal docs, improve READMEs, fix dist script

Sami Vaarala 11 years ago
parent
commit
09d5d5e7fd
  1. 22
      README.txt.dist
  2. 199
      doc/testcases.txt
  3. 60
      doc/uri.txt
  4. 9
      make_full.sh

22
README.txt.dist

@ -2,12 +2,12 @@
Duktape
=======
Duktape is a small and portable Ecmascript E5/E5.1 implementation.
It is intended to be easily embeddable into C programs, with a C API
similar in spirit to Lua's.
Duktape is a small and portable Ecmascript E5/E5.1 implementation. It is
intended to be easily embeddable into C programs, with a C API similar in
spirit to Lua's.
The goal is to support the full E5 feature set like Unicode strings
and regular expressions. Other feature highlights include:
The goal is to support the full E5 feature set like Unicode strings and
regular expressions. Other feature highlights include:
* Custom types (like pointers and buffers) for C integration
@ -45,13 +45,13 @@ To build an example command line tool, use the following::
Hello world!
= undefined
The source code should currently compile cleanly on Linux and OSX
(Darwin), for both x86 and ARM. The goal is of course to compile
on almost any reasonable platform.
The source code should currently compile cleanly on Linux, OSX (Darwin), and
FreeBSD, for both x86 and ARM. The goal is of course to compile on almost
any reasonable platform.
There is a separate tar ball for developing Duktape: it contains
internal documentation and unit tests which are not necessary to
use Duktape.
There is a separate tar ball ("full distribution") for developing Duktape.
It contains internal documentation and unit tests which are not necessary
to use Duktape.
Duktape is licensed under the MIT license (see ``LICENSE.txt``).
MurmurHash2 is used internally; it is also under the MIT license.

199
doc/testcases.txt

@ -0,0 +1,199 @@
==========
Test cases
==========
Introduction
============
There are two separate test case sets for Duktape:
1. Ecmascript test cases for testing Ecmascript compliance
2. Duktape API test cases for testing that the exposed user API works
Ecmascript test cases
=====================
How to test?
------------
There are many unit testing frameworks for Ecmascript such as `Jasmine`_
(see also `List of unit testing frameworks`_). However, when testing an
Ecmascript *implementation*, a testing framework cannot always assume
that even simple language features like functions or exceptions work
correctly.
How to do automatic testing then?
.. _Jasmine: http://pivotal.github.com/jasmine/
.. _List of unit testing frameworks: http://en.wikipedia.org/wiki/List_of_unit_testing_frameworks#JavaScript
The current solution is to run an Ecmascript test case file with a command
line interpreter and compare the resulting ``stdout`` text to the expected
output. Control information, including the expected ``stdout`` result, is
embedded into Ecmascript comments which the test runner parses.
The intent of the test cases is to test various features of the implementation
against the specification *and real world behavior*. Thus, the tests are
*not* intended to be strict conformance tests: implementation specific
features and internal behavior are also covered by tests. However, whenever
possible, test output can be compared to output of other Ecmascript engines,
currently: Rhino, NodeJS (V8), and Smjs.
Test case scripts write their output using the ``print()`` function. If
``print()`` is not available for a particular interpreter (as is the case
with NodeJS), a prologue defining it is injected.
Test case format
----------------
Test cases are plain Ecmascript files ending with the extension ``.js`` with
special markup inside comments.
Example::
/*
* Example test.
*
* Expected result is delimited as follows; the expected response
* here is "hello world\n".
*/
/*---
{
"slow": false,
"_comment": "optional metadata is encoded as a single JSON object"
}
---*/
/*===
hello world
===*/
if (1) {
print("hello world"); /* automatic newline */
} else {
print("not quite");
}
/*===
second test
===*/
/* there can be multiple "expected" blocks (but only one metadata block) */
print("second test");
The metadata block and all metadata keys are optional. Boolean flags
default to false if the metadata block or the key is not present. Current
metadata keys:
* ``slow``: if true, test is slow and increased time limits are applied
to avoid incorrect timeout errors.
* ``skip``: if true, test is not finished yet, and a failure is not
counted towards failcount.
* ``custom``: if true, some implementation dependent features are tested,
and comparison to other Ecmascript engines is not relevant.
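As a rough illustration of the markup described above, the following sketch
extracts the metadata block and the concatenated expected output from a test
case source string. The function name ``parseTestCase`` and the regular
expressions are assumptions made for this example; they are not taken from
``runtests/runtests.js``, which may parse the markup differently.

```javascript
// Hypothetical helper for illustration only; not the actual test runner code.
function parseTestCase(source) {
    // Optional metadata: a single JSON object between /*--- and ---*/.
    var meta = {};
    var metaMatch = /\/\*---\n([\s\S]*?)\n---\*\//.exec(source);
    if (metaMatch) {
        meta = JSON.parse(metaMatch[1]);
    }

    // There may be several expected blocks; concatenate them in order.
    var expected = '';
    var re = /\/\*===\n([\s\S]*?)===\*\//g;
    var m;
    while ((m = re.exec(source)) !== null) {
        expected += m[1];
    }

    // Boolean flags default to false when the block or the key is missing.
    return {
        slow: !!meta.slow,
        skip: !!meta.skip,
        custom: !!meta.custom,
        expected: expected
    };
}

var example = [
    '/*---',
    '{',
    '    "slow": true',
    '}',
    '---*/',
    '',
    '/*===',
    'hello world',
    '===*/',
    '',
    'print("hello world");',
    '',
    '/*===',
    'second test',
    '===*/',
    '',
    'print("second test");'
].join('\n');

var parsed = parseTestCase(example);
console.log(JSON.stringify(parsed));
```

A runner built along these lines would execute the file, capture ``stdout``,
and compare it to ``parsed.expected`` as a plain string.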
Practices
---------
Indentation
:::::::::::
Indent with spaces, 4 spaces per level.
Verifying exception type
::::::::::::::::::::::::
Since Ecmascript doesn't require specific error messages for errors
thrown, the messages should not be inspected or printed out in test
cases. Ecmascript does require specific error types though (such as
``TypeError``). These can be verified by printing the ``name``
property of an error object.
For instance::
try {
null.foo = 1;
} catch (e) {
print(e.name);
}
prints::
TypeError
When an error is not supposed to occur in a successful test run, the
exception message can (and should) be printed, as it makes it easier
to resolve a failing test case. This can be done most easily as::
try {
null.foo = 1;
} catch (e) {
print(e);
}
Test cases
----------
Test case filenames consist of lowercase words delimited by dashes, e.g.::
test-stmt-trycatch.js
The first part of each test case is ``test``. The second part indicates a
major test category. The test categories are not very strictly defined, and
there is currently no tracking of specification coverage.
Test cases starting with ``test-dev-`` are development time test cases
which demonstrate a particular issue and may not be very well documented.
Test cases starting with ``test-dev-bug-`` illustrate a particular
development time bug which has usually already been fixed.
Duktape API test cases
======================
Test case format
----------------
Test case files are C files with a ``test()`` function. The test function
gets as its argument an already initialized ``duk_context *`` and prints
text to ``stdout``. The test case can assume ``duktape.h`` and common headers
like ``stdio.h`` have been included. There are also some predefined macros
(like ``TEST_SAFE_CALL()`` and ``TEST_PCALL()``) to minimize duplication in
test case code.
Expected output is defined as for Ecmascript test cases. There is currently
no metadata.
Example::
/*===
Hello world from Ecmascript!
Hello world from C!
===*/
void test(duk_context *ctx) {
duk_push_string(ctx, "print('Hello world from Ecmascript!');");
duk_eval(ctx);
printf("Hello world from C!\n");
}
Test runner
===========
The current test runner is a NodeJS program which handles both Ecmascript
and API testcases. See ``runtests/runtests.js``.
Future work
===========
* Put test cases in a directory hierarchy instead (``test/stmt/trycatch.js``),
which perhaps scales better (at the expense of adding hassle to e.g. grepping).
* Keep simple input-output model but add includes. There is a lot of
boilerplate now for basic things like dumping descriptors.

60
doc/uri.txt

@ -0,0 +1,60 @@
=========================
URI encoding and decoding
=========================
Specification notes
===================
Reserved set / unescaped set
----------------------------
The "unescaped set" for encoding and the "reserved set" for decoding always
consist of only ASCII codepoints. Thus comparing codepoints against the sets
should only be necessary when processing ASCII range characters.
When encoding, step 4.c will catch characters in the "unescaped set" and
encode them as-is into the output. Note that these can only be single-byte
ASCII characters. If we go to step 4.d, the codepoint may either be ASCII
or non-ASCII, and will be escaped regardless.
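The encoding behavior described above can be observed with the standard
``encodeURIComponent()`` function. The variable names below are only for
illustration, and the results assume a compliant engine such as NodeJS.

```javascript
// Characters in the "unescaped set" are single-byte ASCII and pass through as-is.
var unescapedAscii = encodeURIComponent("abc-_.!~*'()");

// ASCII characters outside the unescaped set are percent escaped.
var escapedAscii = encodeURIComponent(' /');

// Non-ASCII codepoints are always escaped, one escape per UTF-8 byte.
var escapedNonAscii = encodeURIComponent('\u00e4');

console.log(unescapedAscii, escapedAscii, escapedNonAscii);
```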
When decoding percent escaped codepoints, one-byte encoded codepoints (i.e.
ASCII) are checked in step 4.d.vi; multi-byte encoded codepoints in the BMP
range are checked in step 4.d.vii but codepoints above BMP are not checked.
Apparently the idea here is to ensure no characters in the reserved set are
decoded from percent escapes even if invalid UTF-8 (non-shortest) encodings
are allowed. Because characters above BMP are encoded with surrogate pairs,
the formula for surrogate pairs ensures that the codepoint cannot be below
U+00010000 (0x10000 is added to the surrogate pair bits), and thus no check
against the "reserved set" is needed.
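The surrogate pair argument can be checked numerically. The sketch below uses
the usual UTF-16 surrogate pair formula; the helper name
``surrogatesToCodepoint`` is hypothetical and exists only for this example.

```javascript
// Combine a high surrogate (0xD800-0xDBFF) and a low surrogate
// (0xDC00-0xDFFF) into a codepoint; 0x10000 is always added.
function surrogatesToCodepoint(hi, lo) {
    return (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000;
}

// Smallest and largest codepoints reachable through a surrogate pair.
var lowest = surrogatesToCodepoint(0xD800, 0xDC00);
var highest = surrogatesToCodepoint(0xDBFF, 0xDFFF);

console.log(lowest.toString(16), highest.toString(16));
```

Since the smallest result is 0x10000, no surrogate pair can produce an ASCII
codepoint, which is why the reserved set cannot be hit on this path.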
However, at the end of Section 15.1.3:
RFC 3629 prohibits the decoding of invalid UTF-8 octet sequences. For
example, the invalid sequence C0 80 must not decode into the character
U+0000. Implementations of the Decode algorithm are required to throw a
URIError when encountering such invalid sequences.
Because the "reserved set" and the "unescaped set" always consist of only
ASCII codepoints, the check in step 4.d.vii should not be necessary. The UTF-8
validity check happens in step 4.d.vii.8.
Decoding characters outside BMP
-------------------------------
The URI decoding algorithm requires that UTF-8 encoded codepoints consisting
of more than 4 encoded bytes are rejected. A 4-byte encoding carries 21
codepoint bits, so the maximum codepoint which can be expressed is U+1FFFFF.
However, since the bytes must also be valid UTF-8 (step 4.d.vii.8) the highest
allowed codepoint is actually U+10FFFF.
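These limits can be illustrated with the standard ``decodeURIComponent()``
function. The ``tryDecode`` wrapper is a hypothetical helper that returns the
error name instead of throwing, and the results assume a compliant engine
such as NodeJS.

```javascript
// Return the decoded string, or the error name if decoding throws.
function tryDecode(s) {
    try {
        return decodeURIComponent(s);
    } catch (e) {
        return e.name;
    }
}

// A 4-byte UTF-8 sequence carries 3 + 6 + 6 + 6 = 21 codepoint bits.
var maxFourByte = (0x07 << 18) | (0x3f << 12) | (0x3f << 6) | 0x3f;

// U+10FFFF is the highest codepoint accepted; it decodes to a surrogate pair.
var top = tryDecode('%F4%8F%BF%BF');

// U+110000 fits in 4 bytes but is not valid UTF-8, so it is rejected.
var above = tryDecode('%F4%90%80%80');

// A non-shortest encoding of U+0000 is rejected as well.
var overlong = tryDecode('%C0%80');

console.log(maxFourByte.toString(16), top.length, above, overlong);
```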
It would be nice to be able to:
* decode higher codepoints because Duktape can represent them
* decode codepoints up to U+10FFFF without surrogate pairs
Because the API requirements are strict, these cannot be added to the standard
API without breaking compliance. Custom URI encoding/decoding functions could
provide these extended semantics.

9
make_full.sh

@ -30,6 +30,8 @@ for i in \
doc/number_conversion.txt \
doc/regexp.txt \
doc/sorting.txt \
doc/uri.txt \
doc/testcases.txt \
; do
cp --parents $i $FULL/
done
@ -43,20 +45,17 @@ for i in \
done
for i in \
examples/test.c \
examples/cmdline/duk_cmdline.c \
examples/cmdline/duk_ncurses.c \
examples/cmdline/duk_socket.c \
examples/cmdline/duk_fileio.c \
examples/coffee/mandel.js \
examples/coffee/hello.js \
examples/coffee/globals.js \
examples/coffee/Makefile \
examples/coffee/mandel.coffee \
examples/coffee/hello.coffee \
examples/coffee/globals.coffee \
examples/hello/hello.c \
examples/Makefile.cmdline \
examples/Makefile.example \
examples/hello/hello.c \
; do
cp --parents $i $FULL/
done
