
Documentation updates for regexp literal braces

* Website non-standard regexp section

* Website link to wiki typescript compatibility page

* Internal documentation updates

* Emscripten compatibility status update

* Comment out unnecessary Emscripten fixups
pull/547/head
Sami Vaarala 9 years ago
parent
commit
8e4894f8eb
  1. doc/emscripten-status.rst (5 lines changed)
  2. doc/regexp.rst (14 lines changed)
  3. util/fix_emscripten.py (21 lines changed)
  4. website/guide/compatibility.html (7 lines changed)
  5. website/guide/custombehavior.html (25 lines changed)

5
doc/emscripten-status.rst

@@ -15,7 +15,10 @@ Tweaks needed:
* ``--memory-init-file 0``: don't use an external memory file.
-* Some RegExps need to be fixed, see ``util/fix_emscripten.py``.
+* Emscripten expects a function's ``.toString()`` to match a certain
+pattern which is not guaranteed (and Duktape doesn't match), see
+``util/fix_emscripten.py``. Since Duktape 1.5.0 non-standard regexp
+fixes for unescaped curly braces are no longer needed.
Normally this suffices. If you're running Duktape with a small amount of
memory (e.g. when running the Duktape command line tool with the ``-r``

14
doc/regexp.rst

@@ -380,6 +380,20 @@ Empty quantifier bodies in complex quantifiers
This problem could also be fixed for complex quantifiers, but the
fix is not as trivial as for simple quantifiers.
+Non-standard RegExp syntax in existing code
+:::::::::::::::::::::::::::::::::::::::::::
+Some Ecmascript code bases depend on non-standard RegExp syntax, such as
+using literal braces without escaping::
+/{(\d+)}/ non-standard
+/\{(\d+)\}/ standard
+Duktape's regexp engine supports a few non-standard expressions to reduce
+issues with existing code. A longer term, more flexible solution is to
+allow the built-in minimal engine to be replaced with an external engine
+with wider regexp syntax, better performance, etc.
Miscellaneous
:::::::::::::
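A minimal JavaScript sketch of the behavior described in the doc/regexp.rst addition above. It assumes an engine that accepts unescaped braces as literals when they do not form a quantifier (Duktape 1.5.0 according to this commit; mainstream engines such as V8 behave the same way). The variable names and the sample input are illustrative only.

```javascript
// Non-standard form: the braces do not parse as a quantifier, so a lenient
// engine treats them as literal characters.
var lenientBraces = /{(\d+)}/;

// Standard ES5.1 form: the braces are escaped explicitly.
var strictBraces = /\{(\d+)\}/;

var sample = 'value {42} here';
var lenientMatch = lenientBraces.exec(sample);
var strictMatch = strictBraces.exec(sample);

// Both patterns capture the same digits between the braces.
var sameCapture = lenientMatch !== null && strictMatch !== null &&
    lenientMatch[1] === strictMatch[1];

// Braces that do parse as a quantifier keep their quantifier meaning:
// this requires two consecutive "a" characters.
var quantifier = /a{2}/;
var matchesDoubleA = quantifier.test('aa');
var matchesLiteralText = quantifier.test('a{2}');
```

Code that must also run on strictly ES5.1-conforming engines should still prefer the escaped form, since the unescaped one relies on engine leniency.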

21
util/fix_emscripten.py

@@ -12,28 +12,33 @@ replacements = {
# RegExp fix, now fixed in the Emscripten repository and should no longer
# be necessary.
# https://github.com/kripken/emscripten/commit/277ac5239057721ebe3c6e7813dc478eeab2cea0
-r"""if (/<?{ ?[^}]* ?}>?/.test(type)) return true""":
-r"""if (/<?\{ ?[^}]* ?\}>?/.test(type)) return true""",
+# Duktape 1.5.0: no longer needed with non-standard regexp curly brace support
+#r"""if (/<?{ ?[^}]* ?}>?/.test(type)) return true""":
+# r"""if (/<?\{ ?[^}]* ?\}>?/.test(type)) return true""",
# GH-11: Another RegExp escaping fix.
-r"""var sourceRegex = /^function\s\(([^)]*)\)\s*{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?}$/;""":
-r"""var sourceRegex = /^function\s\(([^)]*)\)\s*\{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?\}$/;""",
-r"""var sourceRegex = /^function\s*\(([^)]*)\)\s*{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?}$/;""":
-r"""var sourceRegex = /^function\s*\(([^)]*)\)\s*\{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?\}$/;""",
+# Duktape 1.5.0: no longer needed with non-standard regexp curly brace support
+#r"""var sourceRegex = /^function\s\(([^)]*)\)\s*{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?}$/;""":
+# r"""var sourceRegex = /^function\s\(([^)]*)\)\s*\{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?\}$/;""",
+#r"""var sourceRegex = /^function\s*\(([^)]*)\)\s*{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?}$/;""":
+# r"""var sourceRegex = /^function\s*\(([^)]*)\)\s*\{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?\}$/;""",
# GH-11: Attempt to parse a function's toString() output with a RegExp.
# The RegExp makes invalid assumptions and won't parse Duktape's function
# toString output ("function empty() {/* source code*/)}").
# This stopgap will prevent a 'TypeError: invalid base reference for property read'
# and allows at least a hello world to run.
+# Still needed with Duktape 1.5.0 because the issue is what Emscripten
+# expects from .toString() of a function.
r"""var parsed = jsfunc.toString().match(sourceRegex).slice(1);""":
r"""var parsed = (jsfunc.toString().match(sourceRegex) || []).slice(1);""",
r"""jsfunc.toString().match(sourceRegex).slice(1);""":
r"""(jsfunc.toString().match(sourceRegex) || []).slice(1);""",
# Newer emscripten has this at least with -O2
-r"""/^function\s*\(([^)]*)\)\s*{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?}$/""":
-r"""/^function\s*\(([^)]*)\)\s*\{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?\}$/""",
+# Duktape 1.5.0: no longer needed with non-standard regexp curly brace support
+#r"""/^function\s*\(([^)]*)\)\s*{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?}$/""":
+# r"""/^function\s*\(([^)]*)\)\s*\{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?\}$/""",
}
repl_keys = replacements.keys()
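The fixup that remains active in util/fix_emscripten.py replaces a bare `.match(sourceRegex).slice(1)` with a guarded form. The sketch below illustrates why the guard matters, using the escaped `sourceRegex` quoted in the diff. The helper names and sample strings are hypothetical, and the Duktape-style string is modelled on the comment in the script rather than captured from a real build.

```javascript
// The escaped form of the pattern quoted in the diff above.
var sourceRegex = /^function\s*\(([^)]*)\)\s*\{\s*([^*]*?)[\s;]*(?:return\s*(.*?)[;\s]*)?\}$/;

// Unguarded form: match() returns null when the pattern does not match,
// so calling slice() on the result throws a TypeError.
function parseUnguarded(source) {
    return source.match(sourceRegex).slice(1);
}

// Guarded form used by the fixup: fall back to an empty array on no match.
function parseGuarded(source) {
    return (source.match(sourceRegex) || []).slice(1);
}

// Anonymous function text in the shape the pattern expects.
var expected = 'function (a, b) { return a + b; }';

// Named function text with a comment body, approximating the Duktape
// toString() output mentioned in the script's comment (assumed shape).
var duktapeLike = 'function empty() {/* source code */}';

var parsedExpected = parseGuarded(expected);
var parsedDuktape = parseGuarded(duktapeLike);

var unguardedThrew = false;
try {
    parseUnguarded(duktapeLike);
} catch (e) {
    unguardedThrew = e instanceof TypeError;
}
```

The guard only avoids the exception. The caller still receives an empty parse result, which is consistent with the script describing it as a stopgap.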

7
website/guide/compatibility.html

@@ -59,7 +59,8 @@ Javascript. There are no known issues.</p>
<p><a href="https://github.com/Microsoft/TypeScript/">TypeScript</a>
compiles to Javascript. There are no known issues with compiling TypeScript
using the Microsoft TypeScript compiler (in the ES5/CommonJS mode) and
-running the resulting Javascript using Duktape.</p>
+running the resulting Javascript using Duktape. It's also possible to
+<a href="http://wiki.duktape.org/CompatibilityTypeScript.html">run the TypeScript compiler with Duktape</a>.</p>
<h2 id="compatibility-underscorejs">Underscore.js</h2>
@@ -93,8 +94,10 @@ support yet, no "heap object" can be provided.</p>
<p><a href="https://github.com/kripken/emscripten">Emscripten</a> compiles
C/C++ into Javascript. Duktape is currently Emscripten compatible except
-for a few RegExp issues, see:
+for an assumption about the format of a function's <code>toString()</code>
+output, see:
<a href="https://github.com/svaarala/duktape/blob/master/util/fix_emscripten.py">fix_emscripten.py</a>.
+Since Duktape 1.5.0 fixes for non-standard regexps are no longer needed.
</p>
<p>As of Duktape 1.3 there is support for Khronos/ES6 TypedArray which improves

25
website/guide/custombehavior.html

@@ -93,16 +93,27 @@ binding in any of the points A, B, or C.</p>
<h2>RegExp leniency</h2>
-<p>Although not allowed by E5.1, the following escape is allowed in RegExp
-syntax:</p>
+<p>Most Ecmascript engines support more syntax than guaranteed by the
+<a href="http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.1">Ecmascript
+E5.1 specification (Section 15.10.1 Patterns)</a>. As a result there's quite
+a lot of code that won't work with strict Ecmascript regexp syntax. Duktape also
+allows some non-standard syntax to better support existing code (you can turn
+this non-standard behavior off using config options if you prefer).</p>
+<p>Curly braces (<code>{</code> and <code>}</code>) are treated as literals
+when they don't parse as a valid quantifier:</p>
<pre>
-/\$/ /* matches dollar literally, non-standard */
-/\u0024/ /* same, standard */
+/{(\d+)}/ /* left curly, digits, right curly; non-standard */
+/\{(\d+)\}/ /* same, standard */
</pre>
-<p>This escape occurs in real world code so it is allowed. (More leniency
-will be added in future versions to deal with real world RegExps; dollar
-escapes are not the only issue.)</p>
+<p>Escaping a dollar sign as <code>\$</code> is not allowed by E5.1, but
+is accepted by Duktape:</p>
+<pre>
+/\$/ /* matches dollar literally; non-standard */
+/\u0024/ /* same, standard */
+</pre>
<h2>Array.prototype.splice() when deleteCount not given</h2>
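A short JavaScript sketch of the dollar escape covered in the "RegExp leniency" section of website/guide/custombehavior.html above. It assumes an engine that accepts `\$` (Duktape according to the guide text; mainstream engines also accept it). The variable names and sample strings are illustrative only.

```javascript
// Non-standard escape: "$" is an IdentifierPart character, so ES5.1 does not
// allow "\$" as an IdentityEscape. Lenient engines match a literal dollar.
var lenientDollar = /\$/;

// Standard spelling of the same pattern using a Unicode escape.
var strictDollar = /\u0024/;

var price = 'total: $5';
var plain = 'total: 5';

// Each pair records whether the pattern matched `price` and `plain`.
var lenientHits = [lenientDollar.test(price), lenientDollar.test(plain)];
var strictHits = [strictDollar.test(price), strictDollar.test(plain)];

// An unescaped "$" keeps its usual meaning as the end-of-input assertion.
var endsWithFive = /5$/.test(price);
```

Where portability to strict ES5.1 engines matters, the `\u0024` spelling avoids depending on this leniency.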
