@@ -12,18 +12,26 @@ challenging. See the following three excellent articles by Russ Cox
for background:

* http://swtch.com/~rsc/regexp/regexp1.html

* http://swtch.com/~rsc/regexp/regexp2.html

* http://swtch.com/~rsc/regexp/regexp3.html

The Ecmascript regular expression feature set is described in E5 Section 15.10
and includes:

* Disjunction

* Quantifiers, including counted repetition, in both greedy and minimal variants

* Assertions, including negative and positive lookaheads

* Character classes, normal and inverted

* Captures and backreferences

* Unicode character support

* Unanchored matching only (e.g. ``/x/.exec('fooxfoo')`` matches ``'x'``)

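The features listed above can be illustrated with one small pattern each. This is only a sketch in plain Ecmascript: the ``checks`` object, its property names, and the sample strings are made up for illustration and do not come from the specification or from the implementation.

```javascript
// One small pattern per feature; property names are illustrative only.
var checks = {
  disjunction:       /cat|dog/.exec('hotdog')[0],        // 'dog'
  countedGreedy:     /a{2,3}/.exec('aaaa')[0],           // 'aaa' (takes as many as allowed)
  countedMinimal:    /a{2,3}?/.exec('aaaa')[0],          // 'aa'  (takes as few as allowed)
  positiveLookahead: /foo(?=bar)/.exec('foobar')[0],     // 'foo' ('bar' is not consumed)
  negativeLookahead: /foo(?!bar)/.test('foobar'),        // false
  characterClass:    /[a-c]+/.exec('xxabcxx')[0],        // 'abc'
  invertedClass:     /[^a-c]+/.exec('abcxyz')[0],        // 'xyz'
  backreference:     /(a|b)\1/.exec('abba')[0],          // 'bb'
  unicodeEscape:     /\u00e4/.test('\u00e4'),            // true
  unanchoredIndex:   /x/.exec('fooxfoo').index           // 3 (match found mid-input)
};
```

The last entry shows the unanchored behavior: the match is searched for at every input offset, so a pattern must use ``^`` explicitly to restrict it to the start of the input.
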
Counted repetition quantifiers, assertions, captures, and backreferences
@@ -36,10 +44,14 @@ and compactness. More generally, the following prioritized requirements
should be fulfilled:

#. Ecmascript compatibility

#. Compactness

#. Avoiding deep or unbounded C recursion, and providing recursion and
   execution time sanity limits

#. Regexp execution performance

#. Regexp compilation performance

Further, it should be possible to leave out regexp support during
@@ -1201,4 +1213,3 @@ Executor

* An optimized primitive for testing a regexp (match without captures) would be
  easy to implement by just skipping 'save' instructions, but would waste space.

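From the script side, the difference between a capturing match and a pure test shows up as ``exec()`` versus ``test()``. The sketch below only illustrates that visible difference; the variable names and the sample input are hypothetical, and it says nothing about how the executor would implement such a primitive internally.

```javascript
// Non-global regexp, so neither call depends on or updates lastIndex.
var re = /(\d+)-(\d+)/;
var input = 'range 10-20';

// exec() needs capture positions: m[0] is the whole match, m[1] and m[2]
// are the capture groups.
var m = re.exec(input);

// test() only needs a yes/no answer, so capture positions are never
// observed by the caller.
var matched = re.test(input);
```

Because the result of ``test()`` is a plain boolean, any capture bookkeeping done while matching is unobservable to the caller, which is why a capture-free primitive is possible at all.
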