optionals.
This fixes some undesirable interplay between lookaheads and looping
(repeating) rules where lookaheads caused the loop to immediately
terminate _and_ discard any captures. This change adds an exception if
there is a 0-width match at the first iteration of the loop, in which
case captures will be kept, although the loop will still be terminated
(any further iteration of the loop would possibly cause an infinite
loop).
This allows optionals to work as expected when combined with lookahead.
There is also a question of whether or not optional should be
implemented as separate rule rather than as `(between 0 1 subrule)`.
These rules allow selecting from a number of sub-captures
while dropping the rest. `nth` is more succinct in many cases, but `only-tags` is
more general and corresponds to an internal mechanism already present.
The use cases involve user-expandable grammars.
For example, consider the IRC nickname specification.
> They SHOULD NOT contain any dot character ('.', 0x2E).
> Servers MAY have additional implementation-specific nickname restrictions.
To implement this, we can do something along these lines:
```janet
(def nickname @{:main '(some :allowed)
:allowed (! (+ :forbidden/dot :forbidden/user))
# for lax mode, (put nickname :forbidden/dot false)
:forbidden/dot "."
# to add your own requirements
# (put nickname :forbidden/user 'something)
:forbidden/user false})
```
Additionally, it's common in parsing theory to allow matches of the
empty string (epsilon). `true` essentially allows for this.
Note that this does not strictly add new functionality, you could
emulate this previously using `0` and `(! 0)` respectively, but this
should be faster and more intuitive.
The speed improvement primarily comes from `(! 0)` which is now a single
step.
When peg/replace or peg/replace-all are given a function to serve as the text
replacement, any captures produced by the PEG are passed as additional
arguments to that function.
Functions will be invoked with the matched text, and their result will be
coerced to a string and used as the new replacement text.
This also allows passing non-function, non-byteviewable values, which will be
converted into strings during replacement (only once, and only if at least
one match is found).
For to and thru, we need to restore eveytime through the loop since rules need
run with the right captures on the stack, especially if they have any
sort of backrefs.
This is the beginning of a system for compiler warnings. This includes
linting, deprecation notices, and other compiler warnings that are best
detected by the `compile` function and don't require the partial
evalutaion of the flychecker.
The (undef rule :tag) combinator lets a user "scope" tagged captures.
After the rule has matched, all captures with tag :tag can no longer be
refered to by their tag. However, such captures from outside
rule are kept as is. If no tag is given, all tagged captures from rule
are unreferenced. Note that this doesn't `drop` the captures, merely
removes their association with the tag. This means subsequent calls to
`backref` and `backmatch` will no longer "see" these tagged captures.
Keep a separate stack for tagged references. May cause pegs to
use more memory but makes the backref and backmatch features much more
powerful.
Also disables the second stack if backref and backmatch are not used in the peg.