Keep a separate stack for tagged references. May cause pegs to
use more memory but makes the backref and backmatch features much more
powerful.
Also disables the second stack if backref and backmatch are not used in the peg.
These capture the line and column number of the current position
in the matched text. This is useful for error reporting as well
as indentation checking.
This works by lazily creating an index on first use that stores all
newline character indices in order. We can then do a binary search on
this to get both line number and column number in log(n) time.
This is good enough for most use cases and doesn't slow down the common case at all
- these will not be commonly used patterns in a hot loop so it is not worth to try and
optimize this at all. Constant time look up should be possible but at
the cost of complicating code and slowing down all matching to check for
new lines.
Inpsired by the REBOL operators of the same name, these
combinators match bytes up to or inculding a given pattern.
(to patt) is (almost) equalivalent to (any (if-not patt 1)), and
(thru patt) is equivalent to (* (to patt) patt). The one difference
is that if the end of the input is reached and patt is not
matched, the entire pattern does not match.
This way we can support fewer build configurations. Also, remove
all undefined behavior due to use of memcpy with NULL pointers. GCC
was exploiting this to remove NULL checks in some builds.
This means we can add new properties to abstract types without
breaking old code. We can also make simple abstract types without
needing to add many NULL fields to the type.
Also make integer to size_t casts explicit rather than relying on
int32_t * sizeof(x) = size_t. This is kind of a personal preference for
this problem.
Because we use an amalgated build, feature
test macros should be set in a single file that
is included before any other headers, and is placed
at the top of the amalgamated build.
This unifies equality and comparison checking. Before, we had
separate functions and vm opcodes for comparing general values vs.
for comparing numbers, where the numberic functions were polymorphic and
had special cases for handling NaNs. By unfiying them, abstract types
can now better integrate with other number types and behave as keys.
For now, the old functions are aliased but will eventually be removed.
This adds several common patterns, which are defined in
boot.janet. This essentially gives more primitive patterns
to work with out of the box.
Fix build when JANET_REDUCED_OS is defined.
Gives more control over unmarshalling
abstract types. This should also
make it possible/easy to write abstract types that cannot
cause unmarshal to segfault.
Remove the multiple caching tables we were using
and use the grammar table for caching. This works
well because we can use raw_get for checking the local cache, and normal
get fro checking the global cache.
A keyword reference only counts as visited if we have
it as cached in the memoized->table, and we know it was
originally referenced from the same grammar table. If these
two conditions are true, then compilation must work correctly.
Also add janet_table_get_ex.
(backmatch [tag?]) is similar to a back reference in regular expressions
(NOT to backwards capture in a peg). It only matches a pattern if
it exactly matches the text of the last capture. It does not consume
or push any captures to the capture stack.
This will allow some one constructing an abstract to
only make it visible to the garbage collector after it
is in a valid state. If code in the constructing cfunction
panics before janet_abstract_end is called, the GC will not try
to mark the incomplete abstract type. This is often not needed through
careful programming, but should work well.
By holding on a reference to argv for a long time, we
may trigger a use after free bug if the stack is resized. In
janet c function, argv is only vvalid up until the next stack operation
on the fiber. We could say that this is the dynamic lifetime of
argv.
To fix this, we copy extra arguments into a tuple, which is properly
garbage collected.
Some peg grammars could not capture values based on their position in a
larger grammar. This is a design limitation inheritted from LPeg, but no
longer needed as the replace mode is superseded by the accumulator mode,
which is more general if slightly harder to use.