TODO:
- struct return values
- support for unions in signatures
- more testing
- complex types
- packed structs
- writing structs to buffers (useful and we have most of the machinery).
Add support for integer return and floating point return variants, as
well as arguments on the stack. Start flushing out struct arguments.
Still needed:
- structs (packed and unpacked)
- complex numbers
- long doubles
- MASM alternative for windows (you can technically use sysv abi on
windows)
- more calling conventions.
FFI may be best implemented as an external library
(libffi has incompatible license to Janet) or as code
that takes void * and wraps then into Janet C functions
given a function signature. Either way, we need to some way
to load symbols from arbitrary dynamic libraries.
For to and thru, we need to restore eveytime through the loop since rules need
run with the right captures on the stack, especially if they have any
sort of backrefs.
While generally we are not in the business of making a very chatty
compiler, this is a simple improvement that involves compiling
metadata before the binding, as well as adding a suggestion for `defn`
in case the compiler encounters an unexpected tuple.
Added some backticks around code in docstrings to distinguish them from prose.
Light editing to `table/raw-get`. Is my change there correct? (t --> the key)
Buffers make more sense for this function because one of their primary
use cases is working with bytes.
The tuple implementation was an array of floats, which is less
performant and ergonomic for common operations. (i.e: bit manipulation)
Buffers also have the advantage they are mutable, meaning the user
can write ints to an existing buffer.
Previously int/to-number would fail if the input was outside
the range of an int32.
Because Janet numbers are doubles,
they can safely store larger ints than an int32.
This commit updates int/to-number to restrict the
value to the range of integers a double can hold, instead of an int32.
(int/to-number value) converts an s64 or u64 to a number.
It restricts the value to the int32 range,
so that `int32?` will always suceeded when called on the result.
This is more intuitive and avoids the possibilty of strange code
to resume or cancel a fiber after it was scheduled but before it was
entered for the first time.
The main issue was cancellation of fiber using `cancel` rather than
`ev/cancel` could cause issues with the event loop internal ref count.
Since this is almost certainly bad usage (and is not something I want to
encourage or support), we will warn against trying to resume or error
fibers that have already been suspended or scheduled on the event loop.
The distinction between "task" fibers and normal fibers is now kept by a
flag that is set when a fiber is resumed - if it is the outermost fiber
on the stack, it is considered a root fiber. All fibers scheduled with
ev/go or by the event loop are root fibers, and thus cannot be cancelled
or resumed with `cancel` or `resume` - instead, use `ev/cancel` or
`ev/go`.
Nested expression in the quasiquote were being compiled with the "hint"
flag passed to the expression compilation, essentially telling the
compiler to put intermediates into the final slot, possibly overwriting
other intermediate values. This fix removes that flags on any recursive
calls to quasiquote.
The current destructure pattern ends when '& is encountered.
This commit adds an error if it is followed by more than
a symbol to bind the array to.
Although its not critical since the extra items can be ignored,
they're a sign of some kind of mistake so its best to complain.
In destructure janet_type(_) == JANET_SYMBOL was used to check if a
value was a symbol.
This commit replaces that with the janet_checktype function,
because that function is used for the same purpose in other places.
This commit adds three checks to ensure & rest patterns are valid:
1. When checking for '& ensure the value is a symbol before unwrapping
2. Make sure '& is followed by a value
3. Make sure the value following '& is a symbol
Add support for using [& rest] to match the remaining values
in an array or tuple when destructuring.
the rest pattern is implemented by pushing remaining values in the
rhs to the stack once & is found on the lhs.
Then tuple is called and the result is assigned
to the next symbol on the lhs.
This commit DOES NOT implement handling for malformed patterns.
handles returned by CreateFileA and FILE_FLAG_OVERLAPPED
support reading from arbitrary offsets.
The offset is passed to ReadFile in through the OVERLAPPED structure.
Since state->overlapped is zeroed ev_machine_read
ReadFile would always read from the start of the file and never finish
This commit changes ev_machine_read to update the offset to
the number of bytes read before calling ReadFile.
- Change the global binding name from :redefs to :redef
- Simplify internal representation of "redefinable bindings"
- Store "redefinable bindings in :ref rather than :value inside the
environment entries. This makes such bindings more like vars that
can't be set rather than defs.
Rather than manual reference counting for suspended fibers, we
automate the process by incrementing "extra_listeners" every time
we suspend a fiber in the event loop, and decrement when that fiber
is resumed. In this manner, we keep track of the number of suspending
fibers in a simpler, more correct way.
Try to have better behavior when mixing sub-hashes that are not uniform and
randomly distributed. Premultiply by a large prime before mixing to
"spread entropy" if it is concentrated in a certain subset of bits.
This doesn't destory the pid until the original thread decides to
call waitpid again. Since the pid is exposed in the C API and now
in the Janet API, we don't want to destroy it until we are ready.
This at least means users can use something like jsys
or the kill command to signal processes when they want
to send unsupported signals (like SIGTERM).
We were casting a pointer to the wrong type, which caused all sorts of
wonderful chaos, but only on windows and only when the garbage collector
ran after setting up a server in a specific configuration. We were
casting a closure pointer to an abstract type during the mark phase,
which resulted in memory corruption.
We did not allow arbitrary utf8 to be printed with %j, even though the parser
allows. Thos changes uses the existing built in utf8 detectiotion to
exclude only unprintable symbols from the docstring.
Priorly we only checked exactly one state when an event was received.
This was incorrect. A state may have a next state, an action to take
after the first in the list of states has been taken. This change
acknowledges that and makes the code work with the state list vs just
the head of the list.
Don't use a timer filter, just set the timeout on each call to kevent.
Should hopefully work around the 1ms minimum on NetBSD and be possibly
more performant.
FreeBSD is the only BSD supporting ABSTIME timers, whereas the rest
demand intervals. Janet operates on timestamps, which are absolute
times, as per ABSTIME. The idea was to use that under FreeBSD but not
the other BSDs. This commit changes that since ABSTIME breaks when the
timeout supplied is for a time prior to whatever the time is
now (invalid argument). We now utilize the same logic we use on the
other BSDs with FreeBSD to effect interval timeouts since intervals are
absolutely sometime beyond now, be it now and less than a millisecond,
or more than a millisecond. This will hopefully unbreak BSD builds when
running the test suite.
After the UB was fixed in value.c, I tried running the build again and encoutered another instance of UB in gc.c. With this fixed I can now build janet with ubsan enabled, meaning there's no more UB encountered in janet_boot during the build.
The `janet_get_addrinfo` function retained code that was meant for
compliance with 3 separate function signatures under a single function
name. Changing things to be a single function signature was broken until
the code pertaining to the aforementioned was stripped out.
Prior commits was an attempt to make this one function adhere to 3
different function signatures! This puts an end to that and makes it
where it's a single function signature and if one wants to use the 4th
argument they'll need to explicitly set the 3rd argument (to nil for
default).
janet_get_sockettype expects a keyword but we're making it optional that
the call to the functions that use it with arity >=3 will be guaranteed
to have it as a keyword value! If it's not a keyword then it's the same
as NULL.
Primarily because trying to check the value results in a panic when the
value is not the type of value requested from the API. Also probably
cheaper and the previous idea of just getting the value then comparing
was pretty stupid (needed a string comparison... and was going to do
pointer comparison).
This will allow us to set the address we use for outgoing connections.
Builds, haven't checked it passes current tests, haven't checked it
actually works either.
Minimum interval for a timer must be 1 or more (or we get EINVAL) and
Janet fails tests and halts events that the programmer may still be
interested in.
A comptime known value of 0 for data in EV_SET with EVFILT_TIMER causes
a complete compilation failure (fails to link). This fixes it by making
it a 1 instead of a 0 for amount of milliseconds in the interval to wait
under NetBSD.
NetBSD and OpenBSD lack NOTE_ABSTIME and NOTE_MSECONDS, so we define
those and create a macro that we use for all timeout values in EV_TIMER
events that will on all BSD excepting FreeBSD change an absolute time
into an interval.
Checking throught NetBSD's man pages, excepting for NetBSD-current,
NetBSD uses `intptr_t` as the type for `.udata`. This change allows for
`.udata` to match whatever type (by cast) the underlying system uses.
Pretty obvious I thought control statements were glued to their opening
parenthesis at first and then I realized not and voila, a bundle of
mixed style. Hopefully this fixes all of it.
From this point things should be bug fixes or code formatting most
likely.
Updated commentary (removed superfluous comments, and commented out
code). Refined commentary where it seemed important and may help whoever
comes behind me keep from making bad assumptions similar to the ones I
made.
All tests ran with `gmake test` now pass. `valgrind` with FreeBSD does
not support forking so `gmake valtest` fails once child processes are
started. Determined not an issue, can't fix valgrind.
Need to guard against errors when reading/writing probably, if there is
an error, forgo those events.
Guard against null state (and the byproduct, a segfault), check if the
state is null before utilizing it.
Note that this is a work in progress and simply a first attempt at
getting some code into place before being able to test it. This code
follows of sorts both the poll and epoll sections of the codebase hoping
to achieve the exact same.
Relax check that number of closure environments in a function matches
that of the def.
The def could be partially constructed, and so there may be a false
negative. The runtime will check that this is consistent, and the
garbage collector should handle when this constraint is not kept.
A threaded abstract is an abstract type that can be freely shared
between threads. While no synchronization is provided, refcounting
and transport between threads is. This will let implementers more easily
exploit OS-level parallelism in C library code. The caveat with these
types is that they need to be careful in how they interact with objects
on other heaps.
Some pointer casting with abstract types was incorrect, resulting
in strange behavior when trying to use supervisor channels that were
threaded. This fix also adds the ability to supply a supervisor channel
directly when creating a thread.
Introduces close semtantics to channels as well, but otherwise
threaded channels behave much like non-threaded channels. They have
different marshalling behavior though, and can only send values over by
packing and unpacking them. For now, this means only primitive values
although this will be expanded.
Also missing some implementation for closing threaded channels, and a
whole lot of testing. Achtung!, Caveat emptor, here be dragons and bugs.
This makes certain algorithms simpler as channels
now have an explicit lifetime - multiple readers can coordinate
closing without needing to ensure the same number of reads as writes.
janet_loop1_interrupt makes the event loop compatible
with safe interruptions for custom scheduling. Does this by exposing
custom events on the event loop. A custom event schedules a function pointer
to run in a way that can interrupt
epoll_wait/poll/GetQueuedCompletionStatus.
This would allow an embedder to suspend the current Janet fiber
via an external event like a signal, other thread, or really anything.
This is a useful primitive for custom schedulers that would call
janet_interpreter_interupt periodically (say, in an interval with SIG_ALRM),
do some work, and then use janet_continue on the janet_root_fiber, or
for embedding into other soft-realtime applications like a game. To say,
only allow about 5ms per frame of interpreter time.
This is mainly meant for use as the entry point to a C wrapper for a
janet program. This maeans the programmer doesn't need to use an ifdef
to handle if the event loop is enabled.
This has a lot of possible uses, and would let users add a macro-based
type system on top of Janet that would integrate with the usual linting
and warning system.
This is the beginning of a system for compiler warnings. This includes
linting, deprecation notices, and other compiler warnings that are best
detected by the `compile` function and don't require the partial
evalutaion of the flychecker.
Structs and tuples composed entirely out of constant values
will themselves be considered constant values during compilation.
This reduces the amount of generated code.
The (undef rule :tag) combinator lets a user "scope" tagged captures.
After the rule has matched, all captures with tag :tag can no longer be
refered to by their tag. However, such captures from outside
rule are kept as is. If no tag is given, all tagged captures from rule
are unreferenced. Note that this doesn't `drop` the captures, merely
removes their association with the tag. This means subsequent calls to
`backref` and `backmatch` will no longer "see" these tagged captures.
This fixes a regression from changes to janet_try. In some cases, we
would not update the status of a fiber when signaling, which left the
fiber's status as whatever it had previously. This could lead to strange
control flow issues.
Just hash upper 32 bits with lower 32 bits. Trying to get too fancy
was causing slowdowns in very trivial cases. Assuming that all
combinations of 64 bits in a double are equally likely (suspect but
probably not that incorrect), the obvious method of xoring the top
32 bits with the lower 32 bits gives a uniform distribution.
When new fibers are scheduled on the event loop, this new_channel
receives the newly created fibers. This lets a fiber track which fibers
have been added and let's a user implement a supervisor.
Fix formatting.
Supervisor channels are a simple concept to more efficiently
enable dynamic, structure concurrency. When a top-level fiber
completes (or errors), it will push itself to it's supervisor
channel if it has one (instead of printing a stacktrace). This
let's another fiber poll a channel and "supervise" a set of fibers.
Keep a separate stack for tagged references. May cause pegs to
use more memory but makes the backref and backmatch features much more
powerful.
Also disables the second stack if backref and backmatch are not used in the peg.
Does not require all sorts of signal handling code
that is not thread-safe and can "steal processes".
However, there is a much simpler way to add this functionality
by creating a new stream and thread for each subprocess when it is
waited on. This is perhaps _slightly_ less efficient but oh so much
simpler, since we can reuse all of our concepts from streams and there
is no need to implement a whole system around the selfpipe.
`seed ^= hash_value(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);`
from https://www.boost.org/doc/libs/1_35_0/doc/html/boost/hash_combine_id241013.html
The current way of combining hashes peforms poorly on hash values of
numbers. Changing the way hashes are combined canlead to a significant speed up:
```
time janet_new -e '(def tbl @{}) (loop [x :in (range 1000) y :in (range 1000)] (put tbl {0 x 1 y} true))'
3.77s user 0.08s system 99% cpu 3.843 total
time janet_orig -e '(def tbl @{}) (loop [x :in (range 1000) y :in (range 1000)] (put tbl {0 x 1 y} true))'
48.98s user 0.15s system 99% cpu 49.136 total
```
This should be more useful than timeouts in real-world
use cases. The deadline system is based on fibers and is target
to much more coarse-grained (and therfor reliable) timeouts than things
like ev/sleep and timeout arguments.
This updates the documentation and adds a function buffer/push, which
is a more useful function than buffer/push-string or buffer/push-byte by
combining both.
These capture the line and column number of the current position
in the matched text. This is useful for error reporting as well
as indentation checking.
This works by lazily creating an index on first use that stores all
newline character indices in order. We can then do a binary search on
this to get both line number and column number in log(n) time.
This is good enough for most use cases and doesn't slow down the common case at all
- these will not be commonly used patterns in a hot loop so it is not worth to try and
optimize this at all. Constant time look up should be possible but at
the cost of complicating code and slowing down all matching to check for
new lines.
Make doc-format respect leading indents, increase the default format
width to better accommodate markdown formatted documentation. We still
need to support single line style doc strings, such as those used
for most c functions which can be a single line of much longer than
80 or 120 characters.
Consecutive whitespace internal to lines is not preserved, though.
Leading spaces are stripped based on the column index of the first
backtick character in the first delimiter. If there are
characters that are not newline or space before that column in the
string, then the behavior is the same as the old behvaior - no
re-indentation is performed.
Before, EPIPE caused an error, but in most cases it is better
to consider it an end of stream. In the future, we may want to allow
cusomtization of this behavior with flags on the stream but for now
let's keep it simpler.
This could have bad effects in higher load situations, and
duplicates code. It is better to keep a dedicated list of
scheduled IO operations which can be efficiently added and
removed from. It also provides and easy way to enumerate
scheduled IO operations.