- Instead of setting soft/hard timeouts on the ILuaMachine, we instead
provide it with a TimeoutState instance. This holds the current abort
flags, which can then be polled within debug hooks.
This means the Lua machine has to do less state management, but also
allows a more flexible implementation of aborts.
- Soft aborts are now handled by the TimeoutState - we track when the
task was started, and now only need to check we're more than 7s since
then.
Note, these timers work with millisecond granularity, rather than
nano, as this invokes substantially less overhead.
- Instead of having n runners being observed with n managers, we now
have n runners and 1 manager (or Monitor).
The runners are now responsible for pulling work from the queue. When
the start to execute a task, they set the time execution commenced.
The monitor then just checks each runner every 0.1s and handles hard
aborts (or killing the thread if need be).
- Rename unload -> close to be a little more consistent
- Make pollAndResetChanged be atomic, so we don't need to aquire a lock
- Get the computer queue from the task owner, rather than a separate
argument.
The latest version of Cobalt has several major changes, which I'm
looking forward to taking advantage of in the coming months:
- The Lua interpreter has been split up from the actual LuaClosure
instance. It now runs multiple functions within one loop, handling
pushing/popping and resuming method calls correctly.
This means we have a theoretically infinite call depth, as we're no
longer bounded by Java's stack size. In reality, this is limited to
32767 (Short.MAX_VALUE), as that's a mostly equivalent to the limits
PUC Lua exposes.
- The stack is no longer unwound in the event of errors. This both
simplifies error handling (not that CC:T needs to care about that)
but also means one can call debug.traceback on a now-dead coroutine
(which is more useful for debugging than using xpcall).
- Most significantly, coroutines are no longer each run on a dedicated
thread. Instead, yielding or resuming throws an exception to unwind
the Java stack and switches to a different coroutine.
In order to preserve compatability with CC's assumption about LuaJ's
threading model (namely that yielding blocks the thread), we also
provide a yieldBlock method (which CC:T consumes). This suspends the
current thread and switches execution to a new thread (see
SquidDev/Cobalt@b5ddf164f1 for more
details). While this does mean we need to use more than 1 thread,
it's still /substantially/ less than would otherwise be needed.
We've been running these changes on SwitchCraft for a few days now and
haven't seen any issues. One nice thing to observe is that the number of
CC thread has gone down from ~1.9k to ~100 (of those, ~70 are dedicated
to running coroutines). Similarly, the server has gone from generating
~15k threads over its lifetime, to ~3k. While this is still a lot, it's
a substantial improvement.
- Remove a redundant logger
- Provide a getter for the ComputerCraft thread group. This allows us
to monitor child threads within prometheus.
- Replace a deprecated call with a fastutils alternative.
- Keep track of the number of created and destroyed coroutines for each
computer.
- Run coroutines with a thread pool executor, which will keep stale
threads around for 60 seconds. This substantially reduces the
pressure from short-lived coroutines.
- Update to the latest Cobalt version.
Whilst I'm pretty sure this is safe for general use, I'm disabling this
by default for now. I may consider enabling it in the future if no
issues are found.