Like #455, this sets our uniforms via a UBO rather than having separate
ones for each value. There are a couple of small differences:
- Have a UBO for each monitor, rather than sharing one and rewriting it
every monitor. This means we only need to update the buffer when the
monitor changes.
- Use std140 rather than the default layout. This means we don't have
to care about location/stride in the buffer.
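As a rough illustration of why std140 helps, here is a minimal sketch of packing a uniform block on the CPU side with fixed offsets. The block contents (a `mat4`, a 16-entry `vec3` palette and two `int`s) and every name here are hypothetical, not the actual layout used by the renderer. The offsets follow the std140 rules: a `mat4` takes 64 bytes and each element of a `vec3` array is padded to 16 bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical block, for illustration only:
//   layout(std140) uniform MonitorData {
//       mat4 transform; vec3 palette[16]; int width; int height;
//   };
final class MonitorUniforms {
    static final int TRANSFORM_OFFSET = 0;                             // mat4: 4 columns of 16 bytes
    static final int PALETTE_OFFSET = 64;                              // vec3[16]: stride padded to 16
    static final int PALETTE_STRIDE = 16;
    static final int WIDTH_OFFSET = PALETTE_OFFSET + 16 * PALETTE_STRIDE; // 320
    static final int HEIGHT_OFFSET = WIDTH_OFFSET + 4;                 // 324
    static final int SIZE = 336;                                       // 328 rounded up to a multiple of 16

    static ByteBuffer pack(float[] transform, float[][] palette, int width, int height) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(SIZE).order(ByteOrder.nativeOrder());
        // Column-major matrix, 16 floats written back to back.
        for (int i = 0; i < 16; i++) {
            buffer.putFloat(TRANSFORM_OFFSET + i * 4, transform[i]);
        }
        // Each vec3 occupies a 16-byte slot; the last 4 bytes are padding.
        for (int i = 0; i < 16; i++) {
            int base = PALETTE_OFFSET + i * PALETTE_STRIDE;
            buffer.putFloat(base, palette[i][0]);
            buffer.putFloat(base + 4, palette[i][1]);
            buffer.putFloat(base + 8, palette[i][2]);
        }
        buffer.putInt(WIDTH_OFFSET, width);
        buffer.putInt(HEIGHT_OFFSET, height);
        return buffer;
    }
}
```

Because the offsets are fixed by the layout rules, nothing needs to be queried from the driver before filling the buffer.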
Also like #455, this doesn't actually seem to result in any performance
improvements for me. However, it does make it a bit easier to handle a
large number of uniforms.
Also cleans up the generation of the main monitor texture buffer:
- Move buffer generation into a separate method - just ensures that it
shows up separately in profilers.
- Explicitly pass the position when setting bytes, rather than
incrementing the internal one. This saves some memory reads/writes (I
thought Java optimised them out, evidently not!). Saves a few fps
when updating.
- Use DSA when possible. Unclear if it helps at all, but nice to do :).
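The "explicitly pass the position" change above can be sketched roughly as follows. This is a minimal illustration using plain `java.nio.ByteBuffer`, not the actual monitor code, and the class and method names are made up. The relative `put(byte)` reads and updates the buffer's internal position on every call, while the absolute `put(int, byte)` lets us keep the position in a local variable.

```java
import java.nio.ByteBuffer;

// Illustrative only: two ways to write the same bytes into a buffer.
final class BufferFill {
    // Relative puts: each call reads and increments the buffer's own position.
    static void fillRelative(ByteBuffer buffer, byte[] data) {
        for (byte b : data) {
            buffer.put(b);
        }
    }

    // Absolute puts: track the position locally and write it back once.
    static void fillAbsolute(ByteBuffer buffer, byte[] data) {
        int pos = buffer.position();
        for (byte b : data) {
            buffer.put(pos++, b);
        }
        buffer.position(pos);
    }
}
```

Both methods leave the buffer in the same state; whether the second is measurably faster will depend on the JIT and the workload.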
This takes a non-trivial amount of time on the render thread[^1], so
worth doing.
I don't actually think the allocation is the heavy thing here -
VisualVM says it's toWorldPos being slow. I'm not sure why - possibly
just all the block property lookups? [^2]
[^1]: To be clear, this is with 120 monitors and no other block entities
with custom renderers, so not really representative.
[^2]: I wish I could provide a narrower range, but it varies so much
between me restarting the game. Makes it impossible to benchmark
anything!
The VBO renderer needs to generate a buffer with two quads for each
cell, and then transfer it to the GPU. For large monitors, generating
this buffer can get quite slow. Most of the issues come from
IVertexBuilder (VertexConsumer under MojMap) having a lot of overhead.
By emitting a ByteBuffer directly (and doing so with Unsafe to avoid
bounds checks), we can improve performance tenfold, going from
3fps/300ms for 120 monitors to 111fps/9ms.
See 41fa95bce4 and #1065 for some more
context and other exploratory work. The key thing to note is we _need_ a
separate version of FWFR for emitting to a ByteBuffer, as introducing
polymorphism to it comes with a significant performance hit.
- Move all RenderType instances into a common class.
Cherry-picked from 41fa95bce4:
- Render GL_QUADS instead of GL_TRIANGLES.
- Remove any "immediate mode" methods from FWFR. Most use-cases can be
replaced with the global MultiBufferSource and a proper RenderType
(which we weren't using correctly before!).
Only the GUI code (WidgetTerminal) needs to use the immediate mode.
- Pre-convert palette colours to bytes, storing both the coloured and
greyscale versions as a byte array.
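A minimal sketch of what pre-converting the palette might look like. The array layout (all coloured entries first, then all greyscale entries) and the use of a simple channel average for greyscale are assumptions made for this example, and the names are hypothetical.

```java
// Illustrative only: convert a palette of [0, 1] RGB doubles into bytes once,
// so renderers can copy bytes directly instead of converting per vertex.
final class PaletteBytes {
    // Result layout (assumed): colours.length RGB triples, then the same
    // number of greyscale RGB triples.
    static byte[] toBytes(double[][] colours) {
        byte[] out = new byte[colours.length * 3 * 2];
        int greyBase = colours.length * 3;
        for (int i = 0; i < colours.length; i++) {
            double r = colours[i][0], g = colours[i][1], b = colours[i][2];
            out[i * 3] = toByte(r);
            out[i * 3 + 1] = toByte(g);
            out[i * 3 + 2] = toByte(b);

            // Assumption: greyscale is the plain average of the three channels.
            byte grey = toByte((r + g + b) / 3);
            out[greyBase + i * 3] = grey;
            out[greyBase + i * 3 + 1] = grey;
            out[greyBase + i * 3 + 2] = grey;
        }
        return out;
    }

    // Clamp to [0, 1] and scale to an unsigned 0..255 value stored in a byte.
    static byte toByte(double c) {
        double clamped = Math.max(0, Math.min(1, c));
        return (byte) Math.round(clamped * 255);
    }
}
```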
Cherry-picked from 3eb601e554:
- Pass lightmap variables around the various renderers. Fixes #919 for
1.16!
"Instead, it is a standard program, which its API into the programs that it launches."
becomes
"Instead, it is a standard program, which injects its API into the programs that it launches."
A little shorter and more explicit than constructing the Vector3d
manually. Fixes an issue where sounds were centered on the bottom left
of speakers, not the middle (see cc-tweaked/cc-restitched#85).
See #1061, closes #1064.
Nobody ever seems to implement this correctly (though it's better than
1.12, at least we've not seen any crashes), and this isn't a fight I
care enough about fighting any more.
There are a couple of alternative ways to solve this. Ideally we'd send
our network messages at the same time as MC does
(ChunkManager.playerLoadedChunk), but this'd require a mixin.
Instead we just rely on the fact that if the chunk isn't loaded,
monitors won't have done anything and so we don't need to send their
contents!
Fixes #1047, probably doesn't cause any regressions. I've not seen any
issues on 1.16, but I also hadn't before so ¯\_(ツ)_/¯.
This was added in the 1.13 update and I'm still not sure why. Other mods
seem to get away without it, so I think it's fine to remove.
Also remove the fake net manager, as that's part of Forge nowadays.
Fixes #1044.
- Fixes #1026
- The remaining bytes counter wasn't being decremented, so the code that
splits off smaller packets was unreachable. Thus all file slices were
being put into a single UploadFileMessage packet.
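To illustrate the kind of bug described above, here is a minimal sketch of splitting slices into size-limited packets. It is not the real upload code: the types, names and the packet size limit are all hypothetical. The important line is the decrement of `remaining`; without it the "start a new packet" branch is never taken.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: group byte slices into packets of at most maxPacketSize bytes.
final class PacketSplitter {
    static List<List<byte[]>> split(List<byte[]> slices, int maxPacketSize) {
        List<List<byte[]>> packets = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        int remaining = maxPacketSize;

        for (byte[] slice : slices) {
            // Start a new packet if this slice would not fit in the current one.
            if (slice.length > remaining && !current.isEmpty()) {
                packets.add(current);
                current = new ArrayList<>();
                remaining = maxPacketSize;
            }
            current.add(slice);
            // Forgetting this decrement puts every slice in a single packet.
            remaining -= slice.length;
        }

        if (!current.isEmpty()) packets.add(current);
        return packets;
    }
}
```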
- Fix UpgradeSpeakerPeripheral not calling super.detach (so old
computers were never cleaned up)
- Correctly lock computer accesses inside SpeakerPeripheral
Fixes #1003.
Fingers crossed this is the last bug. Then I can bump the year and push
a new release tomorrow.
We're still a few days away from release, but I don't think anything else
is going to change. And I /really/ don't want to have to write this
changelog (and then merge into later versions) on the 25th.
While Minecraft will automatically push a new buffer when one is
exhausted, this doesn't help if there's only a single buffer in the
queue, and you end up with stutter.
By enqueuing a buffer when receiving sound we ensure there's always
something queued. I'm not 100% happy with this solution, but it does
alleviate some of the concerns in #993.
Also reduce the size of the client buffer to 0.5s from 1.5s. This is
still enough to ensure seamless audio when the server is running slow (I
tested at 13 tps, but should be able to go much worse).
When the game is paused in an SSP world, speakers are not ticked. However,
System.nanoTime() continues to increase, which means that on the next tick
speakers believe there has been a big jump and so schedule a bunch of
extra audio.
To avoid this, we keep track of how long the game has been paused and
offset nanoTime by that amount.
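The offsetting described above could look roughly like this. It is a minimal sketch with hypothetical names, and it takes the current time as a parameter (rather than calling System.nanoTime() itself) purely so the behaviour is easy to follow.

```java
// Illustrative only: a clock that does not advance while the game is paused.
final class PausableClock {
    private boolean paused = false;
    private long pausedAt = 0;     // raw time at which the current pause started
    private long pausedTotal = 0;  // total raw time spent paused so far

    void pause(long now) {
        if (paused) return;
        paused = true;
        pausedAt = now;
    }

    void resume(long now) {
        if (!paused) return;
        paused = false;
        pausedTotal += now - pausedAt;
    }

    // Raw time minus all time spent paused. While paused, this stays fixed
    // at the moment the pause began.
    long time(long now) {
        long effectiveNow = paused ? pausedAt : now;
        return effectiveNow - pausedTotal;
    }
}
```

In real use, `now` would be System.nanoTime(), and speakers would schedule audio against `time(now)` instead of the raw value.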
Fixes #994