Minecraft-Console-Client icon indicating copy to clipboard operation
Minecraft-Console-Client copied to clipboard

Bug fix & Performance Optimization

Open BruceChenQAQ opened this issue 3 years ago • 5 comments

Bug fix: Cancel chunk load task when switching worlds

fix #2118

I will record some of the before and after optimization comparisons below.

BruceChenQAQ avatar Aug 24 '22 10:08 BruceChenQAQ

Check Vs WithoutCheck

image

Code
using System.Diagnostics;

const string Format = "{0,7:0.000} ";
const int TotalPasses = 25000;
const int Size = 50;
Stopwatch timer = new();

var functionList = new List<Action> { WithCheck, WithoutCheck };

Console.WriteLine("{0,5}{1,20}{2,20}{3,20}{4,20}", "Run", "Ticks", "ms", "Ticks/Instance", "ms/Instance");

foreach (var item in functionList)
{
    var warmup = Test(item);
    var run = Test(item);

    Console.WriteLine($"{item.Method.Name}:");
    PrintResult("warmup", warmup);
    PrintResult("run", run);
    Console.WriteLine();
}

static void PrintResult(string name, long ticks)
{
    Console.WriteLine("{0,10}{1,20}{2,20}{3,20}{4,20}", name, ticks, string.Format(Format, (decimal)ticks / TimeSpan.TicksPerMillisecond), (decimal)ticks / TotalPasses, (decimal)ticks / TotalPasses / TimeSpan.TicksPerMillisecond);
}

long Test(Action func)
{
    timer.Restart();
    func();
    timer.Stop();
    return timer.ElapsedTicks;
}

static void WithCheck()
{
    ushort[,,] blocks = new ushort[16, 16, 16];

    for (var passes = 0; passes < TotalPasses; passes++)
    {
        for (int chunkY = 0; chunkY < 24; chunkY++)
        {
            for (int blockY = 0; blockY < 16; blockY++)
            {
                for (int blockZ = 0; blockZ < 16; blockZ++)
                {
                    for (int blockX = 0; blockX < 16; blockX++)
                    {
                        if (blockX < 0 || blockX >= 16)
                            throw new ArgumentOutOfRangeException("blockX", "Must be between 0 and " + (16 - 1) + " (inclusive)");
                        if (blockY < 0 || blockY >= 16)
                            throw new ArgumentOutOfRangeException("blockY", "Must be between 0 and " + (16 - 1) + " (inclusive)");
                        if (blockZ < 0 || blockZ >= 16)
                            throw new ArgumentOutOfRangeException("blockZ", "Must be between 0 and " + (16 - 1) + " (inclusive)");

                        blocks[blockX, blockY, blockZ] = (ushort)(blockY * blockZ * blockX);
                    }
                }
            }
        }
    }
}

static void WithoutCheck()
{
    ushort[,,] blocks = new ushort[16, 16, 16];

    for (var passes = 0; passes < TotalPasses; passes++)
    {
        for (int chunkY = 0; chunkY < 24; chunkY++)
        {
            for (int blockY = 0; blockY < 16; blockY++)
            {
                for (int blockZ = 0; blockZ < 16; blockZ++)
                {
                    for (int blockX = 0; blockX < 16; blockX++)
                    {
                        blocks[blockX, blockY, blockZ] = (ushort)(blockY * blockZ * blockX);
                    }
                }
            }
        }
    }
}

BruceChenQAQ avatar Aug 24 '22 13:08 BruceChenQAQ

  • new ReadNextVarInt
public int ReadNextVarInt(Queue<byte> cache)
{
    int i = 0;
    int j = 0;
    byte b;
    do
    {
        b = cache.Dequeue();
        i |= (b & 127) << j++ * 7;
        if (j > 5) throw new OverflowException("VarInt too big");
    } while ((b & 128) == 128);
    return i;
}
2022-08-24_222002
  • old ReadNextVarInt
string rawData = BitConverter.ToString(cache.ToArray());
int i = 0;
int j = 0;
int k = 0;
while (true)
{
    k = ReadNextByte(cache);
    i |= (k & 0x7F) << j++ * 7;
    if (j > 5) throw new OverflowException("VarInt too big " + rawData);
    if ((k & 0x80) != 128) break;
}
return i;
2022-08-24_221809

BruceChenQAQ avatar Aug 24 '22 14:08 BruceChenQAQ

Before: 65 seconds until chunks is fully loaded. Memory usage: 260MiB

2022-08-25_014816

After: 17 seconds until chunks is fully loaded. Memory usage: 138MiB

2022-08-25_013319

BruceChenQAQ avatar Aug 24 '22 17:08 BruceChenQAQ

I removed the locks from Chunk.cs and ChunkColumn.cs and now they should still be thread safe. There may be some oversight, so if you find something that might be wrong, please point it out in the comments. :)

BruceChenQAQ avatar Aug 24 '22 17:08 BruceChenQAQ

9.7 seconds until chunks is fully loaded. Memory usage: 73MiB

image


Before joining a server: image

After joining a server: image

BruceChenQAQ avatar Aug 25 '22 06:08 BruceChenQAQ

5.1 seconds until chunks fully loaded.

image

BruceChenQAQ avatar Aug 27 '22 15:08 BruceChenQAQ

2.8 seconds & 58MiB

Now using the AES-NI instruction set for greater throughput and lower CPU usage than the previous version using BouncyCastle.

On devices that do not support this instruction set (SSE2 and AES-NI), the AES implementation from the standard library will be used and parallel processing will be enabled.

public override int Read(byte[] buffer, int outOffset, int required)
{
    if (inStreamEnded)
        return 0;

    Span<byte> blockOutput = FastAes != null ? stackalloc byte[blockSize] : null;

    byte[] inputBuf = new byte[blockSize + required];
    Array.Copy(ReadStreamIV, inputBuf, blockSize);

    for (int readed = 0, curRead; readed < required; readed += curRead)
    {
        curRead = BaseStream.Read(inputBuf, blockSize + readed, required - readed);
        if (curRead == 0)
        {
            inStreamEnded = true;
            return readed;
        }

        int processEnd = readed + curRead;
        if (FastAes != null)
        {
            for (int idx = readed; idx < processEnd; idx++)
            {
                ReadOnlySpan<byte> blockInput = new(inputBuf, idx, blockSize);
                FastAes.EncryptEcb(blockInput, blockOutput);
                buffer[outOffset + idx] = (byte)(blockOutput[0] ^ inputBuf[idx + blockSize]);
            }
        }
        else
        {
            OrderablePartitioner<Tuple<int, int>> rangePartitioner = curRead <= 256 ?
                Partitioner.Create(readed, processEnd, 32) : Partitioner.Create(readed, processEnd);
            Parallel.ForEach(rangePartitioner, (range, loopState) =>
            {
                Span<byte> blockOutput = stackalloc byte[blockSize];
                for (int idx = range.Item1; idx < range.Item2; idx++)
                {
                    ReadOnlySpan<byte> blockInput = new(inputBuf, idx, blockSize);
                    Aes!.EncryptEcb(blockInput, blockOutput, PaddingMode.None);
                    buffer[outOffset + idx] = (byte)(blockOutput[0] ^ inputBuf[idx + blockSize]);
                }
            });
        }
    }

    Array.Copy(inputBuf, required, ReadStreamIV, 0, blockSize);

    return required;
}

image


Before joining a server: image

After joining a server: image

BruceChenQAQ avatar Aug 28 '22 07:08 BruceChenQAQ

https://stackoverflow.com/questions/468832/why-are-multi-dimensional-arrays-in-net-slower-than-normal-arrays

image

Test Code
using System.Diagnostics;

const string Format = "{0,7:0.000} ";
const int TotalPasses = 25000;
const int Size = 50;
Stopwatch timer = new();

var functionList = new List<Action> { XYZ, XYZ_Flat, YZX, YZX_Flat };

Console.WriteLine("{0,5}{1,20}{2,20}{3,20}{4,20}", "Run", "Ticks", "ms", "Ticks/Instance", "ms/Instance");

foreach (var item in functionList)
{
    var warmup = Test(item);
    var run = Test(item);

    Console.WriteLine($"{item.Method.Name}:");
    PrintResult("warmup", warmup);
    PrintResult("run", run);
    Console.WriteLine();
}

static void PrintResult(string name, long ticks)
{
    Console.WriteLine("{0,10}{1,20}{2,20}{3,20}{4,20}", name, ticks, string.Format(Format, (decimal)ticks / TimeSpan.TicksPerMillisecond), (decimal)ticks / TotalPasses, (decimal)ticks / TotalPasses / TimeSpan.TicksPerMillisecond);
}

long Test(Action func)
{
    timer.Restart();
    func();
    timer.Stop();
    return timer.ElapsedTicks;
}

static void XYZ()
{
    ushort[,,] blocks = new ushort[16, 16, 16];

    for (var passes = 0; passes < TotalPasses; passes++)
    {
        for (int chunkY = 0; chunkY < 24; chunkY++)
        {
            for (int blockY = 0; blockY < 16; blockY++)
            {
                for (int blockZ = 0; blockZ < 16; blockZ++)
                {
                    for (int blockX = 0; blockX < 16; blockX++)
                    {
                        blocks[blockX, blockY, blockZ] = (ushort)(blockY * blockZ * blockX);
                    }
                }
            }
        }
    }
}

static void YZX()
{
    ushort[,,] blocks = new ushort[16, 16, 16];

    for (var passes = 0; passes < TotalPasses; passes++)
    {
        for (int chunkY = 0; chunkY < 24; chunkY++)
        {
            for (int blockY = 0; blockY < 16; blockY++)
            {
                for (int blockZ = 0; blockZ < 16; blockZ++)
                {
                    for (int blockX = 0; blockX < 16; blockX++)
                    {
                        blocks[blockY, blockZ, blockX] = (ushort)(blockY * blockZ * blockX);
                    }
                }
            }
        }
    }
}

static void XYZ_Flat()
{
    ushort[] blocks = new ushort[16 * 16 * 16];

    for (var passes = 0; passes < TotalPasses; passes++)
    {
        for (int chunkY = 0; chunkY < 24; chunkY++)
        {
            for (int blockY = 0; blockY < 16; blockY++)
            {
                for (int blockZ = 0; blockZ < 16; blockZ++)
                {
                    for (int blockX = 0; blockX < 16; blockX++)
                    {
                        blocks[(blockX << 8) | (blockY << 4) | blockZ] = (ushort)(blockY * blockZ * blockX);
                    }
                }
            }
        }
    }
}

static void YZX_Flat()
{
    ushort[] blocks = new ushort[16 * 16 * 16];

    for (var passes = 0; passes < TotalPasses; passes++)
    {
        for (int chunkY = 0; chunkY < 24; chunkY++)
        {
            for (int blockY = 0; blockY < 16; blockY++)
            {
                for (int blockZ = 0; blockZ < 16; blockZ++)
                {
                    for (int blockX = 0; blockX < 16; blockX++)
                    {
                        blocks[(blockY << 8) | (blockZ << 4) | blockX] = (ushort)(blockY * blockZ * blockX);
                    }
                }
            }
        }
    }
}


After joining a server: image

BruceChenQAQ avatar Aug 30 '22 04:08 BruceChenQAQ

image 111 + 172 + 15 + 490 + 131 + 428 = 1347ms 111 + 172 + 15 + 456 + 490 + 131 + 428 = 1803ms


Initialization stage(Before join server) image

Joining a server and chunk loading image

Chunk loading completed, traffic reduced image

image

BruceChenQAQ avatar Aug 30 '22 15:08 BruceChenQAQ

Great job! The performance gain is fascinating :o

ORelio avatar Aug 31 '22 19:08 ORelio

Can you rename the protocolversion to protocolVersion so it matches the master, so there is no conflicts?

milutinke avatar Sep 01 '22 22:09 milutinke

Okay, renamed.

BruceChenQAQ avatar Sep 02 '22 01:09 BruceChenQAQ

@BruceChenQAQ Thanks for your work on this. You seems to coordinate well with @milutinke so I'm adding you as well as maintainer so you can work more efficiently on the code and merge this PR when appropriate 👍

ORelio avatar Sep 03 '22 13:09 ORelio

Pull from the master the web UI can't work with this much conflics, I'll do testing again on multiple servers, especially with terrain handling. Btw, awesome job, this is going to improve MCC so much.

milutinke avatar Sep 03 '22 19:09 milutinke

Conflict resolution has been completed, I'll do some testing later. ~There are still some issue found with the handling of MessageHeader in 1.19.2~ Edit: Fixed.

BruceChenQAQ avatar Sep 04 '22 03:09 BruceChenQAQ

I tested terrain handling in an official server and a paper server respectively, and they worked fine.

@milutinke Let's merge it if all functions are working properly.

BruceChenQAQ avatar Sep 04 '22 11:09 BruceChenQAQ

Is this Console Interactive bug? image

Edit: The only remaining things are the bug above and not being able to path-find when in 2x1 tunnel. Everything else works.

milutinke avatar Sep 04 '22 17:09 milutinke

Is this Console Interactive bug?

Partially yes, this should be fixed in https://github.com/MCCTeam/Minecraft-Console-Client/pull/2124/commits/b26949e483a289934a40f2a65aa34b53988351bb and https://github.com/breadbyte/ConsoleInteractive/pull/14 and https://github.com/breadbyte/ConsoleInteractive/pull/12.


Edit: The only remaining things are the bug above and not being able to path-find when in 2x1 tunnel. Everything else works.

I tested that it is pathfinding in the tunnel, could it be that the coordinates are wrong? The coordinates of the Block row should be used, or the coordinates of the XYZ row should be rounded down (-42.1 -> -43, +43.9 -> +43) Maybe it should be reminded in the documentation.

image

BruceChenQAQ avatar Sep 05 '22 01:09 BruceChenQAQ

Is this Console Interactive bug?

Partially yes, this should be fixed in b26949e and breadbyte/ConsoleInteractive#14 and breadbyte/ConsoleInteractive#12.

Edit: The only remaining things are the bug above and not being able to path-find when in 2x1 tunnel. Everything else works.

I tested that it is pathfinding in the tunnel, could it be that the coordinates are wrong? The coordinates of the Block row should be used, or the coordinates of the XYZ row should be rounded down (-42.1 -> -43, +43.9 -> +43) Maybe it should be reminded in the documentation.

image

image

milutinke avatar Sep 05 '22 11:09 milutinke

@milutinke Try using /move 35 71 -35

BruceChenQAQ avatar Sep 05 '22 11:09 BruceChenQAQ

/move 35 71 -35

That works. It's not really intuitive.

milutinke avatar Sep 05 '22 11:09 milutinke

/move 35 71 -35

That works. It's not really intuitive.

Yes, I also encountered this problem when I started testing. Because people are used to rounding to zero, rather than strictly rounding down.

BruceChenQAQ avatar Sep 05 '22 11:09 BruceChenQAQ

Perhaps it would be better to upgrade the move command to accept floating point and relative coordinate representations.

BruceChenQAQ avatar Sep 05 '22 11:09 BruceChenQAQ

Perhaps it would be better to upgrade the move command to accept floating point and relative coordinate representations.

The message appeared after the merge xD That could be done in a separate PR.

milutinke avatar Sep 05 '22 11:09 milutinke

Good job 🎉

ORelio avatar Sep 05 '22 11:09 ORelio

Updated the ConsoleInteractive submodule as well, as it wasn't included in this PR (BruceChenQAQ added in fixes too)

[The recent build is failing because there seems to be a network issue with the github action runner, will try again later.]

breadbyte avatar Sep 05 '22 12:09 breadbyte

The message appeared after the merge xD That could be done in a separate PR.

Implemented in #2154, it will now move to the exact coordinates entered. :)

BruceChenQAQ avatar Sep 05 '22 14:09 BruceChenQAQ