Slow to get RGB

Open otomad opened this issue 1 year ago • 1 comments

I am creating a color picker with Unicolour and WriteableBitmap in WPF. Here is what I want to implement: (image)

When creating the left plane image, I found that getting the RGB colors is too slow.

int width = writeableBitmap.PixelWidth, height = writeableBitmap.PixelHeight;
unsafe {
    byte* pixels = (byte*)writeableBitmap.BackBuffer.ToPointer();
    writeableBitmap.Lock();
    Stopwatch sw = new();
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            int i = row * writeableBitmap.BackBufferStride + col * 3;
            // get the values of the color...
            Unicolour color = new(colorSpace, tuple);
            sw.Start();
            Rgb255 rgb = color.Rgb.Byte255;
            sw.Stop();
            pixels[i] = (byte)rgb.R;
            pixels[i + 1] = (byte)rgb.G;
            pixels[i + 2] = (byte)rgb.B;
        }
    }
    Debug.WriteLine(sw.Elapsed.TotalMilliseconds);
    writeableBitmap.AddDirtyRect(new Int32Rect(0, 0, width, height));
    writeableBitmap.Unlock();
}

I used a 128×128 image and the stopwatch showed that it took about 1143 milliseconds to convert the colors to RGB. This causes users to experience a lot of stuttering when using the color picker. Is there a way to fix this?

If the current color space is RGB/RGB255, the time drops to about 500 milliseconds, but that is still slow.

The right slider image is 1×128, and it doesn't noticeably stutter.

otomad, Oct 03 '24 09:10

Hmm, although I make a point in the readme of saying that "performance is not a priority", the performance you're describing is much worse than I'd expect and doesn't make sense to me.

In particular, it sounds like you're suggesting that it takes 500 ms to get Rgb values after constructing a Unicolour with Rgb values, which is bizarre - at the most there are 6 simple calculations (mapping R, G, B to and from the 255-range).
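To make the "6 simple calculations" concrete, here is a minimal sketch of mapping a channel to and from the 255 range. The class and method names are hypothetical, and the rounding and clamping choices are assumptions; this is not Unicolour's actual implementation.

```csharp
using System;

// Hypothetical helper for illustration only, not part of Unicolour.
public static class Rgb255Mapping
{
    // Map a 0-255 channel value to the 0-1 range.
    public static double ToUnit(double value255) => value255 / 255.0;

    // Map a 0-1 channel value back to 0-255, clamping out-of-range input
    // and rounding to the nearest integer (both choices are assumptions).
    public static int To255(double unit)
    {
        var clamped = Math.Max(0.0, Math.Min(1.0, unit));
        return (int)Math.Round(clamped * 255.0);
    }

    public static void Main()
    {
        // Three channels in, three channels out: six cheap operations per colour.
        double r = ToUnit(255), g = ToUnit(128), b = ToUnit(0);
        Console.WriteLine($"{To255(r)}, {To255(g)}, {To255(b)}");
    }
}
```

Each operation is a single multiply or divide plus a round, which is why hundreds of milliseconds for 16,384 colours looks out of proportion.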

Are you able to isolate the performance issue any further? For example, do you see the same slowness when not using WPF, or when using a different runtime? Release mode instead of debug mode? The only other thing I can think of is that, if the operations are running on the UI thread, the UI thread might be using some of those 500ms to render the UI, handle user input, etc?

For my own curiosity and reassurance I've added some benchmarking code to a separate branch using BenchmarkDotNet - maybe you could reproduce some conversions you're doing based on this example and see if it's still so slow outside of your WPF context?

It's not a controlled experiment but on my laptop I ran benchmarks for converting from Rgb to every other colour space on both .NET 8.0 and .NET Framework 4.7.2 environments and I don't see anything surprising:

  • It is quick to return Rgb (1.5 - 4.5 μs) because there is no transformation to do from Rgb
  • It is slower to return spaces that require multiple complex transformations, such as Cam02 (6.7 - 14.4 μs), Cam16 (5.3 - 12.5 μs), Hct (5.6 - 13.4 μs)
  • It is very slow to calculate Wxy (108.2 - 135.4 μs) because that involves an intensive search algorithm
  • .NET Framework 4.7.2 is slower than .NET 8.0

Note that these metrics include the Unicolour construction, not just retrieving the converted value as your stopwatch example used.

In your particular case, for a 128 x 128 image, on my machine using the slower .NET Framework, I would expect the 16,384 conversions to take ~74,000 μs, or 74 ms - a lot less concerning than 500 ms!
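As a worked check of that estimate, assuming roughly 4.5 μs per conversion (the upper end of the range quoted for Rgb on .NET Framework; the exact per-conversion figure used is an assumption):

$$16384 \times 4.5\ \mu\text{s} = 73{,}728\ \mu\text{s} \approx 74\ \text{ms}$$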

For reference, here are the full results on my machine


BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4169/23H2/2023Update/SunValley3)
13th Gen Intel Core i7-13700H, 1 CPU, 20 logical and 14 physical cores
  [Host]               : .NET Framework 4.8.1 (4.8.9261.0), X64 RyuJIT VectorSize=256
  .NET 8.0             : .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2
  .NET Framework 4.7.2 : .NET Framework 4.8.1 (4.8.9261.0), X64 RyuJIT VectorSize=256


| Method | Job | Runtime | TargetColourSpace | Mean | Error | StdDev |
|---|---|---|---|---|---|---|
| Convert | .NET 8.0 | .NET 8.0 | Rgb | 1.503 μs | 0.0145 μs | 0.0129 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Rgb | 4.480 μs | 0.0363 μs | 0.0339 μs |
| Convert | .NET 8.0 | .NET 8.0 | Rgb255 | 1.581 μs | 0.0099 μs | 0.0088 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Rgb255 | 4.565 μs | 0.0459 μs | 0.0430 μs |
| Convert | .NET 8.0 | .NET 8.0 | RgbLinear | 1.967 μs | 0.0242 μs | 0.0202 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | RgbLinear | 5.166 μs | 0.0736 μs | 0.0615 μs |
| Convert | .NET 8.0 | .NET 8.0 | Hsb | 1.687 μs | 0.0218 μs | 0.0182 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Hsb | 4.980 μs | 0.0470 μs | 0.0367 μs |
| Convert | .NET 8.0 | .NET 8.0 | Hsl | 1.772 μs | 0.0219 μs | 0.0171 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Hsl | 5.166 μs | 0.0390 μs | 0.0365 μs |
| Convert | .NET 8.0 | .NET 8.0 | Hwb | 1.771 μs | 0.0139 μs | 0.0130 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Hwb | 5.103 μs | 0.0481 μs | 0.0450 μs |
| Convert | .NET 8.0 | .NET 8.0 | Hsi | 1.677 μs | 0.0223 μs | 0.0209 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Hsi | 4.880 μs | 0.0483 μs | 0.0452 μs |
| Convert | .NET 8.0 | .NET 8.0 | Xyz | 2.192 μs | 0.0418 μs | 0.0481 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Xyz | 5.673 μs | 0.0672 μs | 0.0628 μs |
| Convert | .NET 8.0 | .NET 8.0 | Xyy | 2.324 μs | 0.0379 μs | 0.0355 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Xyy | 5.998 μs | 0.0350 μs | 0.0327 μs |
| Convert | .NET 8.0 | .NET 8.0 | Wxy | 108.223 μs | 1.5660 μs | 1.3077 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Wxy | 135.388 μs | 1.5441 μs | 1.3688 μs |
| Convert | .NET 8.0 | .NET 8.0 | Lab | 2.535 μs | 0.0486 μs | 0.0632 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Lab | 6.320 μs | 0.1140 μs | 0.1066 μs |
| Convert | .NET 8.0 | .NET 8.0 | Lchab | 2.762 μs | 0.0305 μs | 0.0270 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Lchab | 6.745 μs | 0.0749 μs | 0.0701 μs |
| Convert | .NET 8.0 | .NET 8.0 | Luv | 2.384 μs | 0.0211 μs | 0.0176 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Luv | 6.143 μs | 0.0943 μs | 0.0836 μs |
| Convert | .NET 8.0 | .NET 8.0 | Lchuv | 2.698 μs | 0.0487 μs | 0.0432 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Lchuv | 6.523 μs | 0.0771 μs | 0.0721 μs |
| Convert | .NET 8.0 | .NET 8.0 | Hsluv | 3.395 μs | 0.0559 μs | 0.0523 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Hsluv | 8.472 μs | 0.0347 μs | 0.0290 μs |
| Convert | .NET 8.0 | .NET 8.0 | Hpluv | 3.530 μs | 0.0421 μs | 0.0373 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Hpluv | 8.027 μs | 0.0710 μs | 0.0629 μs |
| Convert | .NET 8.0 | .NET 8.0 | Ypbpr | 1.628 μs | 0.0207 μs | 0.0194 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Ypbpr | 4.661 μs | 0.0758 μs | 0.0709 μs |
| Convert | .NET 8.0 | .NET 8.0 | Ycbcr | 1.728 μs | 0.0149 μs | 0.0139 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Ycbcr | 4.866 μs | 0.0257 μs | 0.0240 μs |
| Convert | .NET 8.0 | .NET 8.0 | Ycgco | 1.637 μs | 0.0126 μs | 0.0118 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Ycgco | 4.711 μs | 0.0715 μs | 0.0669 μs |
| Convert | .NET 8.0 | .NET 8.0 | Yuv | 1.692 μs | 0.0333 μs | 0.0547 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Yuv | 4.654 μs | 0.0503 μs | 0.0470 μs |
| Convert | .NET 8.0 | .NET 8.0 | Yiq | 1.901 μs | 0.0195 μs | 0.0182 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Yiq | 5.177 μs | 0.0477 μs | 0.0446 μs |
| Convert | .NET 8.0 | .NET 8.0 | Ydbdr | 1.720 μs | 0.0186 μs | 0.0174 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Ydbdr | 5.020 μs | 0.0500 μs | 0.0468 μs |
| Convert | .NET 8.0 | .NET 8.0 | Tsl | 1.825 μs | 0.0351 μs | 0.0390 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Tsl | 4.814 μs | 0.0458 μs | 0.0429 μs |
| Convert | .NET 8.0 | .NET 8.0 | Xyb | 2.426 μs | 0.0314 μs | 0.0293 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Xyb | 6.344 μs | 0.0319 μs | 0.0299 μs |
| Convert | .NET 8.0 | .NET 8.0 | Ipt | 2.703 μs | 0.0401 μs | 0.0356 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Ipt | 6.815 μs | 0.0670 μs | 0.0594 μs |
| Convert | .NET 8.0 | .NET 8.0 | Ictcp | 3.093 μs | 0.0499 μs | 0.0466 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Ictcp | 7.059 μs | 0.0517 μs | 0.0483 μs |
| Convert | .NET 8.0 | .NET 8.0 | Jzazbz | 3.148 μs | 0.0266 μs | 0.0249 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Jzazbz | 7.448 μs | 0.0416 μs | 0.0368 μs |
| Convert | .NET 8.0 | .NET 8.0 | Jzczhz | 3.346 μs | 0.0265 μs | 0.0235 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Jzczhz | 8.020 μs | 0.1405 μs | 0.1314 μs |
| Convert | .NET 8.0 | .NET 8.0 | Oklab | 3.082 μs | 0.0290 μs | 0.0271 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Oklab | 7.177 μs | 0.0522 μs | 0.0488 μs |
| Convert | .NET 8.0 | .NET 8.0 | Oklch | 3.274 μs | 0.0278 μs | 0.0260 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Oklch | 7.764 μs | 0.1350 μs | 0.1263 μs |
| Convert | .NET 8.0 | .NET 8.0 | Okhsv | 6.717 μs | 0.1184 μs | 0.1108 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Okhsv | 13.080 μs | 0.1087 μs | 0.1016 μs |
| Convert | .NET 8.0 | .NET 8.0 | Okhsl | 5.525 μs | 0.1100 μs | 0.1578 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Okhsl | 10.661 μs | 0.0625 μs | 0.0584 μs |
| Convert | .NET 8.0 | .NET 8.0 | Okhwb | 6.670 μs | 0.0391 μs | 0.0366 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Okhwb | 13.323 μs | 0.1095 μs | 0.1024 μs |
| Convert | .NET 8.0 | .NET 8.0 | Cam02 | 6.682 μs | 0.0636 μs | 0.0531 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Cam02 | 14.399 μs | 0.0815 μs | 0.0762 μs |
| Convert | .NET 8.0 | .NET 8.0 | Cam16 | 5.275 μs | 0.0656 μs | 0.0613 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Cam16 | 12.523 μs | 0.1659 μs | 0.1552 μs |
| Convert | .NET 8.0 | .NET 8.0 | Hct | 5.627 μs | 0.0464 μs | 0.0362 μs |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | Hct | 13.380 μs | 0.2327 μs | 0.2177 μs |

waacton, Oct 04 '24 13:10

I am using .NET Framework 4.8. I now run the following code directly in the Main() function in the WPF/WinForm Program.cs, with all UI code removed:

internal static class Program {
	public static void Main() {
		Stopwatch sw = new Stopwatch();
		byte[] bytes = new byte[128 * 128 * 3];
		for (int i = 0; i < 128; i++) {
			for (int j = 0; j < 128; j++) {
				int k = i * 128 * 3 + j * 3;
				Unicolour color = new Unicolour(ColourSpace.Rgb255, 255, i / 128 * 255, j / 128 * 255);
				sw.Start();
				Rgb255 rgb = color.Rgb.Byte255;
				sw.Stop();
				bytes[k] = (byte)rgb.R;
				bytes[k + 1] = (byte)rgb.G;
				bytes[k + 2] = (byte)rgb.B;
			}
		}
		Debug.WriteLine(sw.ElapsedMilliseconds);
	}
}

As you can see, this will create the Unicolour object and convert it to RGB $$128 \times 128 = 16384$$ times.
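One detail worth noting about the snippet above: because `i` and `j` are `int`, the expressions `i / 128 * 255` and `j / 128 * 255` use integer division, so they evaluate to 0 for every index from 0 to 127 and all 16,384 colours come out as (255, 0, 0). This probably does not change the timings much, but it does mean the loop is not producing a gradient. A minimal sketch of the difference (the class and method names are hypothetical):

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ChannelScaling
{
    // Integer arithmetic: i / 128 is evaluated first and truncates to 0 for 0 <= i < 128.
    public static int ScaleWithIntegerDivision(int i) => i / 128 * 255;

    // Floating-point arithmetic: the 128.0 literal promotes the division to double.
    public static double ScaleWithDoubleDivision(int i) => i / 128.0 * 255;

    public static void Main()
    {
        foreach (var i in new[] { 0, 64, 127 })
        {
            Console.WriteLine(
                $"i={i}: int={ScaleWithIntegerDivision(i)}, double={ScaleWithDoubleDivision(i)}");
        }
    }
}
```

Writing the divisor as `128.0` is enough to get the intended 0 to 255 ramp.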

The tests are as follows:

| Config | Time |
|---|---|
| Debug x64 | 289 ms |
| Release x64 | 200 ms |

Next, I created a new .NET Framework 4.8 Console Application project with the same code (only replacing Debug.WriteLine with Console.WriteLine). The results are as follows:

| Config | Time |
|---|---|
| Debug Any CPU | 330 ms |
| Release Any CPU | 313 ms |
| Debug x64 | 213 ms |
| Release x64 | 214 ms |

It still seems to take more than 72 ms.

Windows 11 (10.0.22631.4169)
12th Gen Intel Core i5-12600KF, 1 CPU, 16 logical and 10 physical cores

otomad, Oct 05 '24 15:10

I've been investigating with some profiling tools and I can't find anything concerning around reading the Rgb255 values in this way, but I do see significant slowdown when I attach a debugger. My results aren't in the same range as yours but it consistently shows that:

  • the code executes quickly when just "run"
  • the code executes much more slowly when debugging instead of running

And this seems to be regardless of debug mode vs release mode.

Using .NET Framework 4.8, debug mode, Any CPU, reading 16,384 Rgb255 values using your exact code, on my laptop:

| Project Type | Execution Mode | Duration |
|---|---|---|
| WinForms | Run | 8 ms |
| WinForms | Debugging | 1,534 ms |
| WPF | Run | 11 ms |
| WPF | Debugging | 1,659 ms |
| Console | Run | 11 ms |
| Console | Debugging | 1,478 ms |

I'm hoping then that it's just a case of a debugger being attached but I'm not familiar with how much overhead a debugger actually comes with.
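One way to guard against this when taking measurements is to have the timing code report whether a debugger is attached. The sketch below uses `System.Diagnostics.Debugger.IsAttached` from the .NET base class library; the `TimingHarness` class and the placeholder workload are hypothetical and stand in for the Unicolour conversion loop.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for illustration only.
public static class TimingHarness
{
    // Runs the action the given number of times and returns the elapsed milliseconds.
    public static double TimeMilliseconds(Action action, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        // Debugger.IsAttached is true when the process was started under a debugger.
        if (Debugger.IsAttached)
        {
            Console.WriteLine("Warning: debugger attached, timings are likely inflated.");
        }

        double sink = 0; // accumulate results so the work has an observable effect
        var elapsed = TimeMilliseconds(() => sink += Math.Sqrt(sink + 1), 16384);
        Console.WriteLine($"{elapsed} ms (sink={sink})");
    }
}
```

Timing the whole loop once, rather than starting and stopping the stopwatch per pixel, also keeps the stopwatch's own overhead out of the measurement.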

â„đïļ However! â„đïļ

While testing and profiling, I found that the Unicolour constructor was taking a fair bit longer relative to other functions, and I think there's an easy change that could halve the time it takes to construct. For my laptop that means ~0.002 ms per construction but it adds up.

For example, running (not debugging!) this code...

var stopwatch = Stopwatch.StartNew();
for (var i = 0; i < count; i++) 
{
    var colour = new Unicolour(ColourSpace.Rgb255, 255, i / 128.0 * 255, i / 128.0 * 255);
    var rgb = colour.Rgb.Byte255;
}

Console.WriteLine(stopwatch.ElapsedMilliseconds);

... on my laptop results in:

| Count | Changed? | Duration |
|---|---|---|
| 16,384 | No | 146 ms |
| 16,384 | Yes | 103 ms |
| 1,000,000 | No | 4,188 ms |
| 1,000,000 | Yes | 1,731 ms |

Again, it's not a controlled experiment, but it's quite the difference.


In summary

  1. I can't reproduce any performance issue in the reading of Rgb255, either with benchmarking or profiling, except when using a debugger - at which point everything slows down a lot (I'm using JetBrains Rider if that's of any interest)
  2. There's definitely an opportunity to speed up the Unicolour constructor. I want to reiterate that my primary focus with Unicolour is correctness, not speed, but I'm happy to make performance improvements if they don't require huge changes! It'll take me a bit of time to make the update (I'm a little busier than usual at the moment!) though I don't think the change will make a big difference in your case

waacton, Oct 06 '24 16:10

Thank you for testing. I look forward to your optimizations and performance improvements.

For now, I am working around it by reducing the image size to 32×32 and then stretching the image. This produces anomalies in some color gradient transition areas, but at least it speeds up the process.

otomad, Oct 06 '24 19:10

@otomad I'd be interested to know if running your WPF app without a debugger still has the same performance issues, because everything I've seen indicates it should be OK. I'm happy to take a look at the app itself if you're willing to invite me to a repo.

One extra thing that I'd forgotten to mention: the first time you construct a Unicolour will take quite a bit longer than the rest (perhaps ~50 ms). Behind the scenes it's caching a lot of data needed for some of the more complex calculations. As long as you continue to use the same Configuration instance (which you will if you don't override it), subsequent Unicolour constructions will be quicker. If you trigger the instantiation of Configuration.Default before you try to use your first Unicolour, you should also see some improvement.

For example:

var stopwatch = Stopwatch.StartNew();
for (var i = 0; i < 16384; i++) 
{
    var colour = new Unicolour(ColourSpace.Rgb255, 255, 255, 255);
    var rgb = colour.Rgb.Byte255;
}

Console.WriteLine(stopwatch.ElapsedMilliseconds);

| Code | Duration |
|---|---|
| Current | 143 ms |
| With constructor change | 106 ms |

_ = Configuration.Default; // 👈 only needed once, makes first Unicolour initialise faster

var stopwatch = Stopwatch.StartNew();
for (var i = 0; i < 16384; i++) 
{
    var colour = new Unicolour(ColourSpace.Rgb255, 255, 255, 255);
    var rgb = colour.Rgb.Byte255;
}

Console.WriteLine(stopwatch.ElapsedMilliseconds);

| Code | Duration |
|---|---|
| Current | 75 ms |
| With constructor change | 34 ms |

(This is because new Unicolour(ColourSpace.Rgb255, r, g, b) is equivalent to new Unicolour(Configuration.Default, ColourSpace.Rgb255, r, g, b).)

Hopefully that'll help a bit too.

waacton, Oct 07 '24 10:10

Constructor performance improvements have been merged in https://github.com/waacton/Unicolour/commit/dc70ed58f69c0910fab9a79f3591b204658ea1b7 and are available in version 4.7.0 🙂

With this change the constructor won't unintentionally trigger any calculations. And for further improvements you can do the things I mentioned above:

  1. Initialise Configuration early, before creating Unicolour objects
  2. Use run instead of debug when looking at performance.

waacton, Nov 01 '24 12:11

As a follow-up to this, I've reworked how a Unicolour is constructed in 6.0.0, and some naive benchmarks suggest it's 95% faster.

Results for reference


BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3775)
13th Gen Intel Core i7-13700H, 1 CPU, 20 logical and 14 physical cores
  [Host]               : .NET Framework 4.8.1 (4.8.9300.0), X64 RyuJIT VectorSize=256
  .NET 8.0             : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2
  .NET Framework 4.7.2 : .NET Framework 4.8.1 (4.8.9300.0), X64 RyuJIT VectorSize=256


| Method | Job | Runtime | Mean | Error | StdDev |
|---|---|---|---|---|---|
| Construct | .NET 8.0 | .NET 8.0 | 37.79 ns | 0.334 ns | 0.296 ns |
| Convert | .NET 8.0 | .NET 8.0 | 52.92 ns | 1.059 ns | 0.991 ns |
| Construct | .NET Framework 4.7.2 | .NET Framework 4.7.2 | 45.27 ns | 0.656 ns | 0.614 ns |
| Convert | .NET Framework 4.7.2 | .NET Framework 4.7.2 | 61.74 ns | 0.489 ns | 0.457 ns |
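
For a rough sense of scale, here is the 128 x 128 scenario from earlier in the thread using the slower Convert figure. This assumes the Convert benchmark covers the same per-colour work as the original loop, which the results alone do not confirm, so treat it as indicative only:

$$16384 \times 61.74\ \text{ns} \approx 1{,}011{,}548\ \text{ns} \approx 1.01\ \text{ms}$$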

waacton, Apr 22 '25 22:04