speed up texture builtin tests
Putting this here (1) as a TODO, (2) to get input from others, and (3) as a place to keep some notes.
Profiling the texture builtin tests, the majority of the time is spent in `TexelView.fromTexelsAsColors`, making a texture with random data from a generator. At a glance, most of the time goes to small typed-array creation (in Chrome) per texel and/or per component, and to small JS object creation (lots and lots of temps).
There are at least 2 issues:

- The code takes the u32 hash, divides by 0xFFFFFFFF to get a number from 0.0 to 1.0, and multiplies that by the test range for the texture format (0 to 1 for unorm, -1 to 1 for snorm, 0 to 65535 for 16uint, etc...). It then passes the result to `quantize`, which goes through a fairly deep path of temporaries: creating a texel of the given format in binary and then pulling it back out. So, for example, a 2-bit alpha value should be quantized to one of 4 values.

  ```ts
  const quantize = (texel: PerTexelComponent<number>, rep: TexelRepresentationInfo) => {
    return rep.bitsToNumber(rep.unpackBits(new Uint8Array(rep.pack(rep.encode(texel)))));
  };
  ```

  The quantization is needed because the software renderer references the TexelViews, so it needs to see the same values the GPU will see.

- It calls whatever code is in `fromTexelsAsColors` to convert the texel to the binary format.
These are both slow.
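For reference, the value-generation path described in the first issue looks roughly like this. This is only a sketch: `hashU32` and `randomValueForRange` are hypothetical stand-ins, not the CTS's actual hash or helpers.

```typescript
// Stand-in for the CTS's u32 hash (not the real implementation).
function hashU32(seed: number): number {
  let h = seed | 0;
  h = Math.imul(h ^ (h >>> 16), 0x45d9f3b);
  h = Math.imul(h ^ (h >>> 16), 0x45d9f3b);
  return (h ^ (h >>> 16)) >>> 0;
}

// Map the u32 hash to the test range for the format:
// [0, 1] for unorm, [-1, 1] for snorm, [0, 65535] for 16uint, etc.
function randomValueForRange(seed: number, min: number, max: number): number {
  const t = hashU32(seed) / 0xffffffff; // 0.0 to 1.0
  return min + t * (max - min);
}
```

The result of `randomValueForRange` is what then gets fed through the `quantize` round trip above, once per texel.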
Some ideas for speeding it up.
I tried optimizing the quantization by adding custom, per-format quantizers. That's easy for snorm/unorm/uint/sint formats, and it made the tests 40% faster (1000 → 600). That still leaves the remaining formats slow.
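A per-format quantizer for those cases can skip the pack/encode/unpack round trip and its temporaries entirely. A minimal sketch, where the helper names and signatures are hypothetical (not the CTS's actual API) and `bits` is the component's bit width:

```typescript
// Hypothetical direct quantizers for normalized/integer components.
// These aim to produce the same values as the pack/unpack round trip,
// without allocating any temporaries.
const quantizeUnorm = (v: number, bits: number): number => {
  const max = 2 ** bits - 1;
  return Math.round(Math.min(Math.max(v, 0), 1) * max) / max;
};

const quantizeSnorm = (v: number, bits: number): number => {
  const max = 2 ** (bits - 1) - 1;
  return Math.round(Math.min(Math.max(v, -1), 1) * max) / max;
};

const quantizeUint = (v: number, bits: number): number => {
  return Math.min(Math.max(Math.round(v), 0), 2 ** bits - 1);
};
```

For the 2-bit alpha example above, `quantizeUnorm(v, 2)` snaps `v` to one of the 4 representable values (0, 1/3, 2/3, 1).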
Commenting out the quantizing step takes it from 1000 → 440. That effectively leaves the rest of the time in `fromTexelsAsColors`.
Ideas:
- For snorm/unorm/sint/uint formats we can just use random binary data. All bit patterns are valid values, so there's no reason to do the quantization. We can use `TexelView.fromTextureDataByReference` to make the TexelViews for the software renderer.
- For formats that do need quantization (f16, ufloat, ...), it might be faster to just put the values in and then read them back from the GPU?
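The random-binary-data idea could look something like the sketch below. The PRNG here is a stand-in for whatever hash the CTS uses; only the `fromTextureDataByReference` name comes from the notes above.

```typescript
// Fill a texture's backing store with deterministic pseudo-random bytes.
// For snorm/unorm/uint/sint formats every bit pattern decodes to a valid
// value, so no quantization pass is needed: the same bytes can back both
// the GPU texture upload and the software renderer's TexelView.
function randomTexelData(byteLength: number, seed: number): Uint8Array {
  const data = new Uint8Array(byteLength);
  let h = (seed >>> 0) || 1; // xorshift32 state must be non-zero
  for (let i = 0; i < byteLength; ++i) {
    h ^= h << 13; h >>>= 0;
    h ^= h >>> 17;
    h ^= h << 5; h >>>= 0;
    data[i] = h & 0xff;
  }
  return data;
}
```

The idea would be to hand the same buffer to the texture upload and to `TexelView.fromTextureDataByReference`, so the software renderer reads exactly the bytes the GPU sees, with no per-texel temporaries at all.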