OpenCOLLADA
vertex colors are not exported
They are exported by the Autodesk 3ds Max COLLADA exporter, but not at all by OpenCOLLADA.
We would strongly prefer option 1. Under option 2, we are forced to change our implementation to calculate values at a higher precision. This leads to a loss of performance, which defeats the purpose of using half-floats in the first place. When an application chooses to use fp16, it should also be prepared to handle/accept any underflow/overflow issues from the lower precision level.
FWIW, we currently implement dot with pure fp16:
https://github.com/intel/intel-graphics-compiler/blob/master/IGC/BiFModule/Implementation/Geometric/dot.cl#L69
We implement cross though by upconverting to fp32 - not sure why:
https://github.com/intel/intel-graphics-compiler/blob/master/IGC/BiFModule/Implementation/Geometric/cross.cl#L53
I agree it would be nice to implement these functions with fp16 to improve performance.
The one part I don't quite understand just yet is how we're passing the new CTS tests when we implement dot with fp16. We are implementing this built-in with a sequence of fmas, rather than multiplies and adds. Could that have something to do with it?
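One mechanism by which an fma chain could behave differently from separate multiplies and adds is sketched below. This is a host-side emulation in Python, not OpenCL C, and it assumes round-to-nearest fp16 with overflow to infinity. The helper names (to_half, half_mul, half_add, half_fma, dot2_mul_add, dot2_fma) and the input values are hypothetical and chosen only for illustration. It does not show what the CTS actually exercises.

```python
import math
import struct

def to_half(x: float) -> float:
    """Round a Python float to the nearest fp16 value (overflow becomes +/-inf)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:  # struct refuses finite values beyond the fp16 range
        return math.copysign(math.inf, x)

def half_mul(a: float, b: float) -> float:
    return to_half(a * b)

def half_add(a: float, b: float) -> float:
    return to_half(a + b)

def half_fma(a: float, b: float, c: float) -> float:
    # Emulated fma: the product is not rounded to fp16 before the add.
    # The double-precision arithmetic is exact for the values used below.
    return to_half(a * b + c)

def dot2_mul_add(a, b):
    # Each product is rounded to fp16 first, so a large product becomes inf.
    return half_add(half_mul(a[0], b[0]), half_mul(a[1], b[1]))

def dot2_fma(a, b):
    # The second product only exists inside the fma and is never rounded alone.
    return half_fma(a[1], b[1], half_mul(a[0], b[0]))

a = (256.0, 256.0)
b = (-234.0, 273.25)
print(dot2_mul_add(a, b))  # inf
print(dot2_fma(a, b))      # 10048.0
```

Here 256 * 273.25 = 69952 exceeds the fp16 maximum of 65504, so the multiply-and-add version overflows, while the fma version cancels against -59904 before rounding and stays finite.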
If I get a chance I'll try a pure fp16 cross also to see what happens.
I checked with a pure fp16 cross and we fail the new test with this implementation also:
half3 __attribute__((overloadable)) my_cross(half3 p0, half3 p1) {
    half3 result;
    result.x = fma(p0.y, p1.z, -p0.z * p1.y);
    result.y = fma(p0.z, p1.x, -p0.x * p1.z);
    result.z = fma(p0.x, p1.y, -p0.y * p1.x);
    return result;
}
The failure is (I haven't dug deeper):
ERROR: Data sample 124 does not validate! Expected (inf,0x1.bdcp+15,inf,0x0p+0), got (0x1.bdcp+15,inf,0x0p+0,inf)
Input: (0x1.07p+8 0x1.16cp+7 -0x1.0a4p+8) and (-0x1.e44p+8 0x1.dap+7 0x1.114p+8)
Errors: (nan out of 0x1.b57e58p+7), (inf out of 0x1.5780c6p+9), (nan out of 0x1.5780c6p+9)
ulp -3831.000000
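A rough host-side emulation of this sample, in Python rather than OpenCL C, suggests one possible cause. It assumes round-to-nearest fp16 with overflow to infinity, and it uses Python doubles as a stand-in for the fp32 upconversion path. The helper names (to_half, cross_fp16, cross_wide) are hypothetical. It makes no attempt to reproduce the exact "got" tuple printed by the test. It only shows that, for these inputs, the separately rounded product in the y component overflows fp16 even though the final y value is representable.

```python
import math
import struct

def to_half(x: float) -> float:
    """Round a Python float to the nearest fp16 value (overflow becomes +/-inf)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:  # struct refuses finite values beyond the fp16 range
        return math.copysign(math.inf, x)

def cross_fp16(p0, p1):
    # Same shape as my_cross: the negated product is rounded to fp16 on its
    # own, then fed into an emulated fma that rounds once at the end.
    def term(a, b, c, d):
        return to_half(a * b + to_half(-c * d))
    return (term(p0[1], p1[2], p0[2], p1[1]),
            term(p0[2], p1[0], p0[0], p1[2]),
            term(p0[0], p1[1], p0[1], p1[0]))

def cross_wide(p0, p1):
    # Stand-in for upconverting: compute in higher precision, round once to fp16.
    def term(a, b, c, d):
        return to_half(a * b - c * d)
    return (term(p0[1], p1[2], p0[2], p1[1]),
            term(p0[2], p1[0], p0[0], p1[2]),
            term(p0[0], p1[1], p0[1], p1[0]))

# Inputs copied from the failing sample above.
p0 = tuple(float.fromhex(s) for s in ("0x1.07p+8", "0x1.16cp+7", "-0x1.0a4p+8"))
p1 = tuple(float.fromhex(s) for s in ("-0x1.e44p+8", "0x1.dap+7", "0x1.114p+8"))

print(cross_fp16(p0, p1))  # (inf, -inf, inf)
print(cross_wide(p0, p1))  # (inf, 57056.0, inf)
```

The y component from the wider path, 57056.0, equals 0x1.bdcp+15, which matches the expected value in the log. In the fp16 path the product p0.x * p1.z is about 71864.75, which is already beyond the fp16 maximum of 65504, so it becomes infinite before the fma can cancel it.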
I'll comment on the CTS PR also. It would be great to get this clarified in the spec (and then fix the test to match).
Do we have agreement that it is conformant to implement dot and cross products with fp16 precision?