Potential Regression. Calling EncodeMeshToDracoBuffer causes Bus Error.
Morning!
This week someone uploaded a specific glTF that seems to cause a bus error when going through EncodeMeshToDracoBuffer
(using gltf-pipeline -- https://github.com/CesiumGS/gltf-pipeline/issues/608). The specific line is https://github.com/CesiumGS/gltf-pipeline/blob/901c94f360d60382dfbc8612c12130bc4992f10c/lib/compressDracoMeshes.js#L264 .
What's even worse is that listening for this signal (`process.on('SIGBUS', ...)`) causes the whole process to hang -- I still don't know why.
Here's some minimal JS code that reproduces the issue. Remove the `process.on('SIGBUS', ...)` handler to see the Node process crash and show the bus error.
```js
const fsExtra = require("fs-extra");
const processGltf = require("./lib/processGltf");

const gltf = fsExtra.readJsonSync("./broken.gltf");
const options = {
  dracoOptions: {
    compressionLevel: 10,
  },
};

processGltf(gltf, options).then(function (results) {
  fsExtra.writeJsonSync("model-draco.gltf", results.gltf);
}, function (error) {
  console.log(error);
});

// Not having this handler makes the process crash with the bus error.
process.on('SIGBUS', (signal) => {
  console.log('here', signal);
});
```
Running the above script with `node --report-signal=SIGBUS --report-on-signal test.js`
also makes it hang (and doesn't create a stack trace). The same goes for CPU profiling, etc. -- they all hang.
If I use draco 1.3.6, from before the NODEJS_CATCH_EXIT and NODEJS_CATCH_REJECTION flags were disabled, I get the error below instead (rather than a bus error). Could it be a side effect of https://github.com/google/draco/issues/629 ?
```
RangeError: Maximum call stack size exceeded
RangeError: Maximum call stack size exceeded
RangeError: Maximum call stack size exceeded
RangeError: Maximum call stack size exceeded
RangeError: Maximum call stack size exceeded
RangeError: Maximum call stack size exceeded
RangeError: Maximum call stack size exceeded
    at deferUnhandledRejectionCheck (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/debuggability.js:50:9)
    at Promise._ensurePossibleRejectionHandled (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/debuggability.js:70:5)
    at Promise._reject (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/promise.js:694:14)
    at Promise._rejectCallback (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/promise.js:509:10)
    at doThenable (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/thenables.js:67:17)
    at tryConvertToPromise (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/thenables.js:28:20)
    at Promise._resolveCallback (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/promise.js:465:24)
    at resolve (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/thenables.js:73:17)
    at Object.Module.then (/Users/luis/workspace/Git/deteleTarget/node_modules/draco3d/draco_encoder_nodejs.js:39:40258)
    at Object.tryCatcher (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/util.js:16:23)
    at doThenable (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/thenables.js:63:38)
    at tryConvertToPromise (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/thenables.js:28:20)
    at Promise._resolveCallback (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/promise.js:465:24)
    at resolve (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/thenables.js:73:17)
    at Object.Module.then (/Users/luis/workspace/Git/deteleTarget/node_modules/draco3d/draco_encoder_nodejs.js:39:40258)
    at Object.tryCatcher (/Users/luis/workspace/Git/deteleTarget/node_modules/bluebird/js/release/util.js:16:23)
```
It might be worth mentioning that the input model is in fact invalid, because it contains an accessor with NaN
values. So an error of some kind has to be expected at some point (but maybe the exact error-handling mechanisms have to be reviewed or changed, to prevent certain kinds of errors or crashes...)
A normal error would be fine and expected. But causing a bus error (since 1.3.6), and the fact that listening for that termination signal (SIGBUS) makes everything hang, is worth some concern -- regardless of whether the model is perfectly valid or not.
But yeah, as javagl mentioned, it does contain NaN values. Simply replacing them with 0 makes the problem go away. Any chance we can handle this issue in draco directly?
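The workaround described above (replacing the NaN values with 0 before encoding) can be sketched as follows. This is not part of gltf-pipeline or draco3d; `sanitizePositions` is a hypothetical helper run over an accessor's typed-array data before it is handed to the Draco encoder:

```javascript
// Hypothetical helper: replace non-finite values (NaN, +/-Infinity) in a
// typed array with 0, returning how many values were replaced.
function sanitizePositions(array) {
  let replaced = 0;
  for (let i = 0; i < array.length; i++) {
    if (!Number.isFinite(array[i])) {
      array[i] = 0;
      replaced++;
    }
  }
  return replaced;
}

// Example: a vertex-position array containing invalid values.
const positions = new Float32Array([1.0, NaN, 2.0, Infinity, 3.0]);
const count = sanitizePositions(positions);
console.log(count); // 2 values replaced in place
```

Whether 0 is an acceptable substitute depends on the model, of course; the safer long-term fix is rejecting such data with a regular error.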
Related:
- https://github.com/donmccurdy/glTF-Transform/issues/928
On some platforms this error surfaces as a "memory access out of bounds" error (included here in case others are searching for that term).
Unfortunately my hunch would be that this error occurs before the data ever reaches the Draco codebase, somewhere in the WASM API and/or generated bindings. Nicer handling of the error may be difficult. Tools like glTF Validator can catch the invalid data before processing a glTF file.
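Short of running the full glTF Validator, a caller can also fail fast with an ordinary, catchable error before the data reaches the WASM encoder. A minimal sketch (the `hasNonFinite` helper is an assumption, not an existing API):

```javascript
// Hypothetical pre-check: scan a typed array for values that are not
// finite numbers (NaN, +/-Infinity), which Draco's WASM bindings may
// otherwise crash on.
function hasNonFinite(typedArray) {
  for (const v of typedArray) {
    if (!Number.isFinite(v)) {
      return true;
    }
  }
  return false;
}

const good = new Float32Array([0.0, 1.0, 2.0]);
const bad = new Float32Array([0.0, NaN, 2.0]);
console.log(hasNonFinite(good)); // false
console.log(hasNonFinite(bad));  // true

// A pipeline could then throw a regular Error instead of hitting SIGBUS:
if (hasNonFinite(bad)) {
  // throw new Error("Accessor contains non-finite values");
}
```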
Seems similar indeed. For now, yeah, glTF Validator seems the best option. I even created https://github.com/KhronosGroup/glTF-Validator/pull/205 specifically for this.
It could also be a side effect of, or related to, https://github.com/google/draco/issues/629 as mentioned above.