NVIDIA: Check BOM of USDA files and report errors if found

Open roggiezhang-nv opened this issue 6 months ago • 5 comments

Description of Change(s)

When a usda file is UTF-8 encoded with a BOM signature, UsdaFileFormat's _CanReadImpl, which is invoked before the parser is spun up, fails to recognize it as a valid USDA asset. Re-encoding the file without the BOM signature solves the issue.

Spiff suggested not supporting assets with a BOM signature and instead improving the error message from the pre-parser code. This PR implements that suggestion.
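
For illustration only, here is a minimal Python sketch of why a prefix check rejects a BOM-prefixed file and why re-encoding without the BOM fixes it. It is not the OpenUSD code: `looks_like_usda` and `strip_utf8_bom` are hypothetical helpers, and the `#usda 1.0` cookie is assumed to be the prefix the pre-parser check looks for.

```python
import codecs

# Assumption: the pre-parser check looks for this magic cookie at byte 0.
USDA_COOKIE = b"#usda 1.0"


def looks_like_usda(data: bytes) -> bool:
    """Naive prefix check, standing in for the pre-parser test (hypothetical)."""
    return data.startswith(USDA_COOKIE)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


plain = b'#usda 1.0\ndef Xform "root" {}\n'
with_bom = codecs.BOM_UTF8 + plain  # same content, saved "UTF-8 with BOM"
```

With the BOM in front, the cookie starts at byte 3 rather than byte 0, so the prefix check fails; after `strip_utf8_bom` the same bytes pass.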

Link to proposal (if applicable)

Fixes Issue(s)

https://github.com/PixarAnimationStudios/OpenUSD/issues/3746

Checklist

roggiezhang-nv avatar Aug 29 '25 07:08 roggiezhang-nv

@nvmkuruc for vis.

roggiezhang-nv avatar Aug 29 '25 07:08 roggiezhang-nv

I'm rather surprised that UTF-16 and UTF-32 are supported. Is this necessary, or could they be deprecated and dropped as well?

spitzak avatar Aug 29 '25 15:08 spitzak

UTF-16 and UTF-32 aren't supported. If those BOMs are detected, the error message recommends conversion to UTF-8. Any suggestions on how to clarify that?
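
As a rough sketch of the behavior described here (detect the BOM, then recommend conversion to UTF-8), the Python below classifies a file prefix by its BOM. The function names and the message wording are hypothetical and are not taken from the PR.

```python
import codecs
from typing import Optional

# Order matters: the UTF-32 LE BOM (FF FE 00 00) begins with the
# UTF-16 LE BOM (FF FE), so the UTF-32 entries are tested first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(prefix: bytes) -> Optional[str]:
    """Return the name of the BOM at the start of prefix, or None."""
    for bom, name in _BOMS:
        if prefix.startswith(bom):
            return name
    return None


def bom_error_message(prefix: bytes) -> Optional[str]:
    """Build a hypothetical error message for a BOM-prefixed file."""
    name = detect_bom(prefix)
    if name is None:
        return None
    if name == "UTF-8":
        return ("File starts with a UTF-8 BOM, which is not supported; "
                "re-save it as UTF-8 without a BOM.")
    return (f"File starts with a {name} BOM; {name} is not supported, "
            "convert the file to UTF-8 without a BOM.")
```

Naming the detected encoding and stating outright that it is unsupported may make it less likely that the message is read as UTF-16 or UTF-32 support.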

nvmkuruc avatar Aug 29 '25 15:08 nvmkuruc

Filed as internal issue #USD-11395

(This is an automated message. See here for more information.)

jesschimein avatar Sep 02 '25 18:09 jesschimein

@roggiezhang-nv, I think we may need to revise the proposed implementation. It's possible that CheckBOM will now generate runtime errors during asset read that were previously captured in _CanReadImpl. It might be less invasive to make the BOM check a warning that gets emitted by _CanReadImpl when the file prefix is the BOM marker followed by the correct magic cookie.
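
One possible reading of this suggestion, sketched in Python rather than in the actual C++ of _CanReadImpl: the can-read check keeps its existing true/false behavior and only adds a warning in the specific case of a UTF-8 BOM followed by the magic cookie. The `can_read` function, the `#usda 1.0` cookie value, and the warning text are assumptions for illustration.

```python
import codecs
import warnings

# Assumption: the magic cookie the can-read check expects at byte 0.
USDA_COOKIE = b"#usda 1.0"


def can_read(prefix: bytes) -> bool:
    """Hypothetical stand-in for a can-read check with a BOM warning."""
    if prefix.startswith(USDA_COOKIE):
        return True
    # BOM + correct cookie: still reported as unreadable, but the user is
    # told why, instead of failing with a generic message later on.
    if prefix.startswith(codecs.BOM_UTF8 + USDA_COOKIE):
        warnings.warn(
            "File looks like USDA but starts with a UTF-8 BOM, which is "
            "not supported; re-save it as UTF-8 without a BOM.",
            stacklevel=2,
        )
    return False
```

Because the return value is unchanged for every input, files that were rejected before are still rejected at the same point, and no new error path is introduced during asset read.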

nvmkuruc avatar Sep 30 '25 21:09 nvmkuruc