NVIDIA: Check BOM of USDA files and report errors if found
Description of Change(s)
When a usda file is UTF-8 encoded with the BOM signature, the code in the UsdaFileFormat's _CanReadImpl that is invoked prior to spinning up the parser will fail to detect it as a valid USDA asset. Re-encoding the file without the BOM signature solves the issue.
Spiff suggested not supporting assets with a BOM signature and instead improving the error message from the pre-parser code. This PR implements that suggestion.
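For reference, the re-encoding workaround mentioned above can be sketched in a few lines of Python. This is only an illustration and not part of this PR; `strip_utf8_bom` is a hypothetical helper name, and the sketch assumes the whole file fits in memory.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def reencode_without_bom(path: str) -> None:
    """Rewrite the file at path in place without the UTF-8 BOM (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()
    stripped = strip_utf8_bom(data)
    if stripped is not data:
        # Only touch the file when a BOM was actually removed.
        with open(path, "wb") as f:
            f.write(stripped)
```

Working on raw bytes avoids any text decoding, so the rest of the file content is left untouched.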
Link to proposal (if applicable)
Fixes Issue(s)
https://github.com/PixarAnimationStudios/OpenUSD/issues/3746
Checklist
- [x] I have created this PR based on the dev branch
- [x] I have followed the coding conventions
- [x] I have added unit tests that exercise this functionality (Reference: testing guidelines)
- [x] I have verified that all unit tests pass with the proposed changes
- [x] I have submitted a signed Contributor License Agreement (Reference: Contributor License Agreement instructions)
@nvmkuruc for vis.
I'm rather surprised that UTF-16 and UTF-32 are supported. Is this necessary, or could they be deprecated and dropped as well?
UTF-16 and UTF-32 aren't supported. If those BOMs are detected, the error message recommends conversion to UTF-8. Any suggestions on how to clarify that?
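To make the distinction concrete, here is a rough Python sketch of that kind of BOM classification. It is not the code in this PR: `detect_bom`, `bom_error_message`, and the message wording are hypothetical. The one detail worth noting is ordering, since the UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE) and must be tested first.

```python
import codecs
from typing import Optional

# UTF-32 entries come before UTF-16 because BOM_UTF32_LE starts with BOM_UTF16_LE.
_BOMS = (
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
)


def detect_bom(prefix: bytes) -> Optional[str]:
    """Return the name of the BOM at the start of prefix, or None if there is none."""
    for bom, name in _BOMS:
        if prefix.startswith(bom):
            return name
    return None


def bom_error_message(prefix: bytes) -> Optional[str]:
    """Build an illustrative error message; no encoding with a BOM is accepted."""
    name = detect_bom(prefix)
    if name is None:
        return None
    if name == "UTF-8":
        return "File has a UTF-8 BOM; re-encode it as UTF-8 without a BOM."
    return f"File has a {name} BOM; convert it to UTF-8 without a BOM."
```

In this sketch every detected BOM produces an error, which matches the intent that none of these encodings is supported; only the recommended fix differs.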
@roggiezhang-nv, I think we may need to revise the proposed implementation. It's possible that CheckBOM will now generate runtime errors during asset read that were previously captured in _CanReadImpl. It might be less invasive to make the BOM check a warning that gets emitted by _CanReadImpl when the file prefix is the BOM marker followed by the correct magic cookie.
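A minimal Python sketch of that alternative, to illustrate the control flow only. The names `can_read` and `MAGIC_COOKIE` are hypothetical, the cookie value `#usda 1.0` is an assumption about the USDA header, and returning False for the BOM case is also an assumption (in line with not supporting BOM-prefixed assets); the real _CanReadImpl is C++ and may differ.

```python
import codecs
from typing import Optional, Tuple

# Assumed magic cookie at the start of a USDA file; adjust if the real one differs.
MAGIC_COOKIE = b"#usda 1.0"


def can_read(prefix: bytes) -> Tuple[bool, Optional[str]]:
    """Return (readable, warning) for the first bytes of a candidate file."""
    if prefix.startswith(MAGIC_COOKIE):
        return True, None
    if prefix.startswith(codecs.BOM_UTF8 + MAGIC_COOKIE):
        # The file would be valid USDA if not for the BOM: warn rather than fail later.
        return False, (
            "File starts with a UTF-8 BOM followed by the USDA magic cookie; "
            "re-encode it without the BOM."
        )
    # Anything else is simply not a USDA asset, so stay silent.
    return False, None
```

With this shape, files that were never USDA produce no new diagnostics, and only the BOM-plus-cookie case surfaces a message.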