Enforce UTF-8 restriction on UTF8_* data types

Open c-suh opened this issue 3 years ago • 3 comments

Issue discovered by NSSDCA when a submitted bundle failed processing. Details here


Expected

Per https://pds.nasa.gov/datastandards/documents/im/current/index_1F00.html, we need to enforce UTF-8 encoding for fields that specify one of the UTF8_* data types:

  • UTF8_Short_String_Collapsed
  • UTF8_Short_String_Preserved
  • UTF8_String
  • UTF8_Text_Collapsed
  • UTF8_Text_Preserved

Test data

bundle_kaguya_derived_HasSmartQuotes.xml.txt

This product fails because it uses smart quotes in the description. The command below prints each line that contains a non-ASCII byte, prefixed with its line number:

$ perl -ne 'print "$. $_" if m/[\x80-\xFF]/' /Users/jpadams/test/utf8_issue/bundle_kaguya_derived_HasSmartQuotes.xml
22           The SELENE mission was the Japanese mission to the Moon with a main orbiter “Kaguya” and
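The Perl one-liner above flags any byte in the range 0x80-0xFF, which also matches correctly encoded multi-byte UTF-8 characters. A stricter check is to attempt a strict UTF-8 decode of each line. The following is a minimal Python sketch of that idea, not the validate tool's implementation; the function name `find_invalid_utf8_lines` is hypothetical, and the sample assumes (without confirmation from the test file) that the smart quotes were saved in a single-byte encoding such as Windows-1252.

```python
from typing import List, Tuple


def find_invalid_utf8_lines(data: bytes) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for every line that is not valid UTF-8.

    Hypothetical helper for illustration only.
    """
    problems = []
    # Splitting on b"\n" is safe: byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")  # strict error handling is the default
        except UnicodeDecodeError as exc:
            bad = raw[exc.start]
            problems.append(
                (lineno, f"byte 0x{bad:02X} at column {exc.start + 1}: {exc.reason}")
            )
    return problems


if __name__ == "__main__":
    text = "main orbiter \u201cKaguya\u201d and"
    # Assumption: the same text saved once as UTF-8 and once as Windows-1252.
    sample = text.encode("utf-8") + b"\n" + text.encode("cp1252") + b"\n"
    for lineno, reason in find_invalid_utf8_lines(sample):
        print(f"line {lineno}: {reason}")
```

With this approach, smart quotes that are properly encoded as UTF-8 pass the check, while the same characters stored as single bytes are reported with their line number.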

c-suh, Mar 05 '21 16:03