UTF-8 en-dash used in ECM definition file names

Open dewittpe opened this issue 3 years ago • 4 comments

There are several ECM definition files with a UTF-8 en-dash in the file name. This is a problem because the human-readable file name is not what an end user will type. For example, below I try to show the top six lines of a file via head. When I type out the name explicitly, I get an error that the file does not exist. When I use tab completion, the file is found. The difference is the en-dash between "DF" and "FS".

[Screenshot: head fails with the typed file name and succeeds with the tab-completed one]

I'll work on getting a full list of the files with UTF-8 characters in the file name.
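To make the mismatch concrete, here is a minimal Python sketch that compares a typed name with the name as stored and lists the non-ASCII characters it finds. The helper name `non_ascii_chars` is made up for illustration; the file name is one of those reported in this issue.

```python
import unicodedata


def non_ascii_chars(name):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in name
        if not ch.isascii()
    ]


# What a user types: the hyphen-minus key on the keyboard.
typed = "Best Com. ASHP, Env., PC (EE+DF-FS) CC.json"
# What is stored: the same name with an en-dash (U+2013).
on_disk = "Best Com. ASHP, Env., PC (EE+DF\u2013FS) CC.json"

print(typed == on_disk)          # False
print(non_ascii_chars(typed))    # []
print(non_ascii_chars(on_disk))  # [('–', 'U+2013', 'EN DASH')]
```

The two names look nearly identical on screen but are different strings, which is why the typed path is not found.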

dewittpe, Sep 08 '22 17:09

These are all the files with non-ASCII characters in the file name:

peterdewitt@D2V-MPB2 [~/NREL/scout] ((v0.8))
$ find . -name '*[! -~]*'
./ecm_definitions/Prospective Com. Ctls. (ASHP–LFL) CC.json
./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–RST) CC.json
./ecm_definitions/Prospective Commercial Ctls. (ASHP–RST).json
./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–FS) CC.json
./ecm_definitions/Prospective Residential Ctls. (ASHP–RST).json
./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–LFL).json
./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–FS).json
./ecm_definitions/Prospective Com. Ctls. (ASHP–RST) CC.json
./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–LFL) CC.json
./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–LFL).json
./ecm_definitions/Prospective Residential Ctls. (ASHP–LFL).json
./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–LFL) CC.json
./ecm_definitions/Prospective Res. Ctls. (ASHP–RST) CC.json
./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–RST).json
./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–FS).json
./ecm_definitions/Prospective Commercial Ctls. (ASHP–LFL).json
./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–FS) CC.json
./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–RST).json
./ecm_definitions/Prospective Res. Ctls. (ASHP–LFL) CC.json
./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–RST) CC.json
./ecm_definitions/Best Res. HPWH (EE+DF–FS).json
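For anyone without find at hand, the same search can be sketched in Python. This mirrors the intent of the pattern above (flag any base name containing a character outside the printable ASCII range from space to tilde); the function names are illustrative and not part of Scout.

```python
from pathlib import Path


def has_non_printable_ascii(name):
    """True if `name` contains any character outside ' ' through '~'."""
    return any(not (" " <= ch <= "~") for ch in name)


def flag_names(paths):
    """Keep the paths whose base name contains a flagged character."""
    return [str(p) for p in paths if has_non_printable_ascii(Path(p).name)]


def find_flagged(root="."):
    """Walk `root` recursively and return the flagged paths, sorted."""
    return sorted(flag_names(Path(root).rglob("*")))
```

Calling `find_flagged(".")` from the repository root should produce a list comparable to the one above, although the ordering will differ from find's.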

dewittpe, Sep 08 '22 17:09

This is of concern because the "name" fields inside the files also contain the UTF-8 encoded en-dash. That will likely cause unexpected errors down the road when someone types an ordinary ASCII hyphen from the keyboard and the string matching fails.
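A small Python sketch of that failure mode, plus one way a lookup could report it more helpfully. The `lookup` and `fold_dashes` helpers are hypothetical and only illustrate the idea; they are not how Scout matches names.

```python
# Map en-dash (U+2013) and em-dash (U+2014) to the ASCII hyphen for comparison only.
DASH_TO_HYPHEN = str.maketrans({"\u2013": "-", "\u2014": "-"})


def fold_dashes(text):
    """Return `text` with en-dashes and em-dashes folded to '-'."""
    return text.translate(DASH_TO_HYPHEN)


def lookup(known_names, typed):
    """Return the stored name equal to `typed`, or raise KeyError with a hint."""
    if typed in known_names:
        return typed
    near = [n for n in known_names if fold_dashes(n) == fold_dashes(typed)]
    if near:
        raise KeyError(
            f"{typed!r} not found; differs only in dash characters from {near[0]!r}"
        )
    raise KeyError(f"{typed!r} not found")
```

With a stored name containing an en-dash, `lookup(names, "Prospective Res. Ctls. (ASHP-RST) CC")` raises a KeyError that points at the near match instead of failing silently.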

dewittpe, Sep 08 '22 17:09

@dewittpe what's the suggested fix here?

jtlangevin, Oct 12 '22 10:10

The solution is to replace the non-ASCII characters in the file names and in the file contents with ASCII characters.
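As a rough sketch of that replacement in Python: the mapping below turns both the en-dash and the em-dash into a plain hyphen, which is an assumption on my part; free-text fields such as descriptions may deserve a different substitution. The function names are illustrative.

```python
from pathlib import Path

# Assumption: en-dash (U+2013) and em-dash (U+2014) both become an ASCII hyphen.
REPLACEMENTS = str.maketrans({"\u2013": "-", "\u2014": "-"})


def asciify(text):
    """Return `text` with the mapped dashes replaced by '-'."""
    return text.translate(REPLACEMENTS)


def fix_json_file(path):
    """Rewrite the file contents, then rename the file; return the new path."""
    path = Path(path)
    # Work on bytes so existing line endings are left untouched.
    original = path.read_bytes().decode("utf-8")
    cleaned = asciify(original)
    if cleaned != original:
        path.write_bytes(cleaned.encode("utf-8"))
    target = path.with_name(asciify(path.name))
    if target != path:
        if target.exists():
            raise FileExistsError(target)  # refuse to overwrite an existing file
        path.rename(target)
    return target


def fix_tree(root):
    """Apply fix_json_file to every .json file under `root`."""
    # Build the full list first so renames do not disturb the directory walk.
    return [fix_json_file(p) for p in sorted(Path(root).rglob("*.json"))]
```

Inside a git checkout it is probably cleaner to do the renames with git mv so that history follows the files; the sketch only shows the substitution logic.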

Additionally, whatever process, IDE, or person is responsible for introducing the non-ASCII en-dash into the file names and file contents needs to be identified and corrected so that this doesn't happen again when other files are updated.
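One way to keep this from recurring, regardless of where the characters come from, would be an automated check that fails when a file name or a line of content is not pure ASCII. A minimal Python sketch follows; the function names and the idea of wiring it into a test or commit hook are suggestions, not something the project already has.

```python
from pathlib import Path


def non_ascii_lines(text):
    """Return (line_number, line) pairs for lines containing non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(text.split("\n"), start=1)
        if not line.isascii()
    ]


def check(paths):
    """Return a list of problem descriptions; an empty list means all clean."""
    problems = []
    for path in map(Path, paths):
        if not path.name.isascii():
            problems.append(f"{path}: non-ASCII character in file name")
        for number, line in non_ascii_lines(path.read_text(encoding="utf-8")):
            problems.append(f"{path}:{number}: {line.strip()}")
    return problems


def main(argv):
    """Print any problems and return a shell-style exit status."""
    problems = check(argv)
    if problems:
        print("\n".join(problems))
    return 1 if problems else 0
```

A hook or CI step could call `sys.exit(main(sys.argv[1:]))` with the changed files as arguments, so a stray en-dash is caught before it is merged.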

A snippet of R code is provided below to make it easy to find non-ASCII characters in files.

There are non-ASCII characters in other files, too, which is a good example of why this is important to fix. One of the affected files is ./ecm_definitions/package_ecms.json, where several of the ECMs are listed with the non-ASCII en-dash. This works in the current configuration, but if an end user were to type out these names in package_ecms.json with a keyboard hyphen, there would be an issue because the files would not be found.
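To illustrate the package_ecms.json concern, here is a hypothetical Python check. It assumes each listed ECM name must resolve to a file called `<name>.json`; how Scout actually resolves these entries is not shown in this thread, so treat the function as a sketch of the idea only.

```python
def unresolved_names(listed_names, json_file_names):
    """Return the listed names that have no matching `<name>.json` file name."""
    suffix = ".json"
    stems = {f[: -len(suffix)] for f in json_file_names if f.endswith(suffix)}
    return [name for name in listed_names if name not in stems]


files = ["Prospective Res. Ctls. (ASHP\u2013RST) CC.json"]  # en-dash in the file name

# Entry written with the same en-dash: resolves.
print(unresolved_names(["Prospective Res. Ctls. (ASHP\u2013RST) CC"], files))  # []
# Entry typed with a keyboard hyphen: does not resolve.
print(unresolved_names(["Prospective Res. Ctls. (ASHP-RST) CC"], files))
```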

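# Collect every JSON file under the current directory.
x <- list.files(".", pattern = "\\.json$", full.names = TRUE, recursive = TRUE)
# showNonASCIIfile() prints each line with non-ASCII bytes and returns those lines invisibly.
x <- setNames(lapply(x, tools::showNonASCIIfile), x)

# Keep only the files in which something was found.
x[sapply(x, length) > 0]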

With the output being:

> x <- list.files(".", pattern = "\\.json$", full.names = TRUE, recursive = TRUE)
> x <- setNames(lapply(x, tools::showNonASCIIfile), x)
2:   "name": "Best Com. ASHP, Env., PC (EE+DF<e2><80><93>FS) CC",
2:   "name": "Best Com. ASHP, Env., PC (EE+DF<e2><80><93>FS)",
2:   "name": "Best Com. ASHP, Env., PC (EE+DF<e2><80><93>LFL) CC",
2:   "name": "Best Com. ASHP, Env., PC (EE+DF<e2><80><93>LFL)",
2:   "name": "Best Com. ASHP, Env., PC (EE+DF<e2><80><93>RST) CC",
2:   "name": "Best Com. ASHP, Env., PC (EE+DF<e2><80><93>RST)",
2:   "name": "Best Res. ASHP, Env., PC (EE+DF<e2><80><93>FS) CC",
2:   "name": "Best Res. ASHP, Env., PC (EE+DF<e2><80><93>FS)",
2:   "name": "Best Res. ASHP, Env., PC (EE+DF<e2><80><93>LFL) CC",
2:   "name": "Best Res. ASHP, Env., PC (EE+DF<e2><80><93>LFL)",
2:   "name": "Best Res. ASHP, Env., PC (EE+DF<e2><80><93>RST) CC",
2:   "name": "Best Res. ASHP, Env., PC (EE+DF<e2><80><93>RST)",
2:   "name": "Best Res. HPWH (EE+DF<e2><80><93>FS)",
8:   "_description": "Switch to best available HPWH with grid<e2><80><93>responsive controls",
33:       "title": "Standard 90.1-2013 <e2><80><93> Energy Standard for Buildings Except Low-Rise Residential Buildings",
40:       "title": "Standard 90.1-2019 <e2><80><93> Energy Standard for Buildings Except Low-Rise Residential Buildings",
103:     "Prospective Residential Ctls. (ASHP<e2><80><93>LFL)",
114:     "Prospective Residential Ctls. (ASHP<e2><80><93>RST)",
125:     "Prospective Res. Ctls. (ASHP<e2><80><93>LFL) CC",
136:     "Prospective Res. Ctls. (ASHP<e2><80><93>RST) CC",
250:     "Prospective Commercial ASHP (LFL)", "Prospective Commercial Ctls. (ASHP<e2><80><93>LFL)",
260:     "Prospective Commercial ASHP (RST)", "Prospective Commercial Ctls. (ASHP<e2><80><93>RST)",
271:     "Prospective Com. Ctls. (ASHP<e2><80><93>LFL) CC",
282:     "Prospective Com. Ctls. (ASHP<e2><80><93>RST) CC",
2:   "name": "Prospective Com. Ctls. (ASHP<e2><80><93>LFL) CC",
2:   "name": "Prospective Com. Ctls. (ASHP<e2><80><93>RST) CC",
2:   "name": "Prospective Commercial Ctls. (ASHP<e2><80><93>LFL)",
2:   "name": "Prospective Commercial Ctls. (ASHP<e2><80><93>RST)",
2:   "name": "Prospective Res. Ctls. (ASHP<e2><80><93>LFL) CC",
2:   "name": "Prospective Res. Ctls. (ASHP<e2><80><93>RST) CC",
2:   "name": "Prospective Residential Ctls. (ASHP<e2><80><93>LFL)",
2:   "name": "Prospective Residential Ctls. (ASHP<e2><80><93>RST)",
86:     "cooling": "Applies to all homes with cooling (using the indicated technologies<e2><80><94>excludes room ACs)",
> 
> x[sapply(x, length) > 0]
$`./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–FS) CC.json`
[1] "  \"name\": \"Best Com. ASHP, Env., PC (EE+DF–FS) CC\","

$`./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–FS).json`
[1] "  \"name\": \"Best Com. ASHP, Env., PC (EE+DF–FS)\","

$`./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–LFL) CC.json`
[1] "  \"name\": \"Best Com. ASHP, Env., PC (EE+DF–LFL) CC\","

$`./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–LFL).json`
[1] "  \"name\": \"Best Com. ASHP, Env., PC (EE+DF–LFL)\","

$`./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–RST) CC.json`
[1] "  \"name\": \"Best Com. ASHP, Env., PC (EE+DF–RST) CC\","

$`./ecm_definitions/Best Com. ASHP, Env., PC (EE+DF–RST).json`
[1] "  \"name\": \"Best Com. ASHP, Env., PC (EE+DF–RST)\","

$`./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–FS) CC.json`
[1] "  \"name\": \"Best Res. ASHP, Env., PC (EE+DF–FS) CC\","

$`./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–FS).json`
[1] "  \"name\": \"Best Res. ASHP, Env., PC (EE+DF–FS)\","

$`./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–LFL) CC.json`
[1] "  \"name\": \"Best Res. ASHP, Env., PC (EE+DF–LFL) CC\","

$`./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–LFL).json`
[1] "  \"name\": \"Best Res. ASHP, Env., PC (EE+DF–LFL)\","

$`./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–RST) CC.json`
[1] "  \"name\": \"Best Res. ASHP, Env., PC (EE+DF–RST) CC\","

$`./ecm_definitions/Best Res. ASHP, Env., PC (EE+DF–RST).json`
[1] "  \"name\": \"Best Res. ASHP, Env., PC (EE+DF–RST)\","

$`./ecm_definitions/Best Res. HPWH (EE+DF–FS).json`
[1] "  \"name\": \"Best Res. HPWH (EE+DF–FS)\","                                          
[2] "  \"_description\": \"Switch to best available HPWH with grid–responsive controls\","

$`./ecm_definitions/Commerical Lighting, 90.1 c. 2019.json`
[1] "      \"title\": \"Standard 90.1-2013 – Energy Standard for Buildings Except Low-Rise Residential Buildings\","
[2] "      \"title\": \"Standard 90.1-2019 – Energy Standard for Buildings Except Low-Rise Residential Buildings\","

$`./ecm_definitions/package_ecms.json`
[1] "    \"Prospective Residential Ctls. (ASHP–LFL)\","                                      
[2] "    \"Prospective Residential Ctls. (ASHP–RST)\","                                      
[3] "    \"Prospective Res. Ctls. (ASHP–LFL) CC\","                                          
[4] "    \"Prospective Res. Ctls. (ASHP–RST) CC\","                                          
[5] "    \"Prospective Commercial ASHP (LFL)\", \"Prospective Commercial Ctls. (ASHP–LFL)\","
[6] "    \"Prospective Commercial ASHP (RST)\", \"Prospective Commercial Ctls. (ASHP–RST)\","
[7] "    \"Prospective Com. Ctls. (ASHP–LFL) CC\","                                          
[8] "    \"Prospective Com. Ctls. (ASHP–RST) CC\","                                          

$`./ecm_definitions/Prospective Com. Ctls. (ASHP–LFL) CC.json`
[1] "  \"name\": \"Prospective Com. Ctls. (ASHP–LFL) CC\","

$`./ecm_definitions/Prospective Com. Ctls. (ASHP–RST) CC.json`
[1] "  \"name\": \"Prospective Com. Ctls. (ASHP–RST) CC\","

$`./ecm_definitions/Prospective Commercial Ctls. (ASHP–LFL).json`
[1] "  \"name\": \"Prospective Commercial Ctls. (ASHP–LFL)\","

$`./ecm_definitions/Prospective Commercial Ctls. (ASHP–RST).json`
[1] "  \"name\": \"Prospective Commercial Ctls. (ASHP–RST)\","

$`./ecm_definitions/Prospective Res. Ctls. (ASHP–LFL) CC.json`
[1] "  \"name\": \"Prospective Res. Ctls. (ASHP–LFL) CC\","

$`./ecm_definitions/Prospective Res. Ctls. (ASHP–RST) CC.json`
[1] "  \"name\": \"Prospective Res. Ctls. (ASHP–RST) CC\","

$`./ecm_definitions/Prospective Residential Ctls. (ASHP–LFL).json`
[1] "  \"name\": \"Prospective Residential Ctls. (ASHP–LFL)\","

$`./ecm_definitions/Prospective Residential Ctls. (ASHP–RST).json`
[1] "  \"name\": \"Prospective Residential Ctls. (ASHP–RST)\","

$`./web/ecm_definitions/Prospective Residential Integrated ASHP.json`
[1] "    \"cooling\": \"Applies to all homes with cooling (using the indicated technologies—excludes room ACs)\","
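The `<e2><80><93>` and `<e2><80><94>` sequences in the first part of the output are the raw UTF-8 bytes of the en-dash (U+2013) and em-dash (U+2014). A short Python sketch for turning such escaped lines back into readable text, assuming the `<xx>` notation always denotes hexadecimal byte values (a literal `<e2>` in a file would be misread):

```python
import re

# One or more consecutive <xx> escapes, where xx is a lowercase hex byte.
ESCAPED_BYTES = re.compile(r"(?:<[0-9a-f]{2}>)+")


def decode_escapes(line):
    """Replace runs of <xx> byte escapes with the UTF-8 text they encode."""

    def replace(match):
        hex_digits = match.group(0).replace("<", "").replace(">", "")
        return bytes.fromhex(hex_digits).decode("utf-8", errors="replace")

    return ESCAPED_BYTES.sub(replace, line)


print(decode_escapes('"name": "Best Res. HPWH (EE+DF<e2><80><93>FS)",'))
```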

dewittpe, Oct 12 '22 15:10