request: remove sanitize logic recently introduced
Hi! As part of #233, this piece of logic was introduced:
const nameSanitized = name.replace(/[^a-z0-9]/gi, '_').toLowerCase()
Unfortunately (for me :P ), that messes up with my snapshot filenames. I always give snapshots very descriptive names, like "User with write-access - Goes to menu, clicks button and presses ESC [dark mode (Chrome)]". Up to v5.0.0, that resulted in the same name being preserved. I mean, the file generated by cypress-visual-regression would end up being exactly: User with write-access - Goes to menu, clicks button and presses ESC [dark mode (Chrome)].png. However, because of the change I mentioned above, the filename is now
user_with_write_access___goes_to_menu__clicks_button_and_presses_esc__dark_mode__chrome__.png .
This has two drawbacks IMO:
- makes the filenames harder to read (for people)
- makes the test generating the snapshot quite hard to find. I mean, if I see the new file, and I want to find where in my codebase it's being generated, I have to guess what special chars are being replaced by each
_.
Moreover, the project already has sanitize-filename as a dependency, and it seems to be already using it to sanitize filenames here:
const fileName: string = sanitize(options.screenshotName)
With that in mind, is the sanitization logic added in #233 necessary? Or was it maybe just added out of caution?
Are you folks ok with me making a PR with this change?
Hi @lgenzelis ,
the change was made on purpose because we are writing files and if you don’t have any guard your named files can fail if you use (ilegal characters)[https://www.mtu.edu/umc/services/websites/writing/characters-avoid/] But thinking about your request , maybe it could be a good idea to have a regexp function by default and add a config parameter to override that function. What do you think @Zaista ?
Thanks @area73 ! But isn't that what you're already doing here: ?
the project already has sanitize-filename as a dependency, and it seems to be already using it to sanitize filenames here:
const fileName: string = sanitize(options.screenshotName)
I mean, sanitize-filename is already taking care of the guards you mention, isn't it?
Well, I think I understand why you added that, sanitize-filename doesn't seem to be doing what is should be doing. :/ I just tried using an illegal character (\) in v.5.0.0 and it just failed. Can I then suggest an alternative approach? 🙏 The characters that may give issues, depending on the OS, are:
-
< -
> -
: -
" -
/ -
\ -
| -
? -
*
With that into account, can we replace
const nameSanitized = name.replace(/[^a-z0-9]/gi, '_').toLowerCase()
by
const nameSanitized = name.replace(/[<>:"/\\|?*]/g, '_');
?
Ok that makes more sense to me, I remember that I had some issues with the naming.
But still, I think the best approach is the former regexp because you are letting out some other characters like emojis apostrophes , ñ, etc) you can see the list on the previous link
But as I said before I guess we can go by exporting something like “sanitizeRegexp” function and use it on nameSanitize.
By the way we should also investigate about how we are using the sanitize-filemame, because it looks to me that maybe we have something off
Idk, emojis, ñ, ', .. are all allowed in filenames, so I don't think that's an issue. But that's just my opinion :)
So, your proposal would be to keep the current sanitization, but allowing the user to pass a different sanitizeRegexp as an optional parameter, right? If so, would it be possible to remove the .toLowerCase() bit? So, the code would end up being something like
const nameSanitized = name.replace(sanitizeRegexp || /[^a-z0-9]/gi, '_');
I was pointing to this illegal characters article were they refer to emojis and many other illegal characters.
About toLowerCase, my idea is to pass a function with this signature. // string -> string So something like:
// string -> string
const sanitizeFileName = (fileName) => {
// do whatever transformation you need or just use an Identity to return unchanged string
return 'transformed-string'
}
That way we can keep the original function as it is, we don't introduce a breaking change and we give the oportunity to generate any output.
That way it will be resilient to any use case
About that article, I don't think it's correct. It even mentions blank spaces and ! as illegal chars. I've using those my whole life (macos, linux and windows), and never had an issue. I think it's more about "best practices" than anything else.
Cool, yes, I think that's a good idea :) And indeed, it wouldn't change the current behavior at all, which is great.
Would it be ok if make a PR for this? :) Or you'd rather make it yourself?
I would like to know what @Zaista has to say about adding a new config param since he is the one who's in charge of publishing the NPM.
For me, I'm more than ok to add that into the repo, I think it is a nice initiative, and I would like you @lgenzelis to make that PR for sure 🙌.
Just take into account that you will need to document it and make some test.
Thanks !
Cool! Let's wait for their opinion then.
Would it be ok if make a PR for this? :) Or you'd rather make it yourself?
Sure, please go for it. I don't have a lot of time in the next weeks, but I can release a version at least once you guys find an appropriate solution
Actually, after a little bit of thinking, I'd go for the less intrusive way that @lgenzelis suggested, so something like /[<>:"/\\|?*]/g, without lowercase part as well, as I don't see much benefit in sanitizing it. If the users end up with an error, they will have to adjust the name and that's pretty much it.
I'd also provide both the name and sanitizedName to the plugin so that the original name is printed out in error messages to the users.
Unless @area73 you see something else that I'm missing?
May be has sense option like sanitizeApply with default true or sanitizeIgnore with default false and in specific situations it will ignore sanitizing
Hi folks! Do you have an agreement on how it'd be best to proceed? :)
Hi, should be fixed with #259 in v5.0.3
Indeed, thanks Zaista! :D