quick-lint-js icon indicating copy to clipboard operation
quick-lint-js copied to clipboard

10$: config: warn on invalid identifier characters in global name

Open strager opened this issue 2 years ago • 3 comments

{
  "globals": {
    "jest.": true
  }
}

The above should warn about the . in the global variable name.

strager avatar Aug 01 '22 01:08 strager

Hello! I would like to try to work on this as my first contribution. This is part of a school project, so I am inexperienced but will give it my best! I have started going through the code to understand how it works and where this fix might fit in. As I understand it, all variable names (global or no) can't include the . character in javascript. The parse_identifier functions in lex.cpp should issue the "diag_character_disallowed_in_identifiers" error/warning if a . is found in an identifier, right? Is this working for local variables but failing for global ones? I'm hoping you might be able to point me in the right direction.

kptil avatar Sep 15 '22 14:09 kptil

The parse_identifier functions in lex.cpp should issue the "diag_character_disallowed_in_identifiers" error/warning if a . is found in an identifier, right?

That is correct.

Is [parse_identifier] working for local variables but failing for global ones?

lexer::parse_identifier is called only for variables found in .js files. For config files, lexer::parse_identifier isn't used at all. Instead, config files use a JSON parser. The code for config files is in src/quick-lint-js/configuration/configuration.cpp. configuration::load_globals_from_json is the function which takes the string from JSON and puts it in the global_declared_variable_set:

https://github.com/quick-lint/quick-lint-js/blob/master/src/quick-lint-js/configuration/configuration.cpp#L255-L258

https://github.com/quick-lint/quick-lint-js/blob/master/src/quick-lint-js/configuration/configuration.cpp#L286-L290

What we need to do is add a new diagnostic type specific for configs, then do validation in the config code. We can probably reuse some logic from lex.cpp (e.g. QLJS_CASE_IDENTIFIER_START, lexer::is_initial_identifier_character, and lexer::is_identifier_character), but I don't think we can reuse lexer::parse_identifier.

Here are some test cases:

valid:

{ "globals": { "jest": true } }
{ "globals": { "ABC123_$": true } }
{ "globals": { "\u0065": true } }
{ "globals": { "㘄": true } }
{ "globals": { "\u3604": true } }

invalid:

{ "globals": { "jest.": true } }
{ "globals": { "123ABC": true } }
{ "globals": { "  hello": true } }
{ "globals": { "\\u0065": true } }
{ "globals": { "\\u{65}": true } }
{ "globals": { "\\": true } }
{ "globals": { "☄": true } }
{ "globals": { "\u2604": true } }

strager avatar Sep 15 '22 18:09 strager

Thank you, this is extremely helpful! I will dig in and see what I can do. I think I understand what needs to be done.

kptil avatar Sep 16 '22 23:09 kptil