compiler icon indicating copy to clipboard operation
compiler copied to clipboard

Implement warning 210

Open Daniel-Cortez opened this issue 4 years ago • 19 comments

What this PR does / why we need it:

This PR implements warning 210: possible use of symbol before initialization: "(identifier)". Although the text for this warning is present in sc5.c since the times when the language was called Small, for some reason the warning itself was never implemented.

This diagnostic should help to detect hidden bugs and oversights in scripts, which could be useful for both novice and experienced scripters.

For example, here's an interesting bug I was able to find in NGRP's code using this new diagnostic: https://github.com/NextGenerationGamingLLC/SA-MP-Development/blob/a059abaebda0f2dff3c2e0f95ee4c9c5696406c7/GMs/includes/timers.pwn#L2898 Variable fExpHealth is used inside an if expression before being assigned a value. This is obviously a typo; the author of this code intended to use fVehicleHealth instead.

Some other bugs I was able to find:

  • https://github.com/opencnr/gamemode/issues/12
  • https://github.com/RogueDrifter/Anti_cheat_pack/issues/17
  • https://github.com/AmirMohammadShafie/SampGamemodes/issues/1
  • https://github.com/zeelorenc/sf-cnr/issues/64
  • https://github.com/zeelorenc/sf-cnr/issues/65

Of course, I realize that some scripters tend to use new var; as a shortcut for new var = 0;, especially in loops, which may cause false-positive "warning 210" messages. This is why I disabled this warning for the following cases:

  • a variable is defined inside a for loop;
for (new n; n < 1000; n += 10) // 'warning 210' isn't issued for variable 'n'
  • a variable is incremented/decremented (using the ++/-- operator).
new num_drivers;
for (new i = 0; i < MAX_PLAYERS; ++i)
    if (GetPlayerState(i) == PLAYER_STATE_DRIVER)
        num_drivers++; // 'warning 210' isn't issued for variable 'num_drivers'
  • an array is passed to a native function whose array argument is defined with the const qualifier.
// in <a_samp> 'GetPlayerName' is defined with the 'name' parameter
// being mistakenly marked as 'const'
native GetPlayerName(playerid, const name[], len);
// ...
new pname[MAX_PLAYER_NAME + 1];
GetPlayerName(playerid, pname, sizeof(pname));// 'warning 210' isn't issued for array 'pname'

This should decrease the number of false-positives to a reasonable minimum and allow the user to concentrate on the actual bugs found with this new diagnostic.

Which issue(s) this PR fixes:

What kind of pull this is:

  • [ ] A Bug Fix
  • [x] A New Feature
  • [ ] Some repository meta (documentation, etc)
  • [ ] Other

Additional Documentation:

Daniel-Cortez avatar Jul 21 '19 16:07 Daniel-Cortez

It would be much easier to blacklist uses, rather than whitelist them. I don't think there are any others besides these:

if (uninitialised) // warning
FuncNotPassedByRef(uninitialised) // warning
lvalue = uninitialised; // warning

Basically only the places where the value is retrieved once and not modified. This may already be what you have, but best to check the two are equivalent. What would this PR do for:

FuncPassedByRef(uninitialised)

?

Y-Less avatar Jul 21 '19 19:07 Y-Less

Cleaned up the code and added a few workarounds to support detection of uninitialized arrays (previously the compiler was able to detect only single variables).

I also made the compiler not issue warning 210 if an array is passed to a native function, because some of SA-MP natives are not const-correct. Example:

// in <a_samp> 'GetPlayerName' is defined with the 'name' parameter
// being mistakenly marked as 'const'
native GetPlayerName(playerid, const name[], len);
// ...
new pname[MAX_PLAYER_NAME + 1];
GetPlayerName(playerid, pname, sizeof(pname));// 'warning 210' isn't issued for array 'pname'

Daniel-Cortez avatar Jul 24 '19 17:07 Daniel-Cortez

It would be much easier to blacklist uses, rather than whitelist them. I don't think there are any others besides these: if (uninitialised) // warning FuncNotPassedByRef(uninitialised) // warning lvalue = uninitialised; // warning

There might be more complex uses, like these:

if (initialized == 0 && uninitialized != 0) // warning
FuncNotPassedByRef(initialized + uninitialized); // warning
lvalue = initialized + uninitialized; // warning

This PR makes the compiler issue warning 210 in all of the mentioned cases.

What would this PR do for: FuncPassedByRef(uninitialized) ?

Depends on how exactly FuncPassedByRef is defined. It think it would be easier to demonstrate on examples:

forward FuncPassedByRef(&par);
// ...
FuncPassedByRef(uninitialized); // no warnings (the symbol is treated as modified)
forward FuncPassedByRef(const &par);
// ...
FuncPassedByRef(uninitialized); // warning 210 (the symbol is treated as read)
forward FuncPassedByRef(arr[]);
// ...
FuncPassedByRef(uninitialized); // no warnings
forward FuncPassedByRef(const arr[]);
// ...
FuncPassedByRef(uninitialized); // warning 210
native FuncPassedByRef(arr[]);
// ...
FuncPassedByRef(uninitialized); // no warnings
native FuncPassedByRef(const arr[]);
// ...
FuncPassedByRef(uninitialized); // no warnings either, because not all SA-MP natives are const-correct

Daniel-Cortez avatar Jul 24 '19 17:07 Daniel-Cortez

I also made the compiler not issue warning 210 if an array is passed to a native function, because some of SA-MP natives are not const-correct.

This was fixed last year:

https://github.com/sampctl/samp-stdlib/blob/master/a_players.inc#L656

Southclaws avatar Jul 25 '19 14:07 Southclaws

But still, not everyone uses sampctl or manually installs modified includes from that repo. Also, warning 239 was disabled for native functions for the same reason: https://github.com/pawn-lang/compiler/commit/8948f207ea9c7cef4e7808751a499310d1407148

Daniel-Cortez avatar Jul 25 '19 14:07 Daniel-Cortez

I don't think we should worry about people not using the latest versions of things on the latest version of something. If they want to use the new compiler, but not the new includes, that's on them. We shouldn't penalise those who do upgrade (by omitting diagnostics) for the sake of those who don't.

Yes, another warning did the same, before we fixed the includes. It should probably now be restricted more.

Y-Less avatar Jul 25 '19 18:07 Y-Less

Yes, another warning did the same, before we fixed the includes.

Err... no, it did so after the includes were fixed. The last const-correctness fix was merged on 17 Jun 2018: https://github.com/sampctl/samp-stdlib/pull/6 Warning 239 was disabled for natives a week later, on 24 Jun 2018: https://github.com/pawn-lang/compiler/commit/8948f207ea9c7cef4e7808751a499310d1407148 And I can't really say I liked that restriction for warning 239, as well as I wasn't in favor of doing the same in this PR, but I did so only to be consistent with what 239 already does.

Anyway, I made a PR that re-enables warning 239 for natives. It only modifies one line of code, so I believe it shouldn't take a lot of time to review and merge it. After it's merged, I'll re-enable warning 210 for natives in this PR too.

Daniel-Cortez avatar Jul 26 '19 13:07 Daniel-Cortez

I merged this and had a play. It is somewhat telling that when I tested this on YSI with tests enabled almost every warning instance was i, idx, pos or similar. Some false positives:

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_utils_impl.inc#L923-L931

I found many instances of false positives in while loops, this was just the first. Another one:

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_core_entry.inc#L172-L187

That's extremely common in YSI. Basically every y_amx use looks like this.

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Data/y_bit/y_bit_impl.inc#L232-L242

count var modified with +=, similar to the ++ exception you already have.

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_amx_impl.inc#L967-L969

This one is questionable, since the modification is in assembly. Were you to ignore that I'd entirely understand.

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Coding/y_inline/y_inline_impl2.inc#L816-L822

Another highly questionable one. All that variable does is exist to be destructed. Should destructors trigger this warning? I don't know. This one can be silenced with = {}, and again is a bizzare corner-case probably.

There are a fair number in the y_foreach tests, which at first I wasn't sure what to do about, since the count arrays are default zeroed, and it isn't a mistake. But most of them look like this:

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Data/y_foreach/y_foreach_tests.inc#L424-L428

That's an assertation that an array which should be initially empty, is. I guess that's fine to have a warning like this on since I'm checking that nothing truly horrible has happened in compilation. But there are also many other places with arrays (especially globals) declared and left as-is. The problem is that no explicit initialisation is not only common, but valid and well defined.

Y-Less avatar Sep 23 '19 22:09 Y-Less

Also a definite bug. If this PR is merged to HEAD now, I get:

YSI_Core\y_core\y_utils_impl.inc(597) : warning 214: possibly a "const" array argument was intended: "str"

On:

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_utils_impl.inc#L597

I do not (and shouldn't) get it before merging this code.

Y-Less avatar Sep 23 '19 23:09 Y-Less

Must be due to a workaround I used for arrays (which I had to add because function address() from sc4.c marks arrays as uWRITTEN on all types of access, even when an array cell is only read). In any case, I'll look into it.

Daniel-Cortez avatar Oct 14 '19 04:10 Daniel-Cortez

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_utils_impl.inc#L923-L931 https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_core_entry.inc#L172-L187 https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Data/y_bit/y_bit_impl.inc#L232-L242

I made the compiler not issue warning 210 if a variable is incremented or decremented with ++/--, but in all of the abovementioned cases a local variable is used before being incremented/decremented. The problem is that when a global variable is used somewhere, the compiler is able to know in advance if the said variable is going to be incremented/decremented later (thanks to 2-pass parsing), but it can't do the same for locals. I think there's not much to do about it. Implementing a 2-pass parsing on per-function level could help to address this issue (and a number of others), but that would likely negatively affect the compilation performance.

count var modified with +=, similar to the ++ exception you already have. https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_amx_impl.inc#L967-L969

I intentionally didn't apply the same exception for +=, -= and the other similar operators. ++ and -- are very oftenly used on loop counter variables (thus making about 90-95% of all false-positives), while +=, -= and the likes aren't used in loops that frequently, so their use on uninitialized variables is much more suspicious.

This one is questionable, since the modification is in assembly. Were you to ignore that I'd entirely understand. https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_amx_impl.inc#L967-L969

Umm... but I did make an exception for assembly? The warning probably comes from str[idx] with idx being uninitialized.

Another highly questionable one. All that variable does is exist to be destructed. Should destructors trigger this warning? I don't know. This one can be silenced with = {}, and again is a bizzare corner-case probably. https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Coding/y_inline/y_inline_impl2.inc#L816-L822

I think the real problem here is that in this case the compiler doesn't issue a warning about an array being unused because it has a destructor. Sometimes variables might be left over from refactoring, so if there's an unused variable that has a destructor (and thus being unnoticed), at least warning 210 would help to locate it.

The problem is that no explicit initialisation is not only common, but valid and well defined.

AFAIK, the official documentation (pawn-lang.pdf) only briefly mentions that Unless it is explicitly initialized, the value of the new variable is zero. But the same documentation also well describes warning 210:

210 possible use of symbol before initialization: identifier

A local (uninitialized) variable appears to be read before a value is
assigned to it. The compiler cannot determine the actual order of
reading from and storing into variables and bases its assumption of
the execution order on the physical appearance order of statements
an expressions in the source file.

It even tells that there might be false-positives due to the way how the compiler works.

Another question is why this warning ended up unimplemented? It always seemed strange to me that the compilers for a lot of other languages such as C, C++ (MSVC, GCC, Clang), Pascal (FPC) etc. already have this basic diagnostic, and Pawn - "a robust language with a compiler that performs a maximum of static checks" - for some reason doesn't have it.

Also a definite bug. If this PR is merged to HEAD now, I get:

YSI_Core\y_core\y_utils_impl.inc(597) : warning 214: possibly a "const" array argument was intended: "str"

On:

https://github.com/pawn-lang/YSI-Includes/blob/771b7b027cad95da1ec4210b9791eddf7b3b97d3/YSI_Core/y_core/y_utils_impl.inc#L597

I do not (and shouldn't) get it before merging this code.

It seems that I forgot to mark arrays as uWRITTEN in a rather rare case whence an array cell (in this case it's str[0]) is passed to a function in place of an array argument. This should be fixed now.

Daniel-Cortez avatar Oct 26 '19 18:10 Daniel-Cortez

This issue has been automatically marked as stale because it has not had recent activity.

stale[bot] avatar Mar 18 '20 16:03 stale[bot]

I'm still slightly wary of this, so I'd like to know what others think. @Zeex @Southclaws @YashasSamaga etc? I just checked again and there are several places in YSI that kick up warnings in valid code even with the latest version. The cause of all those warnings is that 0 is explicitly listed as the default value for variables in pawn documentation, and a lot of code uses that fact, but this warning complains about code that's supposedly idiomatic. That could well be why it was never added in the first place.

Maybe it should be restricted only to variables whose first use is a read outside of a loop in which they are modified. At least to begin with. This is almost certainly not what is intended:

new a;
Func(a);

While this is more likely correct:

new a;
while (a < 11)
{
	Func(a);
	a = a * 3 - 1;
}

The first you can just use a constant, the second you can't catch every possible way in which a variable might be modified. Just see that it's written in the loop and thus assume it is written at every point in the loop.

Y-Less avatar May 30 '20 18:05 Y-Less

This issue has been automatically marked as stale because it has not had recent activity.

stale[bot] avatar Aug 28 '20 21:08 stale[bot]

Maybe it should be restricted only to variables whose first use is a read outside of a loop in which they are modified.

Sounds doable, but I'll need #533 being merged first, as it allows the compiler to memoize the compound statement nesting level for loops - without that there would be no way to determine if the variable was defined outside of the current loop or inside it.

Daniel-Cortez avatar Aug 29 '20 09:08 Daniel-Cortez

Maybe it should be restricted only to variables whose first use is a read outside of a loop in which they are modified.

Sounds doable, but I'll need #533 being merged first, as it allows the compiler to memoize the compound statement nesting level for loops

Done

Y-Less avatar Sep 19 '20 17:09 Y-Less

This issue has been automatically marked as stale because it has not had recent activity.

stale[bot] avatar Dec 19 '20 08:12 stale[bot]

I had another look at merging this one today. It's still quite controversial, simply due to the number of new warnings thrown up for idiomatic pawn code, so I'm not going to merge it unless other people chime in.

Having said that, a few of the tests after rebase are failing:

1>warning_210.pwn(62): warning 240: previously assigned value is never used (symbol "local_var4")

That warning doesn't even seem to make sense.

This warning:

1>warning_250_251.pwn(122): warning 250: variable "n" used in loop condition not modified in loop body

Is also not given any more, however, it's entirely possible that's my fault for a poor rebase.

The other warnings in tests were easily fixed by adding some initialisations.

Y-Less avatar May 02 '21 18:05 Y-Less

This issue has been automatically marked as stale because it has not had recent activity.

stale[bot] avatar Jan 09 '22 02:01 stale[bot]