zig stage2: --autofix proof-of-concept

Introduce the --autofix CLI flag when building an executable, object, or library.
Refactor std.zig.render to use a struct parameter to make it easier to add/remove fields from the struct.
Introduce a "Fixups" concept to std.zig.render which can perform edits while rendering code, while leaving the Ast read-only.
Add a fixup for adding a discard after a variable declaration.
Update the Module code to check for fixable errors after AstGen lowering.

Explanation

The purpose of this feature is to satisfy two use cases that traditionally have been at odds:

Some people want the guarantee that all Zig code they read has the property that there are no unused locals.
Some people find no value from such errors and want to not have to deal with them.

The problem with an --allow-unused flag is that people from the second group will use it, and then some projects will fail to compile without the flag enabled.

I like to think of Zig as having "inline warnings". The warnings are there, inside your source code, next to the relevant lines, ready to be noticed by diffs, code reviews, and when refactoring.

--autofix is a way for Zig to automatically insert inline warnings for those who wish to iterate quickly on a messy codebase.

Demo

[nix-shell:~/dev/zig/build-release]$ cat test.zig
const std = @import("std");

pub fn main() void {
    const hello = "hello";
    const world = "world";
    std.debug.print("hello, {s}!\n", .{world});
}

[nix-shell:~/dev/zig/build-release]$ stage4/bin/zig build-exe test.zig
test.zig:4:11: error: unused local constant
    const hello = "hello";
          ^~~~~

[nix-shell:~/dev/zig/build-release]$ stage4/bin/zig build-exe test.zig --autofix

[nix-shell:~/dev/zig/build-release]$ ./test
hello, world!

[nix-shell:~/dev/zig/build-release]$ cat test.zig
const std = @import("std");

pub fn main() void {
    const hello = "hello";
    _ = hello;
    const world = "world";
    std.debug.print("hello, {s}!\n", .{world});
}

Improvements that need to be made before this can be merged:

Introduce an error for "pointless discard" and a fixup for it so that --autofix can undo the effects of itself when a variable becomes used.
Support local variables as well as local constants.
Support captures in addition to local variables and constants.
Integrate properly with incremental compilation.
Integrate with the Zig build system.
Distinguish between AstGen errors that can be autofixed and those that cannot.
Remove std.debug.print statements.
Only perform fixups when all errors are autofixable errors. However, suppress all autofixable errors when reporting errors with --autofix.

Sep 10 '22 03:09 andrewrk

There was a suggestion in the chat while you where doing it and I think it could also be considered. It can improve the workflow when --autofix is not used.

The suggestion was to defer unused variable compiler error until the very end of the compilation.

For example, consider the following code:

const std = @import("std");

pub fn main() !void {
    const unused = "hello";
    const a = "hello";

    std.log.debug("argument error {}", a);
}

When building it, we get this error:

/home/kuon/Playground/test/src/main.zig:4:11: error: unused local constant
    const unused = "hello";
          ^~~~~~

Ok, this is fair, it's an error. But if I fix the error by adding _ = unused; I get:

/usr/lib/zig/std/fmt.zig:86:9: error: expected tuple or struct argument, found *const [5:0]u8
        @compileError("expected tuple or struct argument, found " ++ @typeName(ArgsType));

The idea would be to have this error in the first case, like this:

/home/kuon/Playground/test/src/main.zig:4:11: error: unused local constant
    const unused = "hello";
          ^~~~~~
!! unused variable detected, build will fail, continuing !!
/usr/lib/zig/std/fmt.zig:86:9: error: expected tuple or struct argument, found *const [5:0]u8
        @compileError("expected tuple or struct argument, found " ++ @typeName(ArgsType));

That would address two things:

It is annoying to go back to fix an unused variable only to realize there was another error
Sometime there are multiple unused variables, and it would be faster to have them all listed and correct them in one go (this is already the case, but in a single file)

Sep 10 '22 03:09 kuon

It is annoying to go back to fix an unused variable only to realize there was another error

Especially when you mame a typo. Saying 'bar is unused' is less helpful than saying 'unknown variable baz'. Because the second will directly point you to the line with the typo.

Sep 10 '22 09:09 gwenzek

I think it's a bit unexpected that compilation modifies the source code. Also it will require changing build code for optional adding --autofix flag. I would suggest having this functionality as a separate command, e.g. zig autofix, similarly to zig fmt. Then anyone who needs it can add a text editor shortcut, or even on save hook.

Sep 10 '22 12:09 misanthrop

const buffer_len = 2 * (N - 1);

// ... more code

fill_buffer(buf, len, 1); // hint: ought to be buffer_len

> error: unused local constant `buffer_len`
> Damn
> zig autofix

const buffer_len = 2 * (N - 1);
_ = buffer_len;

// ... more code

fill_buffer(buf, len, 1);

> Yay!
> ... segfault
> Hmm
> zig undo autofix
> error: no command `zig undo autofix`

Sep 10 '22 12:09 zzyxyzz

I think this is a good compromise. Being someone who regularly uses a simple text editor("xed") + commandline I really like that this solution is accessible without using an IDE and also can be added directly to the build.zig.

However I see two problems remaining:

I'd like to have more fine-grained control over autofix. Someone in the chat proposed --fix-unused and --fix-all which I think is a good idea. Here are some personal reasons why I think more control is needed:
- While I don't particularly like the unused variable error, I really like the ignored function return value. It made me discover that some C functions that I used over the years actually return an error I should better chek. So I would like to only use --fix-unused-var and not --fix-unused-return.
- I generally don't like a compiler or IDE to mess with my source code. I'm afraid that in the future autofix gains new functionality that break my personal programming style. Even well intentioned things like for example fixing mixed tabs/spaces might be problematic for people that for example use "tabs for indentation, spaces for alignment".
As someone who listens to compiler warnings, I don't like how it can hide unused variables. (This is not just a problem of autofix, I also didn't like this behaviour before) It's definitely better for reading because all unused variables are marked as such, but it hides potentially bugs from me(see zzyxyzz's comment) and it also hides dependent unused variables (like if I do a giant calculation and in the end I _ = ... the result). I want my code to look clean and honestly I don't like that my code might contain random blocks of unused code hidden behind a leftover _ = ... from prototyping/debugging/testing.

Of course, I could just grep these, but I don't want to grep before every commit. So I have a different idea. Hear me out:

There is three types of unused right now:
- There is the unused function parameter fn x(_: y) z. This one normally happens when using the function in some API that dictates the values and autofix shouldn't add this one(instead it should add _ = ... in the first line of the function).
- There is unused return value _ = x(y);. Again they happen when the programmer decides they don't need the return value that some API provided. In my opinion these don't hinder prototyping/debugging/testing code in any way. It doesn't randomly happen when I comment out code or when I'm not finished yet. (When I'm not finished I would write a var z = x(y); and autofix would add _ = z;)
- Then there is the normal unused variable _ = x;. This one mostly happens while prototyping/debugging/testing and outside of that it can always be removed by adding one of the above or removing or commenting the declaration code. There is in my opinion absolutely no need for this unused variables to exist outside of the fast path of prototyping/debugging/testing. I acknowledge that some people might want to keep these around to document the name of type fn x(_: y) z unused variables, but I think we should figure out a different solution for that (maybe something like fn x(_=something: y) z
So my suggestion would be to make autofix always generate type _ = x; and adding a compiler warning only for _ = x; and not the others. I think this is the best solution:
- We would keep the intentionality of fn x(_: y) z and _ = x(y); which would get lost if they would be generated by autofix.
- We would get "inline warnings" which improves readability for developers that like to ignore warnings.
- We would also get compiler warnings which makes the "inline warnings" significantly easier to find than "ready to be noticed by diffs, code reviews, and when refactoring."
- We wouldn't get warnings when unused stuff is required by an API.

Sep 10 '22 14:09 IntegratedQuantum

--autofix makes more sense as a flag for zig fmt given that zig fmt is already modifying the code when it comes to sorting pointer attributes, fixing syntax between versions, and so on. Compilation affecting the source code is unexpected.

It would also make the editor loop zig fmt --ast-check --autofix.

Sep 10 '22 16:09 tauoverpi

I thought of one use case I haven't seen mentioned yet: unused loop labels. For example:

pub fn main() !void {
    var i: usize = 0;
    outer: while (i < 10) : (i += 1) {
        // do the normal things

        // Experimental code here -->
        var j: usize = 0;
        while (j < 10) : (j += 1) {
            if (condition) {
                break :outer;
            }
        }
        // <--
    }
}

As it is now, if you comment out your experimental code you will get an unused while loop label error, which is problematic for the same reasons as unused variables errors. To "discard" it, I changed it to

outer: while (i < 10) : (i += 1) {
    if (false) break :outer;
    // do the normal things
    ...
}

which will trick the compiler into thinking it's used. Unlike discarding an unused variable with _ = foo, there is no "official" way to do discard a loop label, and this workaround seems pretty fragile: I can imagine the compiler getting better at comptime analysis in the future and concluding that the label is not used after all. Maybe we could use an "official" way to discard loop labels as well, although I would understand if that is deemed too niche.

I'm really excited to try --autofix out, just wanted to mention this additional use case since it is something I've run into in practice while writing Zig code 👍

Sep 10 '22 18:09 schmee

I thought of one use case I haven't seen mentioned yet: unused loop labels

I did mention them in private with the same way of ignoring them and the conclusion seemed to be that this wouldn't be handled by --autofix because it is control flow.

Sep 10 '22 19:09 Vexu

Hello Andrew, Thank you for all the hard work you and your team have put into Zig over the years. I'll try to keep this feedback short, as I believe your busy. I think this approach is non-ideal for the following reasons:

It introduces some amount of complication to the compiler and thus increases the maintenance effort for Zig team, and increases risk of errors in the compiler. This PR contains 1,492 lines of text changed. If I remember, a fork of the compiler, correctly disabling this error was less than 10 lines of text. Perhaps I am remembering wrong?
It unexpectedly modifies the code of the end user.

I think the goal to have no unused variables is an ideal that is great and well intentioned. As the ideal is currently implemented it is, unfortunately, slowing the adoption rate of Zig. My recommendation would be as follows:

Default Zig to error on unused variables. This leads Zig users to the "Pit of Success".
Provide a flag that disables various errors, including unused variables, in Zig that slow down development.

I believe most developers will use a language that allows their workflow to be unhindered. This recommendation would mean that some developers will release code with the flag enabled. Responsible developers wouldn't release code with the flag enabled. As someone who has and is working in heavily regulated industries, where languages such as C and C++ are in wide use, I can say that such flags help move development forward in environments where developer friction is very high for good reasons. Anything that can be done to reduce that friction, without affecting the end design/code/risk rating, is most welcome, and I think ultimately beneficial to the success of Zig.

Thank you for your time and the consideration of my feedback.

Regards, Aaron

Sep 10 '22 19:09 AaronViviano

As someone mentioned on Reddit, fixups done automatically by the compiler should probably have indicators:

_ = somevar; // [[autofix unused variable]]

This is important for 2 reasons:

The edits can be found easily at a later time to be reviewed, and don't blend in with intentional discards. Also enables potential undo of autofixes.
The edits aren't as behind your back, so it's less uncomfortable. It doesn't fully invalidate your mental cache of the source code.

And for all the things mentioned in this thread, I also think there's benefit to have fine grained control of which errors are being --autofixed.

Sep 11 '22 04:09 cdemirer

If there would be a command like this that modifies the source code, I would appreciate an interactive mode (maybe even default to be interactive unless overridden by --yes).

Sep 12 '22 12:09 f03el

Is there a plan to make this work with ZLS?

The workflow in editors like VSCode is decent for this in other languages. This is what TypeScript shows when you misspell a property name:

higher

Sep 13 '22 01:09 Jarred-Sumner

ZLS is a third party project - you'll have to ask them.

What I can tell you is the plan I have which is to expose a powerful server from the compiler with a bespoke API that is intended to be used directly by an IDE, supporting many workflows including the one in your screenshot.

LSP could be implemented as a proxy application between VSCode and the Zig Compiler Server (ZCS). Note that ZCS is currently vaporware although I have started working on it in a branch.

Sep 13 '22 02:09 andrewrk

Is there a plan to make this work with ZLS?

The workflow in editors like VSCode is decent for this in other languages. This is what TypeScript shows when you misspell a property name:

Auguste did a proof of concept auto fix of this a quick action already, don't know if he pushed it since this is dropping, you can ask him

Also, do we have a ZCS issue or something?

Sep 13 '22 17:09 Deecellar

Is there a plan to make this work with ZLS?

See https://github.com/zigtools/zls/pull/631; I closed it because I think autofix handles this usecase well, but I'll reopen it :) (Sorry for the offtopic message!)

Sep 14 '22 06:09 SuperAuguste

Autofixing the code is great, but even better if it don't require me to rebuild the project? if 'zig fmt' is only for style correction, would a new command 'zig autofix' for error correct not make more sense? This could potentielly be extended to incorporate all sorts of fixes. But why tie to the build step? Is this because it makes implementation easier?

Oct 08 '22 08:10 mollerhoj

2. Provide a flag that disables various errors, including unused variables, in Zig that slow down development.

How about not reporting the (style) errors when in debug mode.

doesn't slow anybody down during development
when ready for production, could run the --autofix switch or zig autofix and clean them all up
production code won't accidentally fail to build when compiled without the magic switch

Nov 16 '22 17:11 dvmason

Closing abandoned PR. I'll come back to it eventually (don't delete my branch!)

Jun 18 '23 00:06 andrewrk

zig zig copied to clipboard

stage2: --autofix proof-of-concept

Explanation

Demo

Improvements that need to be made before this can be merged:

zig
zig copied to clipboard