zig icon indicating copy to clipboard operation
zig copied to clipboard

Rewrite `generate_linux_syscalls.zig`

Open ippsav opened this issue 1 year ago • 13 comments

Related: #20801, #20869

Closes: https://github.com/ziglang/zig/issues/20801

Description:

This PR refactors the syscall generation tool aiming to reduce code duplication for both the table based arches and the ones generated using a preprocessor.

The ArchInfo union contains 2 tags .table and .preprocessor

To add an Arch that depends on a tbl file, use the following structure:

    table: struct {
        name: []const u8, // Architecture name
        enum_name: []const u8, // Name of the generated enum
        file_path: []const u8, // Path to the tbl file for enum generation
        header: ?[]const u8, // Optional header to prepend to the enum
        extra_values: ?[]const u8, // This is used to append values at the end of the enum after processing the file mainly
                                  // to avoid conflicts with the generated enums in current master
        process_file: ProcessTableBasedArchFileFn,  // Function to process the tbl file and write the enum
                                                    // Can be customized for specific architecture needs
        filters: Filters, // Different checks on the abi / name tag to modify / skip certain values
        additional_enum: ?[]const u8, // Optional additional enum (see PowerPC architecture)
    },

To add an Arch that is generated using a preprocessor, use the following structure

    preprocessor: struct {
        name: []const u8, // Architecture name
        enum_name: []const u8, // The name of the enum that will be generated
        file_path: []const u8, // The path to the header file that will be used
        child_options: struct {
            comptime additional_args: ?[]const []const u8 = null,  // Additional args that will be passed to zig cc
            target: []const u8, // The target architecture

            pub inline fn getArgs(self: *const @This(), zig_exe: []const u8, file_path: []const u8) []const []const u8 {
                const additional_args: []const []const u8 = self.additional_args orelse &.{};
                return .{ zig_exe, "cc" } ++ additional_args ++ .{ "-target", self.target } ++ default_args ++ .{file_path};
            }
        },
        header: ?[]const u8, // Optional header to prepend to the enum
        extra_values: ?[]const u8, // this is used to append values at the end of the enum after processing the file mainly to 
                                  // avoid conflicts with the generated enums in master
        process_file: ProcessPreprocessedFileFn, // Function to process the header file and write the enum
                                                  // Can be customized for specific architecture needs
        additional_enum: ?[]const u8,

The generated enum will be in this format:

const EnumName = enum {
  // header_values
  
  // values parsed from the processing function
  
  // additional values added at the end
}

How it's tested:

Ran the tool and generated the enums and diffed them with lib/std/os/linux/syscalls.zig from master to make sure everything works the same

ippsav avatar Aug 01 '24 13:08 ippsav

Other than style nitpicks, this looks reasonable to me.

Just as a sanity check, does it produce a byte-for-byte identical syscalls.zig?

alexrp avatar Aug 01 '24 19:08 alexrp

Other than style nitpicks, this looks reasonable to me.

Just as a sanity check, does it produce a byte-for-byte identical syscalls.zig?

Does the hexagon arch work ? the only difference is the order of enums is different

ippsav avatar Aug 01 '24 19:08 ippsav

Does the hexagon arch work ?

Yes, #20798 was merged, so a freshly built compiler should work.

the only difference is the order of enums is different

It would be nice to maintain the current order just to reduce diff churn on the next syscalls update. But if that requires more work than just reordering the arch_infos, then eh, don't bother.

alexrp avatar Aug 01 '24 19:08 alexrp

I ll recompile it on linux and test it and ensure all the values are in the same order

ippsav avatar Aug 01 '24 21:08 ippsav

@alexrp, After the changes there is no difference between syscalls.zig and the newly generated one image

ippsav avatar Aug 01 '24 21:08 ippsav

I appreciate you and @alexrp taking the time to extend/simplify the code, but I think these are the wrong abstractions for the table-based arch's. Given each table has the same structure of:

<number> <abi> <name> [everything else]

I think you can have a function that splits all the lines into these three strings, and then passes them into functions like so:

const Result = union(enum) {
    entry: []const u8,
    skip: void,
    done: void,
};

*const fn (number: usize, abi: []const u8, name: []const u8, buf: []u8) Result;

That way you can pair everything down to a helper function and a couple of arch-specific Result functions. Besides, a little copy-pasta never hurt anyone ;).

I'm not sure I understood what you meant by your impl, what did you mean by three strings, and wouldn't this impl require more mem allocations for entry ? instead of writing it directly, you ll allocPrint entry and then pass it afterwards to the writer ?

ippsav avatar Aug 02 '24 12:08 ippsav

I'm happy to merge this when all the parties involved reach consensus.

andrewrk avatar Aug 02 '24 21:08 andrewrk

Once it's merged would there be interest in updating it to use aro instead of calling zig cc via a subprocess? I'd be happy to do it if so, I don't think it would take much work other than updating aro to support -dD (outputs both the #define directives and the result of preprocessing)

ehaas avatar Aug 02 '24 23:08 ehaas

I'm happy to merge this when all the parties involved reach consensus.

I have no strong feelings on the above suggestions. My primary concern was getting the copy/pasted code under control as it was getting annoying when I had to make edits to it. 🙂

alexrp avatar Aug 03 '24 02:08 alexrp

I have no objections on the output - just style. I'm tied up today so I don't mind if this goes in as is. I second @ehaas's comments about just using Aro or clang - I'd like it so that we could run a pre-processor and get a list of key-value definitions.

The-King-of-Toasters avatar Aug 03 '24 02:08 The-King-of-Toasters

To quote zig zen: Incremental improvements. 🙂 I think it'd be reasonable to merge this as is and then refine it in further PRs if there's interest.

alexrp avatar Aug 03 '24 02:08 alexrp

OK I went to merge this but then I realized it effectively has no commit message, not even a commit title ("rewrite $foo" does not contain any more information than the git commit metadata). Can you at least explain what this does? Note, you didn't say closes any particular issue, you just mentioned that it is related. So there's actually no description in the PR writeup or commit messages as to your goals, and whether they were achieved, and how you tested, etc.

andrewrk avatar Aug 03 '24 18:08 andrewrk

@andrewrk I've updated the PR description, sorry for the hassle i didn't think it need a description will make sure to be more specific next time on each change !

ippsav avatar Aug 03 '24 19:08 ippsav

@ippsav very minor issue I ran into in #21004: The resulting file has an extraneous newline at the end, which means zig fmt --check will fail if you don't fix that manually. (Not sure if that was there before.)

alexrp avatar Aug 08 '24 22:08 alexrp

@alexrp is the issue still existing in the PR the issue is in syscalls.zig right ? i can take a look and fix it

ippsav avatar Aug 08 '24 22:08 ippsav

I pushed a fixed syscalls.zig to that PR. But you can repro it by just copying the generated syscalls.zig on top of lib/std/os/linux/syscalls.zig in master and doing git diff - there will be an extra newline at the end.

alexrp avatar Aug 08 '24 22:08 alexrp

@alexrp it was there before as well, basically after each enum generated };\n\n is written at the end of it which makes the last enum to have a trailing new line that triggers the formatting error, I can fix that so that you no longer have to go format the file after each generation

ippsav avatar Aug 08 '24 23:08 ippsav

@alexrp can you try with this generate_linux_syscalls.zig and see if the issue is still persisting ? I've tried it should be good to go, I ll have to fix my zig build dependencies as I can't rebuild zig on my machine currently :/

PR: https://github.com/ziglang/zig/pull/21005

ippsav avatar Aug 08 '24 23:08 ippsav