Rewrite `generate_linux_syscalls.zig`
Closes: https://github.com/ziglang/zig/issues/20801
Description:
This PR refactors the syscall generation tool aiming to reduce code duplication for both the table based arches and the ones generated using a preprocessor.
The ArchInfo union contains 2 tags .table and .preprocessor
To add an Arch that depends on a tbl file, use the following structure:
table: struct {
name: []const u8, // Architecture name
enum_name: []const u8, // Name of the generated enum
file_path: []const u8, // Path to the tbl file for enum generation
header: ?[]const u8, // Optional header to prepend to the enum
extra_values: ?[]const u8, // This is used to append values at the end of the enum after processing the file mainly
// to avoid conflicts with the generated enums in current master
process_file: ProcessTableBasedArchFileFn, // Function to process the tbl file and write the enum
// Can be customized for specific architecture needs
filters: Filters, // Different checks on the abi / name tag to modify / skip certain values
additional_enum: ?[]const u8, // Optional additional enum (see PowerPC architecture)
},
To add an Arch that is generated using a preprocessor, use the following structure
preprocessor: struct {
name: []const u8, // Architecture name
enum_name: []const u8, // The name of the enum that will be generated
file_path: []const u8, // The path to the header file that will be used
child_options: struct {
comptime additional_args: ?[]const []const u8 = null, // Additional args that will be passed to zig cc
target: []const u8, // The target architecture
pub inline fn getArgs(self: *const @This(), zig_exe: []const u8, file_path: []const u8) []const []const u8 {
const additional_args: []const []const u8 = self.additional_args orelse &.{};
return .{ zig_exe, "cc" } ++ additional_args ++ .{ "-target", self.target } ++ default_args ++ .{file_path};
}
},
header: ?[]const u8, // Optional header to prepend to the enum
extra_values: ?[]const u8, // this is used to append values at the end of the enum after processing the file mainly to
// avoid conflicts with the generated enums in master
process_file: ProcessPreprocessedFileFn, // Function to process the header file and write the enum
// Can be customized for specific architecture needs
additional_enum: ?[]const u8,
The generated enum will be in this format:
const EnumName = enum {
// header_values
// values parsed from the processing function
// additional values added at the end
}
How it's tested:
Ran the tool and generated the enums and diffed them with lib/std/os/linux/syscalls.zig from master to make sure everything works the same
Other than style nitpicks, this looks reasonable to me.
Just as a sanity check, does it produce a byte-for-byte identical syscalls.zig?
Other than style nitpicks, this looks reasonable to me.
Just as a sanity check, does it produce a byte-for-byte identical
syscalls.zig?
Does the hexagon arch work ? the only difference is the order of enums is different
Does the hexagon arch work ?
Yes, #20798 was merged, so a freshly built compiler should work.
the only difference is the order of enums is different
It would be nice to maintain the current order just to reduce diff churn on the next syscalls update. But if that requires more work than just reordering the arch_infos, then eh, don't bother.
I ll recompile it on linux and test it and ensure all the values are in the same order
@alexrp, After the changes there is no difference between syscalls.zig and the newly generated one
I appreciate you and @alexrp taking the time to extend/simplify the code, but I think these are the wrong abstractions for the table-based arch's. Given each table has the same structure of:
<number> <abi> <name> [everything else]I think you can have a function that splits all the lines into these three strings, and then passes them into functions like so:
const Result = union(enum) { entry: []const u8, skip: void, done: void, }; *const fn (number: usize, abi: []const u8, name: []const u8, buf: []u8) Result;That way you can pair everything down to a helper function and a couple of arch-specific
Resultfunctions. Besides, a little copy-pasta never hurt anyone ;).
I'm not sure I understood what you meant by your impl, what did you mean by three strings, and wouldn't this impl require more mem allocations for entry ? instead of writing it directly, you ll allocPrint entry and then pass it afterwards to the writer ?
I'm happy to merge this when all the parties involved reach consensus.
Once it's merged would there be interest in updating it to use aro instead of calling zig cc via a subprocess? I'd be happy to do it if so, I don't think it would take much work other than updating aro to support -dD (outputs both the #define directives and the result of preprocessing)
I'm happy to merge this when all the parties involved reach consensus.
I have no strong feelings on the above suggestions. My primary concern was getting the copy/pasted code under control as it was getting annoying when I had to make edits to it. 🙂
I have no objections on the output - just style. I'm tied up today so I don't mind if this goes in as is. I second @ehaas's comments about just using Aro or clang - I'd like it so that we could run a pre-processor and get a list of key-value definitions.
To quote zig zen: Incremental improvements. 🙂 I think it'd be reasonable to merge this as is and then refine it in further PRs if there's interest.
OK I went to merge this but then I realized it effectively has no commit message, not even a commit title ("rewrite $foo" does not contain any more information than the git commit metadata). Can you at least explain what this does? Note, you didn't say closes any particular issue, you just mentioned that it is related. So there's actually no description in the PR writeup or commit messages as to your goals, and whether they were achieved, and how you tested, etc.
@andrewrk I've updated the PR description, sorry for the hassle i didn't think it need a description will make sure to be more specific next time on each change !
@ippsav very minor issue I ran into in #21004: The resulting file has an extraneous newline at the end, which means zig fmt --check will fail if you don't fix that manually. (Not sure if that was there before.)
@alexrp is the issue still existing in the PR the issue is in syscalls.zig right ? i can take a look and fix it
I pushed a fixed syscalls.zig to that PR. But you can repro it by just copying the generated syscalls.zig on top of lib/std/os/linux/syscalls.zig in master and doing git diff - there will be an extra newline at the end.
@alexrp it was there before as well, basically after each enum generated };\n\n is written at the end of it which makes the last enum to have a trailing new line that triggers the formatting error, I can fix that so that you no longer have to go format the file after each generation
@alexrp can you try with this generate_linux_syscalls.zig and see if the issue is still persisting ? I've tried it should be good to go, I ll have to fix my zig build dependencies as I can't rebuild zig on my machine currently :/
PR: https://github.com/ziglang/zig/pull/21005