Top comment causes error
When I set comp.langopts.preserve_comments = true and place a comment at the top, it shows the following message:
error: expected external declaration
// this comment causes error
The following is the test code to reproduce the bug:
const std = @import("std");

test "top comment causes error: expected external declaration" {
    const aro = @import("aro");
    const code =
        \\// this comment causes error
        \\void foo() { }
        \\// this one is fine
    ;
    const allocator = std.testing.allocator;
    var diagnostics: aro.Diagnostics = .{
        .output = .{
            .to_file = .{
                .file = std.io.getStdErr(),
                .config = .escape_codes,
            },
        },
    };
    defer diagnostics.deinit();
    var comp = aro.Compilation.init(allocator, &diagnostics, std.fs.cwd());
    defer comp.deinit();
    // Keeping comments in the token stream is what triggers the error.
    comp.langopts.preserve_comments = true;
    var buf = std.io.fixedBufferStream(code);
    const file = try comp.addSourceFromReader(buf.reader(), "<BUFFER>", .user);
    var pp = aro.Preprocessor.init(&comp, .default);
    defer pp.deinit();
    try pp.addBuiltinMacros();
    const eof = try pp.preprocess(file);
    try pp.addToken(eof);
    var tree = try aro.Parser.parse(&pp);
    defer tree.deinit();
    try std.testing.expect(diagnostics.errors == 0);
}
preserve_comments can only be used when you are only preprocessing (i.e. it cannot be used if you will go on to parse the preprocessed output). The parser itself can't handle comments; it expects them to be removed during tokenization, which also means that you can't get comment line/column info from the parser.
Your test does reveal a separate issue though - the second comment is not causing problems because a final, commented line is not preserved when using preserve_comments if it does not end in a newline.
// This is kept
void foo() { }
// This is lost (no newline at end)
arocc -C -E test.c
# 1 "test.c" 1
# 1 "<builtin>" 1
# 295 "<builtin>"
# 1 "<command line>" 1
# 1 "test.c" 2
// This is kept
void foo() { }
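The edge case above (a final `//` comment with no trailing newline) can be illustrated with a minimal sketch. This is not aro's tokenizer; `count_line_comments` is a hypothetical helper, and it deliberately ignores string literals, block comments, and line splices. The point is only that a line comment can be ended by either a newline or the end of the buffer, and it has to be recorded in both cases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of aro: count `//` line comments in a
 * NUL-terminated buffer. Simplified on purpose: string literals, block
 * comments, and backslash line splices are not handled. */
size_t count_line_comments(const char *src) {
    assert(src != NULL);
    size_t count = 0;
    size_t i = 0;
    while (src[i] != '\0') {
        if (src[i] == '/' && src[i + 1] == '/') {
            /* Consume up to the newline OR the end of the buffer. */
            while (src[i] != '\0' && src[i] != '\n') {
                i++;
            }
            /* Record the comment no matter which of the two ended it.
             * Recording only when a '\n' is seen would drop a final
             * comment that has no trailing newline. */
            count++;
        } else {
            i++;
        }
    }
    return count;
}
```

Called on the three-line test.c shown above, where the last line has no newline, this sketch counts both comments.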
Like Evan said, this is only valid when only preprocessing, and the CLI reports an error if you try it otherwise: fatal error: invalid argument '-C' only allowed with '-E'. Did you have a specific use case you were trying to solve?
Your test does reveal a separate issue though - the second comment is not causing problems because a final, commented line is not preserved when using preserve_comments if it does not end in a newline.
Fixed in 26f8bca5fd28747dd087719801474f3b14a5a300
Like Evan said, this is only valid when only preprocessing, and the CLI reports an error if you try it otherwise: fatal error: invalid argument '-C' only allowed with '-E'. Did you have a specific use case you were trying to solve?
I see. I'm trying to create a tool that generates function prototypes from C code, and I want to preserve the comments that precede the function definitions. What's the best way to do that?
Initially I tried using the aro.Tree.Index.loc(), but I get the following error when I call node_index.loc(&tree):
/home/ronald/.cache/zig/p/aro-0.0.0-JSD1QowDJwDniRbMfnQ6kvt5_4aBVpUetJvLWg0gFHyS/src/aro/Tree.zig:1704:57: error: expected enum or tagged union, found 'u32'
return tree.tokens.items(.loc)[@intFromEnum(tok_i)];
^~~~~
Patching loc to remove @intFromEnum seems to fix it. That aside, it seems it still parses the whole thing just fine with preserve_comments: I still managed to generate the function prototypes with comments. It does emit a lot of errors unless I set .output = .ignore, though, which I don't want to do.
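For the use case described above (keeping the comments that precede a function definition), one parser-independent option is to leave preserve_comments off and recover the comments from the original source text. The sketch below is an illustration under stated assumptions, not an aro API: `leading_comment_start` and its helpers are hypothetical names, the byte offset of the definition is assumed to be available from the definition's source location, and only runs of `//` lines directly above the definition are handled (no block comments).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of aro. `src` must be NUL-terminated. */

/* Offset of the first character of the line containing `pos`. */
static size_t line_start(const char *src, size_t pos) {
    while (pos > 0 && src[pos - 1] != '\n') {
        pos--;
    }
    return pos;
}

/* True if the line beginning at `start` is a `//` comment line,
 * allowing leading spaces and tabs. */
static int is_comment_line(const char *src, size_t start) {
    while (src[start] == ' ' || src[start] == '\t') {
        start++;
    }
    return src[start] == '/' && src[start + 1] == '/';
}

/* Given the byte offset `decl` of a definition, return the offset where
 * the contiguous run of `//` comment lines directly above it begins.
 * If there is no such run, the start of the definition's own line is
 * returned. A blank line ends the run. */
size_t leading_comment_start(const char *src, size_t decl) {
    assert(src != NULL);
    size_t first = line_start(src, decl);
    while (first > 0) {
        /* src[first - 1] is the '\n' that ends the previous line. */
        size_t prev = line_start(src, first - 1);
        if (!is_comment_line(src, prev)) {
            break;
        }
        first = prev;
    }
    return first;
}
```

The text between the returned offset and the start of the definition's line is then the comment block to copy in front of the generated prototype.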
I get the following error when I call node_index.loc(&tree):
/home/ronald/.cache/zig/p/aro-0.0.0-JSD1QowDJwDniRbMfnQ6kvt5_4aBVpUetJvLWg0gFHyS/src/aro/Tree.zig:1704:57: error: expected enum or tagged union, found 'u32'
return tree.tokens.items(.loc)[@intFromEnum(tok_i)];
^~~~~
I ran into the same issue, so I submitted a PR to fix it: https://github.com/Vexu/arocc/pull/885