[#5152] Write preprocessed P4 to `<program_name>.p4pp` file when `--save-temps` option is provided
Closes #5152.
Marking as draft until we decide:
- What to name the option - I have named it
-P(for "Preprocessed") for now, but @asl also suggested--save-temps. - What to name the generated file - I named it
repro.p4(short for "reproducer"), but maybe there is a better name.
Not sure why the output should be named repro.p4. Normally for gcc / clang we're having:
- Option is named
--save-temps - For an input file named "foo.c" the preprocessed output is named
foo.i.
Not sure why the output should be named
repro.p4. Normally for gcc / clang we're having:
- Option is named
--save-temps- For an input file named "foo.c" the preprocessed output is named
foo.i.
Sure, that sounds reasonable to me.
Isn't there already a -E option that just runs the preprocessor and writes that to stdout?
@ChrisDodd Yes, but the compiler exits immediately after doing so - the purpose of this option is to be able to save the preprocessed P4 and continue with compilation instead of exiting.
Not sure if it matters, but the existing Tofino implementation names this file as foo.p4pp
The other thing I'd strongly recommend if you want to do it in the frontend: sanitize the preprocessed file by not including the standard system includes (i.e. core.p4 and/or your own architectural files), or at least have an option to do that. That means the file will have the user's code preprocessed, but will still use #include <core.p4> and #include <arch.p4>.
This will allow to reproduce old issues much easier or notice compatibility issues whenever you change your arch files. Believe me, you'd like you did it.
Not sure if it matters, but the existing Tofino implementation names this file as
foo.p4pp
Giving the generated preprocessed file a .p4pp suffix sounds good to me.
The other thing I'd strongly recommend if you want to do it in the frontend: sanitize the preprocessed file by not including the standard system includes (i.e. core.p4 and/or your own architectural files), or at least have an option to do that. That means the file will have the user's code preprocessed, but will still use #include <core.p4> and #include <arch.p4>.
This will allow to reproduce old issues much easier or notice compatibility issues whenever you change your arch files. Believe me, you'd like you did it.
I currently have no strong motivation to do that, but I am not opposed if someone else wants to add an option to do that in the future.
My opinion is that --save-temps sounds like a reasonable option and I would say .p4pp is better than just .p4.
I have no strong opinion about leaving system includes, although having it configurable (maybe with default being "leaving" #include) could be best. The disadvantage of desugaring back to #include is that it can hide modifications (which is probably irrelevant for just preprocessing) and obviously it excludes the text of the includes so some issues might be accidentally resolved by later changes.
Updates:
- Renamed the option to
--save-temps - Renamed the generated file to
filename.p4pp - Original code was broken because
inwas consumed forrepro.p4generation, and was empty by the time actual program parsing occurred. The new code runscpptwice and uses a secondFILEstream forfilename.p4ppgeneration instead.
I think the double preprocessing is a bit unfortunate (it is wasteful and in an extreme case, they could produce different results because of changes on the filesystem). Maybe instead if we have
--save-tempswe should direct the preprocessor output to thep4ppfile and then open that file for the parser.
This does seem better to me too. Done.