Implement module-level scoping
Try 2 :)
Fixes #2072 and #1745.
This PR separates parsing from lowering, by generating an AST, which desugars as much as possible down to something like Naga IR.
The AST is then used to resolve identifiers while lowering to Naga IR.
@jimblandy Would be great if you could take a look at the AST definition!
Also, you were talking about how 2 passes would be all that was necessary. However, lowering to Naga IR requires that the dependencies of each module-level declaration are known beforehand, as we need to (reverse) topologically sort the declaration list so that all dependencies are processed before their dependents.
The only way I can think of doing this in two passes is to maintain a set of the raw identifiers referenced by each declaration during parsing, and then using those to sort the declarations, but I'm not quite sure if this will actually be faster than resolving on the same IR, and then sorting.
Here's how I was thinking it could work (I haven't actually tried this yet):
- For the first pass, we parse mostly as we do now, except that identifiers in expressions, types, statements, etc. are represented by placeholders that just store the name. Except for the placeholders, we try to produce something as close to the `crate` IR as possible immediately. The placeholders mean that we will have types that look very much like `crate::Expression`, etc., except that certain variants that assume knowledge of definitions (`LocalVariable`, obviously, but also things like `AccessIndex`, since we don't know struct type definitions) hold identifiers instead of handles. Some cases will need to change in more profound ways; for example, we can't tell whether `S(x)` is a `Call` or a `Compose` until we know the definition of `S`. But as much as possible, the unresolved types should resemble the `crate` types, so that people familiar with the rest of Naga can make assumptions. (A sketch of such a placeholder type follows this list.)
- For the resolution pass, starting with the entry points, start building true `crate::{Function, EntryPoint, Type, ...}` values. When we hit a placeholder, check to see if we already have a previously-resolved definition for that name. If so, use it. Otherwise, immediately recurse on resolving that name (which adds the resulting fully-resolved definition to the `Module` under construction), and then resume processing the placeholder, whose definition we now know. When we're done, add our completed definition to the `Module`.
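For concreteness, here is a minimal, hypothetical sketch of what such a placeholder-bearing expression type could look like; the names (`UnresolvedExpression` and its variants) are made up for illustration and are not Naga's actual types:

```rust
/// Illustrative only: mirrors the shape of `crate::Expression`, but holds raw
/// identifiers wherever the real IR would hold handles.
enum UnresolvedExpression {
    /// A reference to something we cannot resolve yet: a global, a constant,
    /// a function, or a local - stored as the source name.
    Ident(String),
    /// `expr.field` - we cannot turn `field` into an `AccessIndex` until the
    /// struct type definition is known, so keep the member name.
    AccessMember {
        base: Box<UnresolvedExpression>,
        member: String,
    },
    /// `S(a, b, ...)` - could lower to either a `Call` or a `Compose`; we only
    /// find out once we know whether `S` names a function or a type.
    CallOrConstruct {
        callee: String,
        arguments: Vec<UnresolvedExpression>,
    },
    /// Variants that never mention other declarations can match the real IR
    /// closely.
    Literal(f64),
}
```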
That resolution pass amounts to a depth-first traversal of the DAG of definitions, where we "mark nodes as visited" by adding resolved definitions to the `Module`. This ensures that definitions appear in the `Module` in dependency order, in the process of replacing placeholders with `Handle`s.
After we've processed the entry points, we may want to just walk the top-level unresolved definitions and finish off anything unvisited. Such definitions could be safely dropped from the module, since they're unreachable from any entry point (and we should probably produce warnings about this), but it's very useful for testing to have everything in the input appear in the Module. Maybe this step would be optional.
Naturally, we will need to check for cyclic references at the point of recursion.
As I say, I haven't tried this, so there may be aspects of the problem I've missed - but it seems like something along these lines ought to be possible somehow.
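For illustration, a rough sketch of that resolution pass; every name here (`UnresolvedDecl`, `ResolvedDecl`, `resolve`) is hypothetical, and the real pass would build `crate::Module` contents rather than a plain `Vec`:

```rust
use std::collections::HashMap;

/// Hypothetical: a parsed-but-unresolved module-level declaration.
struct UnresolvedDecl {
    /// Names of other declarations this one mentions.
    dependencies: Vec<String>,
}

/// Hypothetical stand-in for an entry in the finished module.
struct ResolvedDecl {
    name: String,
}

enum Visit {
    /// Currently on the resolution stack: seeing it again means a cycle.
    InProgress,
    /// Already added to the module; reuse its position.
    Done(usize),
}

fn resolve(
    name: &str,
    unresolved: &HashMap<String, UnresolvedDecl>,
    module: &mut Vec<ResolvedDecl>,
    state: &mut HashMap<String, Visit>,
) -> Result<usize, String> {
    match state.get(name) {
        Some(Visit::Done(index)) => return Ok(*index),
        Some(Visit::InProgress) => {
            return Err(format!("cyclic reference through `{name}`"));
        }
        None => {}
    }
    state.insert(name.to_owned(), Visit::InProgress);

    let decl = unresolved
        .get(name)
        .ok_or_else(|| format!("unknown identifier `{name}`"))?;
    // Depth-first: finish every dependency before emitting this declaration,
    // so the module ends up in dependency order.
    for dep in &decl.dependencies {
        resolve(dep, unresolved, module, state)?;
    }

    let index = module.len();
    module.push(ResolvedDecl { name: name.to_owned() });
    state.insert(name.to_owned(), Visit::Done(index));
    Ok(index)
}
```

Reaching a name that is still `InProgress` is the "check for cyclic references at the point of recursion" step, and pushing into `module` only after all dependencies are done is what produces the dependency ordering described above.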
You should use Arena, UniqueArena, and Handle for the unresolved representation. That is one of the tricks Naga uses for performance: broadly speaking, caches and prefetchers are much happier with arrays than heap allocation, especially if you can phrase some computation as an iteration over the array.
Otherwise, try to make the change as undisruptive and limited as you can while still preserving legibility.
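For readers unfamiliar with the pattern being recommended, here is a hand-rolled sketch of the idea (not Naga's actual `Arena`/`UniqueArena`/`Handle` API): cross-references are small indices into contiguous storage, and many passes become a flat loop over that storage instead of a recursive tree walk.

```rust
/// Illustrative stand-in for `Handle<T>`: a typed index into an arena.
#[derive(Clone, Copy)]
struct Handle(u32);

/// Illustrative stand-in for `Arena<Expr>`: contiguous, cache-friendly storage.
#[derive(Default)]
struct Arena {
    items: Vec<Expr>,
}

enum Expr {
    Literal(f64),
    /// References other expressions by index, not by `Box`.
    Add(Handle, Handle),
    /// An unresolved identifier kept as a name during the first pass.
    Ident(String),
}

impl Arena {
    fn append(&mut self, expr: Expr) -> Handle {
        let handle = Handle(self.items.len() as u32);
        self.items.push(expr);
        handle
    }
}

/// Collecting every identifier a function mentions becomes a flat iteration
/// over the arena - no recursion, no pointer chasing.
fn referenced_names(arena: &Arena) -> Vec<&str> {
    arena
        .items
        .iter()
        .filter_map(|expr| match expr {
            Expr::Ident(name) => Some(name.as_str()),
            _ => None,
        })
        .collect()
}
```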
Recursion was something I thought of, but processing each declaration does use quite a large amount of stack, so we might end up overflowing the stack quite quickly, especially on Windows, where the default stack size is 2 MB.
We can probably use stacker, which is what rustc uses.
About the warnings, I do plan to add error recovery to the frontend so we can output more than one error, and will add warnings for unused items there (I did have this implemented in my other frontend, it isn't too big of a change).
Using arenas instead of trees for expressions should reduce the stack depth, though, because you can just iterate over the arena directly. And with a little care we could avoid recursing on statements at all - I think only Call and Atomic need resolution. So the depth would only be determined by the number of references we're in the midst of resolving, and not so much by the interior structure of the functions themselves.
stacker sounds 1) amazing and 2) wildly unsafe and definitely not something we want to ship in a browser. So that's one easy solution out the window. I think stack depth is a legitimate concern here.
I think keeping a set of (textual) dependencies used by each declaration during parsing, and then using that to post-order depth-first sort would be the easiest solution, and it shouldn't really impact performance too much.
The post-order sorting with cycle detection does require recursion; however, it should require much less stack space than resolving the functions.
I have also implemented parsing and finalized the AST structures; it would be great if you could take a look and tell me if any changes are needed :) (I have removed parts of the code that aren't necessary for parsing and will modify and re-add them later.)
> I think keeping a set of (textual) dependencies used by each declaration during parsing, and then using that to post-order depth-first sort would be the easiest solution, and it shouldn't really impact performance too much.
If I understand what you mean here, this seems reasonable, too: basically, just treat every identifier use as an edge in a DAG constructed during parsing. It is definitely preferable to avoid the recursion. I'll take a look soon.
I did have to keep track of locals during the initial parse phase, to fix an issue with code like this:
```wgsl
fn a() {
    let a = b();
}

fn b() -> i32 {
    let a = 20;
    return a;
}
```
If we just kept a set of textual dependencies, `fn a` being shadowed by `let a` would create a spurious cyclic dependency error (inside `fn b`, the use of the local `a` would look like a dependency back on `fn a`, closing the cycle `a` → `b` → `a`).
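A hedged sketch of that bookkeeping, with hypothetical names: while collecting a declaration's textual dependencies, keep a stack of scopes for the locals declared so far, and only record identifiers that no enclosing scope declares.

```rust
use std::collections::HashSet;

/// Hypothetical collector used while parsing one declaration's body.
#[derive(Default)]
struct DependencyCollector {
    /// One set of local names per nested block, innermost last.
    scopes: Vec<HashSet<String>>,
    /// Module-level names this declaration actually depends on.
    dependencies: HashSet<String>,
}

impl DependencyCollector {
    fn enter_block(&mut self) {
        self.scopes.push(HashSet::new());
    }

    fn exit_block(&mut self) {
        self.scopes.pop();
    }

    /// Called for `let` / `var` statements, function parameters, etc.
    fn declare_local(&mut self, name: &str) {
        if let Some(scope) = self.scopes.last_mut() {
            scope.insert(name.to_owned());
        }
    }

    /// Called for every identifier used in an expression. Only names that are
    /// not shadowed by a local become edges in the declaration graph, so the
    /// `let a` inside `fn b` above does not make `b` depend on `fn a`.
    fn use_ident(&mut self, name: &str) {
        let is_local = self.scopes.iter().any(|scope| scope.contains(name));
        if !is_local {
            self.dependencies.insert(name.to_owned());
        }
    }
}
```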
I'm curious if the AST generated here is publicly accessible. I do have a case where I need to look at the shader's AST to figure out how to set up bind groups.
> I'm curious if the AST generated here is publicly accessible. I do have a case where I need to look at the shader's AST to figure out how to set up bind groups.
It's not public at the moment, but I can make it so.
However, can you not just use the `Module` that the frontend produces instead? It will probably have more information than the semi-AST IR used here.
@jimblandy This has reached feature parity with the existing frontend, and passes all the functionality-based tests. The diagnostics snapshots don't pass yet, but fixing those shouldn't change too much. You can start reviewing now.
Surprisingly, performance has actually improved. The current frontend consistently takes around 3.6 ms for the front/wgsl benchmark, while this does it in around 2.9 ms (a 20% decrease).
The only difference in output is that I don't initialize `LocalVariable`s with constants since I'm not too sure if that added complexity is actually worth it. I can add it back if it's something wanted, though.
Hey @SparkyPotato, thanks for this PR!
There seem to be quite a lot of copy-paste "changes" (code being moved around) that make it difficult to know what actually changed. Would it be possible for you to undo as many code moves as possible to make it easier to review this?
Right now, we are looking at having to review 11943 LOC (without test snapshots) which I don't consider particularly "reviewable".
@teoxoy I've moved as much as I think possible back (parsing + errors).
construction.rs hasn't been moved because I have made quite a few changes to it, and it also needs to be in the lower module for privacy reasons.
This brings the overall diff of src/front/wgsl to +5038 -3348.
Do tell me if there's anything else I need to do to make reviewing easier!
Thanks for reducing the diff, but it still looks like quite a lot of changes which will make browsing the history of the files challenging.
Let's move the code in lower/mod.rs to mod.rs and lower/construction.rs to construction.rs (with existing code where it previously was).
After this, the changes should be reviewable and after the review we can see how to proceed with new files/folders (and moving code around).
Got it down to +4495 -2811, not sure if anything more is possible.
Only user-defined structs should have a `crate::Type.name` (some of the backends now generate extra type aliases for built-in types).
Wouldn't this lead to worse validation errors, where something like `Vec { kind: Float, width: 4 }` would be used instead of just `vec4<f32>`?
> Wouldn't this lead to worse validation errors, where something like `Vec { kind: Float, width: 4 }` would be used instead of just `vec4<f32>`?
I think for errors we can have a method that gives us the proper name. Also, considering we have other front-ends that don't use the same syntax (i.e. `vec4<f32>`) for types, we should avoid passing those through IR.
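For illustration only, a self-contained sketch of such a helper; the simplified `SimpleType`/`Scalar` enums are hypothetical stand-ins for the relevant corner of the IR, not Naga's actual types:

```rust
/// Hypothetical, simplified stand-ins for the IR types involved.
enum Scalar {
    F32,
    I32,
    U32,
    Bool,
}

enum SimpleType {
    Scalar(Scalar),
    Vector { size: u8, scalar: Scalar },
}

/// Render a WGSL-style name for diagnostics instead of storing a `name`
/// on every built-in type in the IR.
fn diagnostic_name(ty: &SimpleType) -> String {
    let scalar_name = |s: &Scalar| match s {
        Scalar::F32 => "f32",
        Scalar::I32 => "i32",
        Scalar::U32 => "u32",
        Scalar::Bool => "bool",
    };
    match ty {
        SimpleType::Scalar(s) => scalar_name(s).to_owned(),
        SimpleType::Vector { size, scalar } => {
            format!("vec{}<{}>", size, scalar_name(scalar))
        }
    }
}
```

With something along these lines, a four-wide float vector would be reported as `vec4<f32>` in WGSL errors without that string ever living in the IR.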
> I think for errors we can have a method that gives us the proper name. Also, considering we have other front-ends that don't use the same syntax (i.e. `vec4<f32>`) for types, we should avoid passing those through IR.
Done!
Thanks! There seem to be a few more places where we still set the name:
- 5 in `Lowerer.construct`
- 1 in `ExpressionContext.resolve_type`
@teoxoy I have done a few things:
- Added `Typifier::get_handle` and used that in `ExpressionContext`. The GLSL frontend has not been edited.
- Moved `ensure_type_exists` to `OutputContext`, and used that everywhere.
- Created a sort of workaround for two-stage borrows to avoid the need to clone `TypeInner` - this does lead to code duplication, so let me know if you can think of a better solution.
> Added `Typifier::get_handle` and used that in `ExpressionContext`. The GLSL frontend has not been edited.
Feel free to change the GLSL frontend as well.
> Moved `ensure_type_exists` to `OutputContext`, and used that everywhere.
We can make use of `ensure_type_exists` in a few more places:
- 2 in `ExpressionContext.create_zero_value_constant`
- 1 in `Lowerer.construct`
> Created a sort of workaround for two-stage borrows to avoid the need to clone `TypeInner` - this does lead to code duplication, so let me know if you can think of a better solution.
Can't think of doing it any other way right now since we match on those `TypeInner`s, but will keep it in mind.
Thanks for the updates!
> The only difference in output is that I don't initialize `LocalVariable`s with constants since I'm not too sure if that added complexity is actually worth it. I can add it back if it's something wanted, though.
from https://github.com/gfx-rs/naga/pull/2075#issuecomment-1304453265
I also noticed this by looking at the test snapshots. I think it should be fine for now considering we'll have to deal with const expressions differently soon.
All changes done.
Updated perf numbers: 10-15% slower (which is quite the improvement from my last measurement of 30%). Most likely attributed to getting rid of those `TypeInner.clone()`s and not creating `String`s for all `Type.name`s.
Just a small note: it doesn't seem like GitHub will automatically close all the 10 referenced issues, so they're probably gonna have to be closed manually.
I'll do one last round of review tomorrow and take a look at all those issues as well. Will probably edit the top comment so that they will all get automatically closed.
Considering the size and scope of the PR, we have a lot to go through. Thanks for being so receptive!
Great! Thanks for taking the time to review!
Considering we are fixing #2105 with this PR, it would be great to add some tests for type aliases.
> Considering we are fixing #2105 with this PR, it would be great to add some tests for type aliases.
Is there any specific place I should add them? Or should I create a new snapshot test file?