Implement module-level scoping
Try 2 :)
Fixes #2072 and #1745.
This PR separates parsing from lowering, by generating an AST, which desugars as much as possible down to something like Naga IR.
The AST is then used to resolve identifiers while lowering to Naga IR.
@jimblandy Would be great if you could take a look at the AST definition!
Also, you were talking about how 2 passes would be all that was necessary. However, lowering to Naga IR requires that the dependencies of each module-level declaration are known beforehand, as we need to (reverse) topologically sort the declaration list so that all dependencies are processed before their dependents.
The only way I can think of doing this in two passes is to maintain a set of the raw identifiers referenced by each declaration during parsing, and then using those to sort the declarations, but I'm not quite sure if this will actually be faster than resolving on the same IR, and then sorting.
Here's how I was thinking it could work (I haven't actually tried this yet):
- For the first pass, we parse mostly as we do now, except that identifiers in expressions, types, statements, etc. are represented by placeholders that just store the name. Except for the placeholders, we try to produce something as close to the `crate` IR as possible immediately. The placeholders mean that we will have types that look very much like `crate::Expression`, etc., except that certain variants that assume knowledge of definitions (`LocalVariable`, obviously, but also things like `AccessIndex`, since we don't know struct type definitions) hold identifiers instead of handles. Some cases will need to change in more profound ways; for example, we can't tell whether `S(x)` is a `Call` or a `Compose` until we know the definition of `S`. But as much as possible, the unresolved types should resemble the `crate` types, so that people familiar with the rest of Naga can make assumptions. (A sketch of such a placeholder type follows this list.)
- For the resolution pass, starting with the entry points, start building true `crate::{Function, EntryPoint, Type, ...}` values. When we hit a placeholder, check to see if we already have a previously-resolved definition for that name. If so, use it. Otherwise, immediately recurse on resolving that name (which adds the resulting fully-resolved definition to the `Module` under construction), and then resume processing the placeholder, whose definition we now know. When we're done, add our completed definition to the `Module`.
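For concreteness, here is a minimal, hypothetical sketch of what such a placeholder-bearing expression type could look like; the names (`UnresolvedExpression` and its variants) are made up for illustration and are not Naga's actual types:

```rust
/// Illustrative only: mirrors the shape of `crate::Expression`, but holds raw
/// identifiers wherever the real IR would hold handles.
enum UnresolvedExpression {
    /// A reference to something we cannot resolve yet: a global, a constant,
    /// a function, or a local - stored as the source name.
    Ident(String),
    /// `expr.field` - we cannot turn `field` into an `AccessIndex` until the
    /// struct type definition is known, so keep the member name.
    AccessMember {
        base: Box<UnresolvedExpression>,
        member: String,
    },
    /// `S(a, b, ...)` - could lower to either a `Call` or a `Compose`; we only
    /// find out once we know whether `S` names a function or a type.
    CallOrConstruct {
        callee: String,
        arguments: Vec<UnresolvedExpression>,
    },
    /// Variants that never mention other declarations can match the real IR
    /// closely.
    Literal(f64),
}
```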
That resolution pass amounts to a depth-first traversal of the DAG of definitions, where we "mark nodes as visited" by adding resolved definitions to the `Module`. This ensures that definitions appear in the `Module` in dependency order, in the process of replacing placeholders with `Handle`s.
After we've processed the entry points, we may want to just walk the top-level unresolved definitions and finish off anything unvisited. Such definitions could be safely dropped from the module, since they're unreachable from any entry point (and we should probably produce warnings about this), but it's very useful for testing to have everything in the input appear in the Module. Maybe this step would be optional.
Naturally, we will need to check for cyclic references at the point of recursion.
As I say, I haven't tried this, so there may be aspects of the problem I've missed - but it seems like something along these lines ought to be possible somehow.
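For illustration, a rough sketch of that resolution pass; every name here (`UnresolvedDecl`, `ResolvedDecl`, `resolve`) is hypothetical, and the real pass would build `crate::Module` contents rather than a plain `Vec`:

```rust
use std::collections::HashMap;

/// Hypothetical: a parsed-but-unresolved module-level declaration.
struct UnresolvedDecl {
    /// Names of other declarations this one mentions.
    dependencies: Vec<String>,
}

/// Hypothetical stand-in for an entry in the finished module.
struct ResolvedDecl {
    name: String,
}

enum Visit {
    /// Currently on the resolution stack: seeing it again means a cycle.
    InProgress,
    /// Already added to the module; reuse its position.
    Done(usize),
}

fn resolve(
    name: &str,
    unresolved: &HashMap<String, UnresolvedDecl>,
    module: &mut Vec<ResolvedDecl>,
    state: &mut HashMap<String, Visit>,
) -> Result<usize, String> {
    match state.get(name) {
        Some(Visit::Done(index)) => return Ok(*index),
        Some(Visit::InProgress) => {
            return Err(format!("cyclic reference through `{name}`"));
        }
        None => {}
    }
    state.insert(name.to_owned(), Visit::InProgress);

    let decl = unresolved
        .get(name)
        .ok_or_else(|| format!("unknown identifier `{name}`"))?;
    // Depth-first: finish every dependency before emitting this declaration,
    // so the module ends up in dependency order.
    for dep in &decl.dependencies {
        resolve(dep, unresolved, module, state)?;
    }

    let index = module.len();
    module.push(ResolvedDecl { name: name.to_owned() });
    state.insert(name.to_owned(), Visit::Done(index));
    Ok(index)
}
```

Reaching a name that is still `InProgress` is the "check for cyclic references at the point of recursion" step, and pushing into `module` only after all dependencies are done is what produces the dependency ordering described above.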
You should use Arena, UniqueArena, and Handle for the unresolved representation. That is one of the tricks Naga uses for performance: broadly speaking, caches and prefetchers are much happier with arrays than heap allocation, especially if you can phrase some computation as an iteration over the array.
Otherwise, try to make the change as undisruptive and limited as you can while still preserving legibility.
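For readers unfamiliar with the pattern being recommended, here is a hand-rolled sketch of the idea (not Naga's actual `Arena`/`UniqueArena`/`Handle` API): cross-references are small indices into contiguous storage, and many passes become a flat loop over that storage instead of a recursive tree walk.

```rust
/// Illustrative stand-in for `Handle<T>`: a typed index into an arena.
#[derive(Clone, Copy)]
struct Handle(u32);

/// Illustrative stand-in for `Arena<Expr>`: contiguous, cache-friendly storage.
#[derive(Default)]
struct Arena {
    items: Vec<Expr>,
}

enum Expr {
    Literal(f64),
    /// References other expressions by index, not by `Box`.
    Add(Handle, Handle),
    /// An unresolved identifier kept as a name during the first pass.
    Ident(String),
}

impl Arena {
    fn append(&mut self, expr: Expr) -> Handle {
        let handle = Handle(self.items.len() as u32);
        self.items.push(expr);
        handle
    }
}

/// Collecting every identifier a function mentions becomes a flat iteration
/// over the arena - no recursion, no pointer chasing.
fn referenced_names(arena: &Arena) -> Vec<&str> {
    arena
        .items
        .iter()
        .filter_map(|expr| match expr {
            Expr::Ident(name) => Some(name.as_str()),
            _ => None,
        })
        .collect()
}
```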
Recursion was something I thought of, but processing each declaration does use quite a large amount of stack, so we might end up overflowing the stack quite quickly, especially on Windows, where the default stack size is 2 MB.
We can probably use stacker, which is what rustc uses.
About the warnings, I do plan to add error recovery to the frontend so we can output more than one error, and will add warnings for unused items there (I did have this implemented in my other frontend, it isn't too big of a change).
Using arenas instead of trees for expressions should reduce the stack depth, though, because you can just iterate over the arena directly. And with a little care we could avoid recursing on statements at all - I think only Call and Atomic need resolution. So the depth would only be determined by the number of references we're in the midst of resolving, and not so much by the interior structure of the functions themselves.
stacker sounds 1) amazing and 2) wildly unsafe and definitely not something we want to ship in a browser. So that's one easy solution out the window. I think stack depth is a legitimate concern here.
I think keeping a set of (textual) dependencies used by each declaration during parsing, and then using that to post-order depth-first sort would be the easiest solution, and it shouldn't really impact performance too much.
The post-order sorting with cycle detection does require recursion; however, it should require much less stack space than resolving the functions.
I have also implemented parsing and finalized the AST structures; it would be great if you could take a look and tell me if any changes are needed :) (I have removed parts of the code that aren't necessary for parsing and will modify and re-add them later.)
> I think keeping a set of (textual) dependencies used by each declaration during parsing, and then using that to post-order depth-first sort would be the easiest solution, and it shouldn't really impact performance too much.
If I understand what you mean here, this seems reasonable, too: basically, just treat every identifier use as an edge in a DAG constructed during parsing. It is definitely preferable to avoid the recursion. I'll take a look soon.
I did have to keep track of locals during the initial parse phase, to fix an issue with code like this:
```wgsl
fn a() {
    let a = b();
}

fn b() -> i32 {
    let a = 20;
    return a;
}
```
If we just kept a set of textual dependencies, `fn a` being shadowed by `let a` would create a spurious cyclic dependency error (inside `fn b`, the use of the local `a` would look like a dependency back on `fn a`, closing the cycle `a` → `b` → `a`).
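A hedged sketch of that bookkeeping, with hypothetical names: while collecting a declaration's textual dependencies, keep a stack of scopes for the locals declared so far, and only record identifiers that no enclosing scope declares.

```rust
use std::collections::HashSet;

/// Hypothetical collector used while parsing one declaration's body.
#[derive(Default)]
struct DependencyCollector {
    /// One set of local names per nested block, innermost last.
    scopes: Vec<HashSet<String>>,
    /// Module-level names this declaration actually depends on.
    dependencies: HashSet<String>,
}

impl DependencyCollector {
    fn enter_block(&mut self) {
        self.scopes.push(HashSet::new());
    }

    fn exit_block(&mut self) {
        self.scopes.pop();
    }

    /// Called for `let` / `var` statements, function parameters, etc.
    fn declare_local(&mut self, name: &str) {
        if let Some(scope) = self.scopes.last_mut() {
            scope.insert(name.to_owned());
        }
    }

    /// Called for every identifier used in an expression. Only names that are
    /// not shadowed by a local become edges in the declaration graph, so the
    /// `let a` inside `fn b` above does not make `b` depend on `fn a`.
    fn use_ident(&mut self, name: &str) {
        let is_local = self.scopes.iter().any(|scope| scope.contains(name));
        if !is_local {
            self.dependencies.insert(name.to_owned());
        }
    }
}
```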
I'm curious if the AST generated here is publicly accessible. I do have a case where I need to look at the shader's AST to figure out how to set up bind groups.
> I'm curious if the AST generated here is publicly accessible. I do have a case where I need to look at the shader's AST to figure out how to set up bind groups.
It's not public at the moment, but I can make it so.
However, can you not just use the `Module` that the frontend produces instead? It will probably have more information than the semi-AST IR used here.
@jimblandy This has reached feature parity with the existing frontend, and passes all the functionality-based tests. The diagnostics snapshots don't pass yet, but fixing those shouldn't change too much. You can start reviewing now.
Surprisingly, performance has actually improved. The current frontend consistently takes around 3.6 ms for the front/wgsl benchmark, while this does it in around 2.9 ms (a 20% decrease).
The only difference in output is that I don't initialize `LocalVariable`s with constants since I'm not too sure if that added complexity is actually worth it. I can add it back if it's something wanted, though.
Hey @SparkyPotato, thanks for this PR!
There seem to be quite a lot of copy-paste "changes" (code being moved around) that make it difficult to know what actually changed. Would it be possible for you to undo as many code moves as possible to make it easier to review this?
Right now, we are looking at having to review 11943 LOC (without test snapshots) which I don't consider particularly "reviewable".
@teoxoy I've moved as much as I think possible back (parsing + errors).
construction.rs hasn't been moved because I have made quite a few changes to it, and it also needs to be in the lower module for privacy reasons.
This brings the overall diff of src/front/wgsl to +5038 -3348.
Do tell me if there's anything else I need to do to make reviewing easier!
Thanks for reducing the diff, but it still looks like quite a lot of changes which will make browsing the history of the files challenging.
Let's move the code in lower/mod.rs to mod.rs and lower/construction.rs to construction.rs (with existing code where it previously was).
After this, the changes should be reviewable and after the review we can see how to proceed with new files/folders (and moving code around).
Got it down to +4495 -2811, not sure if anything more is possible.
Only user-defined structs should have a `crate::Type.name` (some of the backends now generate extra type aliases for built-in types).
Wouldn't this lead to worse validation errors, where something like `Vec { kind: Float, width: 4 }` would be used instead of just `vec4<f32>`?
> Wouldn't this lead to worse validation errors, where something like `Vec { kind: Float, width: 4 }` would be used instead of just `vec4<f32>`?
I think for errors we can have a method that gives us the proper name. Also, considering we have other front-ends that don't use the same syntax (i.e. `vec4<f32>`) for types, we should avoid passing those through IR.
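For illustration only, a self-contained sketch of such a helper; the simplified `SimpleType`/`Scalar` enums are hypothetical stand-ins for the relevant corner of the IR, not Naga's actual types:

```rust
/// Hypothetical, simplified stand-ins for the IR types involved.
enum Scalar {
    F32,
    I32,
    U32,
    Bool,
}

enum SimpleType {
    Scalar(Scalar),
    Vector { size: u8, scalar: Scalar },
}

/// Render a WGSL-style name for diagnostics instead of storing a `name`
/// on every built-in type in the IR.
fn diagnostic_name(ty: &SimpleType) -> String {
    let scalar_name = |s: &Scalar| match s {
        Scalar::F32 => "f32",
        Scalar::I32 => "i32",
        Scalar::U32 => "u32",
        Scalar::Bool => "bool",
    };
    match ty {
        SimpleType::Scalar(s) => scalar_name(s).to_owned(),
        SimpleType::Vector { size, scalar } => {
            format!("vec{}<{}>", size, scalar_name(scalar))
        }
    }
}
```

With something along these lines, a four-wide float vector would be reported as `vec4<f32>` in WGSL errors without that string ever living in the IR.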
> I think for errors we can have a method that gives us the proper name. Also, considering we have other front-ends that don't use the same syntax (i.e. `vec4<f32>`) for types, we should avoid passing those through IR.
Done!
Thanks! There seem to be a few more places where we still set the name:
- 5 in `Lowerer.construct`
- 1 in `ExpressionContext.resolve_type`
@teoxoy I have done a few things:
- Added `Typifier::get_handle` and used that in `ExpressionContext`. The GLSL frontend has not been edited.
- Moved `ensure_type_exists` to `OutputContext`, and used that everywhere.
- Created a sort of workaround for two-stage borrows to avoid the need to clone `TypeInner` - this does lead to code duplication, so let me know if you can think of a better solution.
> Added `Typifier::get_handle` and used that in `ExpressionContext`. The GLSL frontend has not been edited.
Feel free to change the GLSL frontend as well.
> Moved `ensure_type_exists` to `OutputContext`, and used that everywhere.
We can make use of `ensure_type_exists` in a few more places:
- 2 in `ExpressionContext.create_zero_value_constant`
- 1 in `Lowerer.construct`
> Created a sort of workaround for two-stage borrows to avoid the need to clone `TypeInner` - this does lead to code duplication, so let me know if you can think of a better solution.
Can't think of doing it any other way right now since we match on those `TypeInner`s, but will keep it in mind.
Thanks for the updates!
> The only difference in output is that I don't initialize `LocalVariable`s with constants since I'm not too sure if that added complexity is actually worth it. I can add it back if it's something wanted, though.
from https://github.com/gfx-rs/naga/pull/2075#issuecomment-1304453265
I also noticed this by looking at the test snapshots. I think it should be fine for now considering we'll have to deal with const expressions differently soon.
All changes done.
Updated perf numbers: 10-15% slower (which is quite the improvement from my last measurement of 30%). Most likely attributed to getting rid of those `TypeInner.clone()`s and not creating `String`s for all `Type.name`s.
Just a small note: it doesn't seem like GitHub will automatically close all the 10 referenced issues, so they're probably gonna have to be closed manually.
I'll do one last round of review tomorrow and take a look at all those issues as well. Will probably edit the top comment so that they will all get automatically closed.
Considering the size and scope of the PR, we have a lot to go through. Thanks for being so receptive!
Great! Thanks for taking the time to review!
Considering we are fixing #2105 with this PR, it would be great to add some tests for type aliases.
> Considering we are fixing #2105 with this PR, it would be great to add some tests for type aliases.
Is there any specific place I should add them? Or should I create a new snapshot test file?