ruff icon indicating copy to clipboard operation
ruff copied to clipboard

Refactor the `ExprDict` node

Open AlexWaygood opened this issue 1 year ago • 1 comments
trafficstars

Summary

This PR refactors the ExprDict node in crates/ruff_python_ast/nodes.rs from this:

struct ExprDict {
    keys: Vec<Option<Expr>>,
    values: Vec<Expr>,
}

to this:

struct DictItem {
    key: Option<Expr>,
    value: Expr,
}

struct ExprDict {
    items: Vec<DictItem>,
}

This has several advantages:

  • In most cases where we're iterating over the elts for an ExprDict node, we zip the keys and values together to iterate over them both simultaneously. With the new representation of the AST structure, using zip is no longer necessary; the key and value are naturally paired together, and iteration over the two together can be done in a much more elegant way.
  • All of our rules that do currently iterate over the keys and values zipped together implicitly rely on the invariant that the keys and values vectors on an ExprDict node will be of exactly the same length. This refactoring encodes this invariant into the type itself, making it explicit.
  • The size of an ExprDict node is reduced from 56 to 32 bytes

In addition to the above refactoring, I added two convenience methods to ExprDict so that we can still easily iterate over and index into the keys and values of a dictionary: keys() and values(). Both return read-only "view" objects that allow you to iterate over -- and also index directly into -- an ExprDict node's keys or values. Some of these implementations might be slightly over the top right now. Let me know what you think!

Test Plan

cargo test / `cargo insta review

AlexWaygood avatar May 03 '24 17:05 AlexWaygood

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

github-actions[bot] avatar May 03 '24 18:05 github-actions[bot]

I really like the change. Having both key and value in a single item makes the API way more convenient to use. Thanks for doing this refactor.

MichaReiser avatar May 06 '24 07:05 MichaReiser

The evaluation order at runtime is key, value, key, value, ... in source order, matching the new order used in this PR.

This is perhaps simplest to observe by de-compiling some Python code using dis:

>>> import dis
>>> def f():
...     return {a: b, c: d}
...
>>> dis.dis(f)
  1           RESUME                   0

  2           LOAD_GLOBAL              0 (a)
              LOAD_GLOBAL              2 (b)
              LOAD_GLOBAL              4 (c)
              LOAD_GLOBAL              6 (d)
              BUILD_MAP                2
              RETURN_VALUE

carljm avatar May 06 '24 13:05 carljm

The evaluation order at runtime is key, value, key, value, ... in source order, matching the new order used in this PR.

This is perhaps simplest to observe by de-compiling some Python code using dis:

But unfortunately, it appears that the evaluation order for mapping patterns in case statements may be different, so I might have to tweak some of the changes I just pushed in https://github.com/astral-sh/ruff/pull/11267/commits/3ce0f6a96f0dd2a1c3d8cb8a452a4d5b7c5ad08a:

>>> def f():
...     match config:
...         case {'a': 'b', 'c': 'd'}:
...             pass
... 
>>> dis.dis(f)
  1           0 RESUME                   0

  2           2 LOAD_GLOBAL              0 (config)

  3          12 MATCH_MAPPING
             14 POP_JUMP_IF_FALSE       24 (to 64)
             16 GET_LEN
             18 LOAD_CONST               1 (2)
             20 COMPARE_OP              92 (>=)
             24 POP_JUMP_IF_FALSE       19 (to 64)
             26 LOAD_CONST               4 (('a', 'c'))
             28 MATCH_KEYS
             30 COPY                     1
             32 POP_JUMP_IF_NONE        13 (to 60)
             34 UNPACK_SEQUENCE          2
             38 LOAD_CONST               2 ('b')
             40 COMPARE_OP              40 (==)
             44 POP_JUMP_IF_FALSE        7 (to 60)
             46 LOAD_CONST               3 ('d')
             48 COMPARE_OP              40 (==)
             52 POP_JUMP_IF_FALSE        4 (to 62)
             54 POP_TOP
             56 POP_TOP

  4          58 RETURN_CONST             0 (None)

  3     >>   60 POP_TOP
        >>   62 POP_TOP
        >>   64 POP_TOP
             66 RETURN_CONST             0 (None)

(Side note: wow that's a lot of bytecode instructions!)

I'm not entirely sure if there's a situation where it would matter for a mapping pattern, but perhaps we should err on the side of caution here and match the Python runtime as far as possible anyway.

AlexWaygood avatar May 06 '24 14:05 AlexWaygood

I've taken the changes to the pattern-matching nodes back out again, as @dhruvmanila pointed out to me that they might conflict with some changes around that node being planned in #10570 (thanks!).

I wonder if you thought about an alternative design which would utilize an enum like:

pub enum DictItem {
	KeyValue(KeyValueItem),
	DoubleStarred(Expr), // or Expanded, Spread, ...
}

I have a working prototype of this locally, and I think it's possibly an improvement. But I'm also happy to leave it for now.

The PR should be ready for re-review!

AlexWaygood avatar May 06 '24 16:05 AlexWaygood

I've taken the changes to the pattern-matching nodes back out again, as @dhruvmanila pointed out to me that they might conflict with some changes around that node being planned in https://github.com/astral-sh/ruff/issues/10570 (thanks!).

Do we know how the pattern layout will change? Are the changes compatible with the changes we made here or will we end up with a large inconsistency between the two nodes?

MichaReiser avatar May 07 '24 06:05 MichaReiser

Do we know how the pattern layout will change? Are the changes compatible with the changes we made here or will we end up with a large inconsistency between the two nodes?

Not sure. I'll wait for @dhruvmanila to chime in before merging (I know he's out for much of today)

AlexWaygood avatar May 07 '24 06:05 AlexWaygood

Thanks both!

AlexWaygood avatar May 07 '24 11:05 AlexWaygood