ruff
ruff copied to clipboard
Refactor the `ExprDict` node
Summary
This PR refactors the ExprDict node in crates/ruff_python_ast/nodes.rs from this:
struct ExprDict {
keys: Vec<Option<Expr>>,
values: Vec<Expr>,
}
to this:
struct DictItem {
key: Option<Expr>,
value: Expr,
}
struct ExprDict {
items: Vec<DictItem>,
}
This has several advantages:
- In most cases where we're iterating over the elts for an
ExprDictnode, wezipthe keys and values together to iterate over them both simultaneously. With the new representation of the AST structure, usingzipis no longer necessary; the key and value are naturally paired together, and iteration over the two together can be done in a much more elegant way. - All of our rules that do currently iterate over the keys and values zipped together implicitly rely on the invariant that the
keysandvaluesvectors on anExprDictnode will be of exactly the same length. This refactoring encodes this invariant into the type itself, making it explicit. - The size of an
ExprDictnode is reduced from 56 to 32 bytes
In addition to the above refactoring, I added two convenience methods to ExprDict so that we can still easily iterate over and index into the keys and values of a dictionary: keys() and values(). Both return read-only "view" objects that allow you to iterate over -- and also index directly into -- an ExprDict node's keys or values. Some of these implementations might be slightly over the top right now. Let me know what you think!
Test Plan
cargo test / `cargo insta review
ruff-ecosystem results
Linter (stable)
✅ ecosystem check detected no linter changes.
Linter (preview)
✅ ecosystem check detected no linter changes.
Formatter (stable)
✅ ecosystem check detected no format changes.
Formatter (preview)
✅ ecosystem check detected no format changes.
I really like the change. Having both key and value in a single item makes the API way more convenient to use. Thanks for doing this refactor.
The evaluation order at runtime is key, value, key, value, ... in source order, matching the new order used in this PR.
This is perhaps simplest to observe by de-compiling some Python code using dis:
>>> import dis
>>> def f():
... return {a: b, c: d}
...
>>> dis.dis(f)
1 RESUME 0
2 LOAD_GLOBAL 0 (a)
LOAD_GLOBAL 2 (b)
LOAD_GLOBAL 4 (c)
LOAD_GLOBAL 6 (d)
BUILD_MAP 2
RETURN_VALUE
The evaluation order at runtime is
key, value, key, value, ...in source order, matching the new order used in this PR.This is perhaps simplest to observe by de-compiling some Python code using
dis:
But unfortunately, it appears that the evaluation order for mapping patterns in case statements may be different, so I might have to tweak some of the changes I just pushed in https://github.com/astral-sh/ruff/pull/11267/commits/3ce0f6a96f0dd2a1c3d8cb8a452a4d5b7c5ad08a:
>>> def f():
... match config:
... case {'a': 'b', 'c': 'd'}:
... pass
...
>>> dis.dis(f)
1 0 RESUME 0
2 2 LOAD_GLOBAL 0 (config)
3 12 MATCH_MAPPING
14 POP_JUMP_IF_FALSE 24 (to 64)
16 GET_LEN
18 LOAD_CONST 1 (2)
20 COMPARE_OP 92 (>=)
24 POP_JUMP_IF_FALSE 19 (to 64)
26 LOAD_CONST 4 (('a', 'c'))
28 MATCH_KEYS
30 COPY 1
32 POP_JUMP_IF_NONE 13 (to 60)
34 UNPACK_SEQUENCE 2
38 LOAD_CONST 2 ('b')
40 COMPARE_OP 40 (==)
44 POP_JUMP_IF_FALSE 7 (to 60)
46 LOAD_CONST 3 ('d')
48 COMPARE_OP 40 (==)
52 POP_JUMP_IF_FALSE 4 (to 62)
54 POP_TOP
56 POP_TOP
4 58 RETURN_CONST 0 (None)
3 >> 60 POP_TOP
>> 62 POP_TOP
>> 64 POP_TOP
66 RETURN_CONST 0 (None)
(Side note: wow that's a lot of bytecode instructions!)
I'm not entirely sure if there's a situation where it would matter for a mapping pattern, but perhaps we should err on the side of caution here and match the Python runtime as far as possible anyway.
I've taken the changes to the pattern-matching nodes back out again, as @dhruvmanila pointed out to me that they might conflict with some changes around that node being planned in #10570 (thanks!).
I wonder if you thought about an alternative design which would utilize an enum like:
pub enum DictItem { KeyValue(KeyValueItem), DoubleStarred(Expr), // or Expanded, Spread, ... }
I have a working prototype of this locally, and I think it's possibly an improvement. But I'm also happy to leave it for now.
The PR should be ready for re-review!
I've taken the changes to the pattern-matching nodes back out again, as @dhruvmanila pointed out to me that they might conflict with some changes around that node being planned in https://github.com/astral-sh/ruff/issues/10570 (thanks!).
Do we know how the pattern layout will change? Are the changes compatible with the changes we made here or will we end up with a large inconsistency between the two nodes?
Do we know how the pattern layout will change? Are the changes compatible with the changes we made here or will we end up with a large inconsistency between the two nodes?
Not sure. I'll wait for @dhruvmanila to chime in before merging (I know he's out for much of today)
Thanks both!