Pattern matching API / DSL for rewrite rules
If we decide to apply simple pattern based rewrites, we'll need a proper matching API since C#'s is not very scalable.
An API like LLVM's would be nice and it's possible but it may come at a cost, depending on how it is implemented:
- Relying on generics may bloat the binary and runtime structures with useless generic instantiations
- Relying on plain interfaces may be slow because JIT sucks at devirtualizing even in trivial cases
It may be worth considering an extension-based approach instead, see https://www.reddit.com/r/csharp/comments/83ep10/comment/dvhdjss
Other:
- https://cfallin.org/blog/2023/01/20/cranelift-isle/
- https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/README.md The Go compiler has a nice DSL for rewrite rules, which we may also want to copy from :')
I love the go variant (e.g. https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/_gen/generic.rules)
What about making a source generator that generates one pass for each rules definition file?
We could also use Scheme, because it's much easier to parse and compile it also to a custom pass for each file
Yeah the go ruleset syntax looks pretty nice, if we have a bunch of rule files then source generators might just be the best way to go about it.
If the rulesets allow calling C# code directly, having a Match() style API would probably be redundant.
But we should also allow matching and replacing without a source generator maybe something like this:
block.Match("(Add (ConstInt x) (ConstInt y))") -> returns the matched instructions
block.MatchAndReplace("(Add (ConstInt [x]) (ConstInt [y])) -> (ConstInt [x + y])") -> replaces directly the instructions
I am going to implement this, if you agree
I had something like that in mind as well, I was thinking of maybe using string interpolation handlers + source interceptors to capture the values, something like: Match(inst, $"Add (Mul _, {out ConstInt x}), {out ConstInt y}") - although i'm not sure how this would work with replacements
(I actually had an working prototype using handlers this way but I don't think I have it anymore, and iirc the argument holes don't actually allow out qualifiers, the variables were taken by in and cast to refs, which might be less than ideal)
I would be happy to take any implementation for this feature though!
I'm not sure how your original proposal would return the values, but I quickly sketched something with the interpolators and it seems like they could work reasonably well.
I imagine the source generator could just capture the entire string at once to generate/intercept with an specialized matcher, so an actual runtime implementation would not be needed.
void Foo(Instruction inst)
{
Value? x = null, y = null;
if (Match(inst, $"add (ConstInt {x}, {y})")) {
}
}
public static bool Match(Value value, ValueMatchInterpolator pattern) { return false; }
[InterpolatedStringHandler, EditorBrowsable(EditorBrowsableState.Never)]
public ref struct ValueMatchInterpolator
{
public ValueMatchInterpolator(int literalLength, int formattedCount) { }
public void AppendLiteral(string value) { }
public void AppendFormatted<T>(in T value) {
Unsafe.AsRef(in value);
}
}
I guess even something like if (Match(inst, out ConstInt x, out Value y, "add ([x], [y])")) could work, but idk, that's just a few ideas.
https://sharplab.io/#v2:C4LgTgrgdgNAJiA1AHwAICYCMBYAUKgBgAJVMA6AJWmAEsBbAUzIGEB7OgBxoBsGwBlPgDcaAYwYBnANx48qAMxEwDUazBwiANQCG3CAwAUAShn5Fy1eqIBJKBOCRRtVlGNEQW3fuOmFSlWoabHbAtsAGNFDAREJeDEbunnoMeADeeESZJIqR0TrJRAC8MXGmAL6yZiToRMxE6bhZJJgAbCQALEQAstqRxhlZDU1NsWBEkfZFRFAMAO42IY7OriYDw7Uu9mEA/EQAHlNQENzcMEQAnofH3KbrmWvDNABmRAY9wKIAFhEhZwAkACJtHANAZUnsymdUucykYAUYEkM7k1SABOAx7Mj5fSrRrIrJogznLFxXHIip4zIUh5+UhtABGrFY3G62g+32xDBKyTOnPeXzCfA4zLZaiIHDZwD4UERRAeTXlWUVmVQAHYiA59FI5ZSiNTdQBtQVgYXcNkMOD8ByRADmAAltFA4LwwABdGnmBgvexLJL6fmfY2m0VgB5I9Z+PlsgVRIUi4BqH7Rbg0KVgXQAGQYUBtwE+Z1yRCeajokotbGosv1yNpnQAghwONm4BnU3xdAZSMRYskq8rsh0iA2m06AGIlstwAA8ABUAHw/Igz7k4nX4rKzT58LnLjyc/vh5EAVTs2ieTDrEgoXoMPdXxQMM6MBk5Rhm82CWyiBgArGS7hSTQUvqQA=
this seems to work. This is a great Idea. I will implement it
May it would be easier to discuss things if we connect on discord or similar
I'm also dubiousconst282 on discord