Lukasz Stafiniak issues

Results: 191 issues of Lukasz Stafiniak

Rename routine / kernel parameters from `param` / `params` to `kparam` / `kparams` across the codebase

To avoid confusion with `Tensor.params`.

Implement common subexpression elimination after inlining

Inlining often leads to redundant repeated computations, sometimes lots of them, making common subexpression elimination much more valuable than it would be on its own.
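To illustrate why inlining makes this pass pay off, here is a minimal sketch of common subexpression elimination on a toy expression type. The `expr` type, the `cse` function and the `t0`, `t1` temporaries are all hypothetical, made up for this example; they are not OCANNL's actual intermediate representation.

```ocaml
(* Toy expression IR; hypothetical, not OCANNL's real low-level IR. *)
type expr =
  | Var of string
  | Const of float
  | Add of expr * expr
  | Mul of expr * expr

(* A binding names the result of one operation. *)
type binding = string * expr

(* Bottom-up CSE: every operation whose operands are already shared
   is looked up in a table, so structurally equal subexpressions
   end up bound to the same temporary. *)
let cse (e : expr) : binding list * expr =
  let table : (expr, string) Hashtbl.t = Hashtbl.create 16 in
  let bindings = ref [] in
  let counter = ref 0 in
  let rec go e =
    match e with
    | Var _ | Const _ -> e
    | Add (a, b) ->
        let a = go a in
        let b = go b in
        share (Add (a, b))
    | Mul (a, b) ->
        let a = go a in
        let b = go b in
        share (Mul (a, b))
  and share e =
    match Hashtbl.find_opt table e with
    | Some name -> Var name
    | None ->
        let name = Printf.sprintf "t%d" !counter in
        incr counter;
        Hashtbl.add table e name;
        bindings := (name, e) :: !bindings;
        Var name
  in
  let result = go e in
  (List.rev !bindings, result)
```

For `x * y + x * y`, which is the kind of duplication inlining tends to produce, the sketch emits a single binding for the product and reuses it in the sum.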

Implement loop hoisting (loop-invariant code motion), prior to visit counting (the semi-concrete interpretation run)

This will allow more tensor nodes to become virtual.
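As a rough picture of the transformation itself, the sketch below hoists loop-invariant assignments out of loops in a toy IR. The types and function names are hypothetical and not taken from OCANNL. It assumes every variable is assigned at most once, right-hand sides are pure, and hoisted temporaries are not read after a loop that runs zero times.

```ocaml
(* Toy IR; hypothetical, not OCANNL's real low-level IR. *)
type expr =
  | Var of string
  | Const of float
  | Add of expr * expr
  | Mul of expr * expr

type stmt =
  | Assign of string * expr
  | For of { index : string; count : int; body : stmt list }

let rec free_vars = function
  | Var v -> [ v ]
  | Const _ -> []
  | Add (a, b) | Mul (a, b) -> free_vars a @ free_vars b

(* Names written inside a statement list, including nested loop indices. *)
let rec defined stmts =
  List.concat_map
    (function
      | Assign (v, _) -> [ v ]
      | For { index; body; _ } -> index :: defined body)
    stmts

(* Repeatedly move out assignments that read neither the loop index
   nor anything still assigned inside the loop. *)
let rec hoist_loop index count body =
  let variant = index :: defined body in
  let is_invariant = function
    | Assign (_, e) ->
        not (List.exists (fun v -> List.mem v variant) (free_vars e))
    | For _ -> false
  in
  let hoisted, kept = List.partition is_invariant body in
  if hoisted = [] then [ For { index; count; body } ]
  else hoisted @ hoist_loop index count kept

(* Process inner loops first so their hoisted code can move further out. *)
let rec hoist_stmt = function
  | Assign _ as s -> [ s ]
  | For { index; count; body } ->
      hoist_loop index count (List.concat_map hoist_stmt body)
```

In a loop over `i` containing `t = a * b` followed by `u = t + i`, only `t` is moved in front of the loop; `u` stays because it reads the index.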

Implement a Universal Pool Allocator across backends

The Metal backend forces us to do this, but it's good for us! `buffer_ptr` becomes `buffer_offset`. With this significant refactoring, we can also decide to rename things to maybe something...
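The move from pointers to offsets can be pictured with a deliberately simple bump allocator over a single pool. This is only a sketch of the idea, not the planned design: the `pool` record, `align_up` and `alloc` are hypothetical names, and a real pool allocator would also need to free and reuse regions.

```ocaml
(* Hypothetical pool descriptor: one large backing buffer of [capacity]
   bytes, from which allocations are handed out as byte offsets. *)
type pool = { capacity : int; mutable next_free : int }

(* Round [n] up to the nearest multiple of [alignment] (alignment > 0). *)
let align_up n alignment = (n + alignment - 1) / alignment * alignment

(* Return the offset of a fresh region inside the pool,
   or [None] when the pool is exhausted. *)
let alloc pool ~size_in_bytes ~alignment : int option =
  let offset = align_up pool.next_free alignment in
  if offset + size_in_bytes > pool.capacity then None
  else (
    pool.next_free <- offset + size_in_bytes;
    Some offset)
```

A tensor node would then store the returned offset where it previously stored a backend-specific pointer, and the backend resolves it against the pool's base buffer.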

Simplify translations from the `%cd` syntax

At least, remove redundant wrapping and unwrapping with `Some`. This aims at readability of the translations, should anyone check them out for educational or debugging purposes.

Resolve non-determinism of multicore_cc and restore it as the primary testing target

I'm postponing working on this. There's non-determinism even when using a single stream with multicore_cc, but only in the test/training/bigram.ml example -- and non-determinism in moons_demo_parallel but that's more opportunities...

bug

Track whether the variable generated for Local_scope needs initialization (i.e. if it's used in its assignment value computation)

Add a field to Local_scope to track this and populate it from the `recurrent` field of traced_array IIRC. So this should be easy.

Deep dive into the Megakernel approach since it's well aligned with OCANNL design

They implement an interpreter on the GPU; maybe we can avoid that yet still use their solutions for within-kernel synchronization. Or maybe we can go the interpreter route, to be...

explore

important

Use private mode in Metal except when tnode mode is hosted-sharing-cross-streams

enhancement

The MCP frontend to the TUI should print the synthetic entry IDs and MCP documentation should underscore combining commands

The MCP doc should underscore that "goto ; expand" is more efficient than commands emulating the arrow and enter keys.