mesozoic-egg
MTLResource newBufferWithBytesNoCopy:length:options:deallocator: failed to create on mac os runner
### Description On the macOS runner, `newBufferWithBytesNoCopy:length:options:deallocator:` creates an instance with length zero, and accessing its elements causes a segmentation fault ```c++ id device_buffer = [device newBufferWithBytesNoCopy:ns_mutable_ptr length:length options:0 deallocator:nil ];...
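For context, Apple's documentation for this method requires the pointer to be page-aligned and the length to be a multiple of the page size. The issue text does not say whether that is what goes wrong on the runner, so the following is only a sketch of how a host allocation satisfying those two constraints could be prepared before the Metal call. The helper names `round_up_to_page` and `alloc_page_aligned` are hypothetical and do not come from the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unistd.h>

// Hypothetical helper: round `length` up to the next multiple of `page`.
std::size_t round_up_to_page(std::size_t length, std::size_t page) {
    assert(page > 0);
    return ((length + page - 1) / page) * page;
}

// Hypothetical helper: allocate a page-aligned region whose size is a whole
// number of pages. Returns nullptr on failure; release the memory with free().
// `padded_length` receives the size that was actually allocated.
void* alloc_page_aligned(std::size_t length, std::size_t* padded_length) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    *padded_length = round_up_to_page(length, page);
    void* ptr = nullptr;
    if (posix_memalign(&ptr, page, *padded_length) != 0) {
        return nullptr;
    }
    return ptr;
}
```

Under these assumptions, the returned pointer and `padded_length` (rather than the original `length`) would be the values passed as the pointer and `length:` arguments of the call quoted above.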
FSDP
FSDP works out of the box (mostly) with the MultiLazyBuffer code, ~~with a minor tweak to make optimizer shard on the correct axis.~~ ~~See `beautiful_mnist_multigpu.py` for the example. It shards...
Just demonstrating the idea with hard-coded values: 1. the kernel splits an AST and allocates a buffer 2. get_runners reinitializes a kernel for each split AST, and inserts the buffer accordingly when...