Albert Zeyer
Why is this still open?
> don't use sequence tags at all for initialization / sequence order

This doesn't really work for `MetaDataset` because we sometimes (even often?) have datasets where the underlying seq order...
> I ended up splitting the dataset up into smaller chunks and loading them with DistributeFilesDataset

This is just a workaround, not really a solution to the problem itself, i.e....
I also have this issue now:
```
$ py-spy dump -p 16520
Process 16520: /home/az668407/work/py-envs/py3.12-torch2.5/bin/python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=8, pipe_handle=82) --multiprocessing-fork
Python v3.12.8 (/usr/bin/python3.12)

Thread 16520 (active): "MainThread"...
```
What specific synchronization schemes do you have in mind? While this API looks flexible, I think it's actually not very flexible and has lots of implicit assumptions:
* It assumes...
> I was thinking of a scheme where you do parameter averaging after different steps depending on whether you are averaging within one node or across nodes.

Ah yea, that's...
> E.g. on the first invocation of the function?

Yes, that should work just fine, right?
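To make the quoted idea concrete, here is a minimal sketch of a two-level averaging schedule: average within a node often, and across nodes rarely. This is only an illustration, not the API discussed in this thread. The names `sync_scope` and `average_params` and the intervals `intra_node_every` and `inter_node_every` are hypothetical, and parameters are plain Python lists rather than tensors.

```python
from typing import List, Optional


def sync_scope(step: int, intra_node_every: int = 10, inter_node_every: int = 100) -> Optional[str]:
    """Decide which averaging (if any) to run after the given training step.

    Hypothetical schedule: cross-node averaging takes priority when both
    intervals hit on the same step.
    """
    if step <= 0:
        return None
    if step % inter_node_every == 0:
        return "inter_node"
    if step % intra_node_every == 0:
        return "intra_node"
    return None


def average_params(param_sets: List[List[float]]) -> List[float]:
    """Element-wise mean over the parameter lists of all participating workers."""
    n = len(param_sets)
    return [sum(values) / n for values in zip(*param_sets)]
```

In a real setup, the `"intra_node"` case would average over the workers of one node only and the `"inter_node"` case over all workers; how those groups are formed depends on the distributed backend and is left out here.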
Some update: In the future, I want to implement something similar to this: https://github.com/PrimeIntellect-ai/prime (or maybe just reuse the existing code there). Specifically, this includes `ElasticDeviceMesh` and [OpenDiLoCo](https://www.primeintellect.ai/blog/opendiloco)/[DiLoCo](https://arxiv.org/abs/2311.08105). This is...
Downgrading to ctc_segmentation-1.6.6 should fix this (that's still compatible with NumPy 1). I think ctc_segmentation-1.7.4, which is used here, needs NumPy 2.
This seems to be an issue in pynput? Can you report that [here](https://github.com/moses-palmer/pynput/issues) and link it to this issue as well?