Casper comments

Results: 295 comments of Casper

DeepSeek V3 Support

@tianyu-l I am mainly interested in a model architecture implementation. The remaining details, like FP8 training and various forms of parallelism, are already implemented in TorchTitan and should be reused....

`agent_memory` list index out of range

@blankanswer @braisedpork1964 @lcolok I would really appreciate it if you could try to reproduce and fix the error. I tried again today and there is no way around this error, it...

`agent_memory` list index out of range

Hi @braisedpork1964, I followed your instructions, but I still get the same error after adding few-shot examples to InterpreterParser and PluginParser. Is it possible to figure out a permanent fix...

`agent_memory` list index out of range

@EmilyQian2001 No, I didn't solve the problem. This bug is breaking the whole interaction, rendering MindSearch useless. I love the concept of MindSearch, but its current state is that it...

Distributed Timeout during Dataset Tokenization

> Do you have an example of a public dataset that we can repro this on?

Unfortunately, I don't.

Distributed Timeout during Dataset Tokenization

Launching preprocessing in distributed mode is the main problem. You can probably create a dummy dataset of 1 million samples with 64k tokens each and try, but I cannot for...
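A dummy dataset of the kind described above could be generated with something like the following. This is only a rough sketch: the function name `write_dummy_dataset`, the JSONL layout with a single `text` field, and the use of whitespace-separated words as a stand-in for tokens are all assumptions for illustration, not something taken from the issue. Real token counts depend on the tokenizer.

```python
import json
import random


def write_dummy_dataset(path, num_samples=1_000_000, words_per_sample=64_000, seed=0):
    """Write a JSONL file of random-word samples (hypothetical helper).

    Each line is {"text": "..."} with `words_per_sample` whitespace-separated
    words. Words only approximate tokens; the actual token count per sample
    depends on the tokenizer used during preprocessing.
    """
    rng = random.Random(seed)
    # Small synthetic vocabulary; the contents do not matter for a repro.
    vocab = [f"w{i}" for i in range(1000)]
    with open(path, "w", encoding="utf-8") as f:
        for _ in range(num_samples):
            text = " ".join(rng.choices(vocab, k=words_per_sample))
            f.write(json.dumps({"text": text}) + "\n")
    return path
```

With the defaults this writes a very large file, so it is worth starting with much smaller values of `num_samples` and `words_per_sample` and scaling up until the timeout appears.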

Distributed Timeout during Dataset Tokenization

Maybe the `axolotl preprocess` CLI should not launch with accelerate? What do you think @winglian?

Distributed Timeout during Dataset Tokenization

I used `axolotl train`, triggered the error, then pivoted to `axolotl preprocess` and found the same error. I will need to check the commands again, but I'm pretty sure I...

Distributed Timeout during Dataset Tokenization

This does the trick. Though, I would recommend using something other than `with zero_first(is_local_main_process())` in general. This lowers QoL when using axolotl and could be replaced with a simpler FileLock...
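To illustrate the file-lock idea, here is a minimal sketch of lock-guarded, run-once preparation. It is not axolotl's code: `file_lock`, `prepare_once`, the `.lock` and `.done` file names, and the `build_fn` callback are hypothetical names chosen for this example, and it uses the POSIX-only `fcntl.flock` from the standard library rather than any particular lock package. Unlike a distributed barrier, waiting processes block on the file system, so no collective-communication timeout is involved.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path):
    """Hold an exclusive advisory lock on `lock_path` (POSIX only)."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def prepare_once(cache_dir, build_fn):
    """Run `build_fn(cache_dir)` in one process only (hypothetical helper).

    The first process to take the lock builds the cache and writes a marker
    file; processes that acquire the lock later see the marker and skip the
    work. Returns True if this call did the build, False otherwise.
    """
    os.makedirs(cache_dir, exist_ok=True)
    done_marker = os.path.join(cache_dir, ".done")
    with file_lock(os.path.join(cache_dir, ".lock")):
        if os.path.exists(done_marker):
            return False
        build_fn(cache_dir)
        with open(done_marker, "w", encoding="utf-8") as f:
            f.write("ok")
        return True
```

Each rank would call `prepare_once(cache_dir, tokenize_fn)` and then load the result from `cache_dir`. This assumes all ranks see the same file system and that advisory locks work on it, which is not guaranteed on every network file system.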

Distributed Timeout during Dataset Tokenization

> [@casper-hansen](https://github.com/casper-hansen) agreed, feel free to make a PR! Or, I'll probably do so later.

I probably won't be creating the PR, but let's leave this issue open until a...