Lu Dai

Results 8 comments of Lu Dai

How can it train without padding? I thought padding was necessary to collate sentences of different lengths into the same batch.

@chrisociepa I see. Do you mean that during LLM pretraining the sequences are often chunked into equal-length pieces, unlike at inference, where each sequence occupies its own slot in the input batch?
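For anyone else reading, here is a minimal sketch of what chunking into equal-length pieces could look like. It is not taken from any particular library: `pack_into_blocks`, `block_size`, and `eos_id` are hypothetical names, and dropping a tail shorter than `block_size` is an assumption of this sketch.

```python
from itertools import chain


def pack_into_blocks(tokenized_docs, block_size, eos_id):
    """Concatenate token-id lists and cut the stream into equal-length blocks.

    Hypothetical helper: each document is followed by eos_id so that
    document boundaries stay visible inside a block.
    """
    # Flatten all documents into one long token stream.
    stream = list(chain.from_iterable(doc + [eos_id] for doc in tokenized_docs))
    # Keep only whole blocks; a shorter tail is dropped (assumption of this sketch).
    usable = (len(stream) // block_size) * block_size
    return [stream[i:i + block_size] for i in range(0, usable, block_size)]


docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
blocks = pack_into_blocks(docs, block_size=5, eos_id=0)
print(blocks)
```

Every block has the same length, so the batch needs no padding tokens; the cost is that a block can span more than one document.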

@chrisociepa @ElleLeonne Thank you very much for sharing!

@Nsigma-Bill Thanks for sharing your opinion, but the answers above actually solved my question. In the link, the remaining tail is thrown away if it is shorter than the...

Hi, I used 2 A100 80GB GPUs for finetuning, and before the crash (when they hang at 100% utilization) the memory consumption was less than half of the full GPU memory,...

@lywinged I thought DeepSpeed was complementary to DDP rather than an alternative to it? I think I still need multi-GPU acceleration.

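I ran into the same problem and found that it was a host + API configuration issue. In front_end/.env.production, change VITE_APP_API_HOST to http://localhost:8777/api; in .env, change user_ip to localhost; then update the corresponding parts in run_for_local_option.sh as well, and it works.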

Same situation here. Agent mode cannot correctly edit a remote file if GitHub.copilot-chat is set to both "workspace" and "ui", which is nevertheless necessary when the remote machine cannot...