Baibaifan

26 comments by Baibaifan

> When you use `--use-mcore-models`, you cannot use local. `--use-flash-attn` decides whether to use the OSS flash attention implementation or the cuDNN implementation.

Hi @ethanhe42, I understand the process you mentioned,...

> I fixed it, but the megatron has not responded👀 #762

It’s so awesome, I must give u a Turing Award.

@ethanhe42 Thank you for your answer. I have made some modifications to the loading scenario. I understand that `view` is used to make more efficient use of contiguous memory. If `split`...

> As of today, `--external-cuda-graph` must go with `--te-rng-tracker`. I suspect your phase 3 error is still an OOM-caused strange behavior. Could you make some mini tests first such as...

> These are my arguments running 8*7b cudagraph. But I tested with 4 nodes: `--position-embedding-type rope --normalization RMSNorm --swiglu --no-position-embedding --no-masked-softmax-fusion --tokenizer-type Llama2Tokenizer --tokenizer-model xxxxx/mixtral-tokenizer.model --ffn-hidden-size 14336 --group-query-attention --num-query-groups 8...

> io_memory_reduction = True

Hi @buptzyb, I used `io_memory_reduction = True`, megatron commit: 07101375c8a824cc1c4e61848f24f1ac4840b23b, te commit: 4c39e40fc00f2120a781a4892c6043f9e89c2033. TE: git clone https://github.com/buptzyb/TransformerEngine.git, and checkout `cudagraph_reuse` branch. NGC-25.05 and 8 *...

Hi @buptzyb, the `oom` problem has been solved. It was a problem with the TE installation. The performance of NGC-25.05 and 8 * H100 with `cuda-graph` turned on is not...

> What's the throughput when the moe balance loss is low enough? If you compare the throughput just at the beginning steps, the numbers may be unreasonable. For the same...

> oh, an easier way is to remove the two `make_weak_ref` calls in `graph.py`, like replacing the `per_callable_static_outputs[per_callable_bwd_idx] = make_weak_ref(static_outputs)` with `per_callable_static_outputs[per_callable_bwd_idx] = static_outputs`. This makes you fully get rid...
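
For readers unfamiliar with the distinction the quoted suggestion relies on, here is a minimal, generic sketch of strong versus weak references using Python's standard `weakref` module. It is not Transformer Engine's `make_weak_ref`, whose internals are not shown in this thread; the `Buffer` class and `still_alive` helper are hypothetical names used only for illustration. The point is that a strong reference keeps an object alive, while a weak one lets it be freed.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for a static output buffer (not a TE type)."""

    def __init__(self, name):
        self.name = name


def still_alive(strong):
    """Store a Buffer strongly or weakly, drop the original name, and
    report whether the object can still be reached through the slot."""
    buf = Buffer("static_outputs")
    # Strong: the slot owns the object. Weak: the slot only observes it.
    slot = buf if strong else weakref.ref(buf)
    del buf
    gc.collect()  # make collection explicit for the illustration
    if strong:
        return slot is not None
    return slot() is not None  # a dead weak reference returns None


strong_alive = still_alive(strong=True)
weak_alive = still_alive(strong=False)
print(strong_alive, weak_alive)
```

The trade-off in general is that holding strong references is simpler and safer to reason about, at the cost of keeping the referenced memory alive for as long as the container exists.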

Should `sudo ./insmod.sh` be `sudo ../insmod.sh`? @haswelliris