xiaojunjie
# Proposed changes

Issue Number: close #15658

## Problem summary

1. Organize HTTP documents
2. Add HTTP interface authentication for FE
3. Support HTTPS interface for FE
4. Provide authentication...
# Proposed changes

Issue Number: close #15394 #15395 #15396

## Problem summary

The original process of NereidsPlan includes analysis, rewriting and optimization. This newly added cache works after rewriting, before...
I am trying to convert a GPT checkpoint from **local** to **transformer_engine** according to the following map: `{ 'input_layernorm.': 'self_attention.linear_qkv.layer_norm_', 'pre_mlp_layernorm.': 'mlp.linear_fc1.layer_norm_', }` It works well only when the optimizer is...
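
As an illustration of the key map quoted above, here is a minimal sketch of how such a substring map might be applied to parameter names. The helper `rename_keys` and the example key names (`decoder.layers.0...`) are hypothetical and are not taken from Megatron-LM. Plain integers stand in for tensors so the snippet runs without any dependencies.

```python
# Substring map quoted in the question (local -> transformer_engine).
KEY_MAP = {
    "input_layernorm.": "self_attention.linear_qkv.layer_norm_",
    "pre_mlp_layernorm.": "mlp.linear_fc1.layer_norm_",
}


def rename_keys(state_dict, key_map=KEY_MAP):
    """Return a new dict whose keys have each mapped substring replaced.

    Hypothetical helper for illustration only; values are passed through
    unchanged, and keys that match no entry are kept as-is.
    """
    renamed = {}
    for name, value in state_dict.items():
        new_name = name
        for old, new in key_map.items():
            if old in new_name:
                new_name = new_name.replace(old, new)
        renamed[new_name] = value
    return renamed


# Example key names are assumptions, chosen only to exercise the map.
example = {
    "decoder.layers.0.input_layernorm.weight": 1,
    "decoder.layers.0.pre_mlp_layernorm.bias": 2,
    "decoder.layers.0.mlp.linear_fc2.weight": 3,
}
converted = rename_keys(example)
for key in converted:
    print(key)
```

The sketch only renames the keys of the dict it is given; any other state stored in a checkpoint that refers to parameters by name or position would need its own handling.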
**Your question** Given a [BlendableDataset](https://github.com/NVIDIA/Megatron-LM/blob/0609f27fe8376f17ab65c001d3d8f35cd8175950/megatron/data/blendable_dataset.py#L15C43-L15C43) of datasets A and B with weights 1:1: after training for N iterations, change the weights to 2:1 and continue from the saved checkpoint. According to BlendableDataset, it will...