
[BUG] ModuleNotFoundError: No module named 'megatron.training.tokenizer'; 'megatron.training' is not a package

Open hellangleZ opened this issue 10 months ago • 4 comments

Describe the bug A clear and concise description of what the bug is.

Strange issue

(/aml2/ds) root@A100:/aml2/Megatron-LM# from megatron.training.tokenizer import build_tokenizer
from: can't read /var/mail/megatron.training.tokenizer
(/aml2/ds) root@A100:/aml2/Megatron-LM# python tools/preprocess_data.py \
   --input /aml2/traindata/oscar-1GB.jsonl \
   --output-prefix /aml2/traindata\
   --tokenizer-type Llama2Tokenizer \
   --tokenizer-model /aml2/llama2/tokenizer.model \
   --workers 16 \
   --append-eod

[2024-04-02 08:03:42,280] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/aml2/Megatron-LM/tools/preprocess_data.py", line 23, in <module>
    from megatron.training.tokenizer import build_tokenizer
ModuleNotFoundError: No module named 'megatron.training.tokenizer'; 'megatron.training' is not a package
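Python appends "'megatron.training' is not a package" when the name `megatron.training` resolves to a plain module (a single `training.py`) rather than a directory package. One possible cause, not confirmed in this thread, is that a different `megatron` is found on `sys.path` before the checkout in `/aml2/Megatron-LM`. The sketch below is a diagnostic under that assumption: it reports where each dotted name resolves from and whether it is a package. `describe_module` is a hypothetical helper name, not part of Megatron-LM.

```python
import importlib.util


def describe_module(name):
    """Return (origin, is_package) for `name`, or None if it cannot be resolved.

    Note: resolving a dotted name imports its parent modules, so this may
    execute their top-level code.
    """
    try:
        spec = importlib.util.find_spec(name)
    except Exception:
        # Raised, for example, when a parent is missing, fails to import,
        # or is a plain module instead of a package.
        return None
    if spec is None:
        return None
    # Packages have submodule_search_locations set; plain .py modules have None.
    return spec.origin, spec.submodule_search_locations is not None


if __name__ == "__main__":
    for name in ("megatron", "megatron.training", "megatron.training.tokenizer"):
        print(name, "->", describe_module(name))
```

Run it with `python` from the repository root rather than pasting the import into the shell prompt, which is what produced the `from: can't read /var/mail/...` message above. If `megatron.training` is reported with a path ending in `training.py` and `is_package` set to `False`, the interpreter is picking up a copy of `megatron` where `training` is a single file, and the origin path shows which copy that is.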

To Reproduce Steps to reproduce the behavior. The easier it is to reproduce the faster it will get maintainer attention.

Expected behavior A clear and concise description of what you expected to happen.

Stack trace/logs If applicable, add the stack trace or logs from the time of the error.

Environment (please complete the following information):

  • Megatron-LM commit ID: the latest
  • PyTorch version: 2.2.1
  • CUDA version: 12.1

Proposed fix If you have a proposal for how to fix the issue state it here or link to a PR.

Additional context Add any other context about the problem here.

hellangleZ avatar Apr 02 '24 08:04 hellangleZ

Have you tried again with the most recent version of main? There was a fix regarding this.

philipp-fischer avatar Apr 04 '24 15:04 philipp-fischer

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Jun 03 '24 18:06 github-actions[bot]

I used the latest release, v0.7.0, and the error still happened.

zhaoyang-star avatar Jul 06 '24 09:07 zhaoyang-star

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Sep 04 '24 18:09 github-actions[bot]