Zhenting Qi

Results: 9 issues by Zhenting Qi

### Question Hi! I use Llava13b for a domain-specific task and want to fine-tune it on an image-text dataset. I wonder: 1. Do I need to do any more necessary...

**Describe the bug** I am running InstructRetro, starting with data preprocessing via `bash tools/retro/examples/preprocess_data.sh db-build`. **Stack trace/logs** Due to torchrun's multiprocessing, the output stack trace is messy. I manually...

**Describe the bug** I am running [step 3](https://github.com/NVIDIA/Megatron-LM/blob/InstructRetro/tools/retro/build_db.md#step-3-build-index-for-similarity-search) on one 80G A100 GPU to "Build index for similarity search". My "DATA_BLEND" is the first 10000 scraped text items from openwebtext...

Hi, I was training the 345M GPT-2 model using your example script `examples/pretrain_gpt.sh`. However, the validation loss and PPL keep going up, while the training loss decreases as expected. ![image](https://github.com/NVIDIA/Megatron-LM/assets/140472590/ab8fb941-d9d0-4def-b7cf-71659f5bf6af) My...

Hi! Can anyone please tell me how to run the full mining pipeline using cc_net on just a very small portion of CC? E.g., I just want to around 100M...

Hi! I am trying to download the crawl split 2023-50. I am running the command `python -m cc_net --dump 2023-50`, which raises the following error: ``` Will run cc_net.mine.main with...

Hi! This is an amazing reproduction of DAG-GNN. Just a few questions: could you please help me make sense of the input data? i.e., what does each column/row in the...

**Describe the bug** I am running the data preprocessing script using the following command: ``` python tools/preprocess_data.py \ --input ./openwebtext/scraped_100/train_data.json \ --output-prefix ./openwebtext/scraped_100/my_gpt2 \ --vocab-file ./big_models/megatron-gpt-345m/gpt2-vocab.json \ --dataset-impl mmap \ --tokenizer-type...

Hi! Any plan on releasing your InstructS2S-200K? Thanks!