Suraj Subramanian
We can't diagnose the issue without more information about the platform you're running on and the full error you encounter. Also, please use the issue template in the future as...
I'm not sure what the error is; please paste the full stacktrace. If you made any modifications to the script, include the changes you made. Also, please adhere to the...
Sounds interesting, but I'm not sure embeddings can be meaningfully visualized like this. Perhaps approaches like t-SNE/UMAP might provide more insight? cc @melanierk
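For reference, a minimal sketch of the kind of projection I mean, using scikit-learn's t-SNE (the `embeddings` array here is a random placeholder for whatever vectors you extract from the model; UMAP via `umap-learn` works analogously):

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder: stand-in for extracted model embeddings, shape (num_items, hidden_dim)
embeddings = np.random.randn(500, 4096).astype(np.float32)

# Project to 2D for a scatter plot; structure in the projection is what gives insight
coords = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(embeddings)
print(coords.shape)  # (500, 2)
```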
Please share the full stacktrace, which contains the actual error.
CUDA supports float16, which is more efficient. See L:118 where this is set as the default dtype. You can comment that out to load the model as bf16 if you'd...
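Roughly, the change looks like this (a sketch assuming the script sets the fp16 default via `torch.set_default_tensor_type`; the exact line may differ in your copy):

```python
import torch

# Default in the example script (the referenced line, reproduced for illustration):
# torch.set_default_tensor_type(torch.cuda.HalfTensor)

# To load in bf16 instead, comment out the line above, or set bf16 explicitly:
torch.set_default_dtype(torch.bfloat16)
```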
Thanks @aakashapoorv I think it might be better to add the asserts in the `build` function instead of in the example scripts. https://github.com/meta-llama/llama3/blob/cc44ca2e1c269f0e56e6926d7f4837c983c060dc/llama/generation.py#L37
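Something along these lines, inside `build` (the specific checks below are illustrative, not the exact ones from the PR):

```python
import os

def build(ckpt_dir: str, tokenizer_path: str, max_seq_len: int, max_batch_size: int):
    # Validating arguments once in `build` means every example script
    # benefits without duplicating the checks in each script.
    assert os.path.isdir(ckpt_dir), f"Checkpoint directory '{ckpt_dir}' does not exist"
    assert os.path.isfile(tokenizer_path), f"Tokenizer file '{tokenizer_path}' does not exist"
    assert 1 <= max_seq_len <= 8192, f"max_seq_len must be in [1, 8192], got {max_seq_len}"
    assert max_batch_size >= 1, f"max_batch_size must be >= 1, got {max_batch_size}"
    ...
```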
Hi! The example scripts in this repo are for running inference on single-GPU (for 8B) and multi-GPU (for 70B) setups using CUDA, but Windows is not currently supported. You...
Thanks for your contribution @pchng!
Yes, you can use `AutoTokenizer.from_pretrained('meta-llama/Meta-Llama-3-8B-Instruct')`
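For example (assuming you have `transformers` installed and access to the gated repo on the Hub):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
input_ids = tokenizer("Hello, Llama 3!")["input_ids"]
print(input_ids)
```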
How many GPUs are you using? The 70B model needs 8 GPUs to run from this repo. If you have fewer than 8 GPUs, please use the model from HF.
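A minimal sketch of loading from HF, assuming the Instruct variant and the `transformers`/`accelerate` stack (`device_map="auto"` shards the model across whatever GPUs you have):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # bf16 halves memory vs fp32
    device_map="auto",            # requires `accelerate`; shards across available GPUs
)
```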