Suraj Subramanian comments

Results: 44 comments by Suraj Subramanian

example time_sequence_prediction does not produce expected output

@rallen10 Thanks for raising this. I agree, this example's learning rate is (way) too high; the exploding loss goes away when it is set to something like 0.001. I'll leave this...
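To illustrate the general effect described in this comment, here is a minimal sketch of how an oversized learning rate makes a loss explode while a small one lets it shrink. It is a toy under stated assumptions: the function `gradient_descent`, the objective f(x) = x**2, and the rate 1.5 are all hypothetical and chosen for illustration; this is not the model or optimizer used in the time_sequence_prediction example. Only the value 0.001 comes from the comment.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent.

    Hypothetical toy helper; returns the loss after each update.
    """
    x = x0
    losses = []
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # standard gradient step
        losses.append(x * x)
    return losses


# Assumed "too high" rate: each step maps x to -2x, so the loss grows 4x per step.
high = gradient_descent(lr=1.5)
# Rate suggested in the comment: each step maps x to 0.998x, so the loss shrinks slowly.
low = gradient_descent(lr=0.001)

print("lr=1.5   first/last loss:", high[0], high[-1])
print("lr=0.001 first/last loss:", low[0], low[-1])
```

On this toy objective any rate above 1.0 diverges, so the threshold is specific to the sketch; the safe range for a real model depends on its loss surface and optimizer.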

add support for mock data input to imagenet example

@suiyuan2009 It's been a while since you posted this, but on the off chance that you're still interested, I can help you raise a PR for it.

Python: from llama2 import KnowledgeBase produces error

We don't have any module called "KnowledgeBase" in this project. Are you sure you're running the right code? Looks like you're pip installing the wrong package. Please follow the instructions...

Potential for Controversy in Generation

Thank you for pointing this out! Even though the tokenizer has a multilingual vocabulary, Llama3 doesn't currently support multilingual inference. The models are officially supported for inference in English, but...

Can pre-trained models be used in commercial applications?

Llama2 is permitted for commercial use, with an important caveat: https://github.com/facebookresearch/llama/blob/main/LICENSE#L65

Create alf

Waiting on the ALF team to respond before we merge

Slight changes to `MODEL_CARD.md` to organize information

The change helps improve readability, lgtm

Will the training code be released?

We have shared scripts for finetuning and inference at https://github.com/facebookresearch/llama-recipes

Meta-Llama-3-70B-Instruct running out of memory on 8 A100-40GB

Please see this thread: https://github.com/meta-llama/llama3/issues/157#issuecomment-2110497041

Fine-tuning

We recently shared scripts for finetuning and inference at https://github.com/facebookresearch/llama-recipes