Arnav Chavan
Variables that are not used in the graph are not supported and raise a runtime error: RuntimeError: One of the differentiated Tensors appears to not have been used in the...
Can you share the model architecture for miniImageNet used in iMAML, since it varies slightly from paper to paper?
Hello Team, great work! I wanted to know whether https://github.com/Lightning-AI/lm-evaluation-harness/tree/lit-llama works, since it was recently updated. If not, is there any way I can evaluate lit-llama models on...
Hello and thanks for your work! While running bradley-terry-rm/llama3_rm.py, the final saved model does not have an LM head, as the script uses an AutoModelForSequenceClassification model and not CausalLM....
### Issue: Implementing Iterative DPO on Phi3-4k-instruct Hi, thanks for the great work and open source! I am trying to implement iterative DPO on `Phi3-4k-instruct`. The following outlines my approach:...
Great work and thanks for the codebase! I want to know the exact details of the LoRA fine-tuning mentioned in Table 6 of the main paper. Also if you could...