Sanjarbek Rakhmonov

Results 3 comments of Sanjarbek Rakhmonov

It is fairly easy to convert the weights to a format that will work with llama.cpp. Just do the exact opposite of what this script does: [https://github.com/Lightning-AI/lit-llama/blob/main/scripts/convert_checkpoint.py](https://github.com/Lightning-AI/lit-llama/blob/main/scripts/convert_checkpoint.py) For my case,...

```python
import gc
import torch
from pathlib import Path
from typing import Dict

def reverse_convert_state_dict(
    state_dict: Dict[str, torch.Tensor],
    dtype: torch.dtype = torch.float32,
) -> Dict[str, torch.Tensor]:
    reversed_dict = {}
    reversed_dict["tok_embeddings.weight"] = state_dict["transformer.wte.weight"].to(dtype)
    ...
```
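The core of the reverse conversion is just renaming keys from lit-llama's GPT-style layout back to the original LLaMA checkpoint layout. A minimal sketch of that idea, using plain dicts so it runs without weights on disk (the key names below are assumptions; invert the authoritative mapping in `convert_checkpoint.py` for the full table):

```python
# Hypothetical reverse key map: lit-llama name -> original LLaMA name.
# These entries are illustrative, not exhaustive; derive the real map by
# inverting the forward mapping used in convert_checkpoint.py.
REVERSE_KEY_MAP = {
    "transformer.wte.weight": "tok_embeddings.weight",
    "lm_head.weight": "output.weight",
}

def reverse_convert(state_dict):
    """Rename keys of a state dict back to the original LLaMA layout.

    Keys without an entry in the map are passed through unchanged.
    """
    return {REVERSE_KEY_MAP.get(name, name): value
            for name, value in state_dict.items()}

converted = reverse_convert({"transformer.wte.weight": [0.1, 0.2]})
print(sorted(converted))  # ['tok_embeddings.weight']
```

Once every key is renamed (and tensors are cast to the expected dtype), the resulting checkpoint can be fed to llama.cpp's own conversion tooling.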

It depends on your specific use case. I noticed that masking works better when your dataset consists of long chat dialogues. When not masked, model responses become quite repetitive...
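Masking here usually means setting the prompt tokens' labels to the loss ignore index, so only the response tokens contribute to the cross-entropy loss (`-100` is PyTorch's default `ignore_index` for `CrossEntropyLoss`). A minimal sketch, with the function name and list-based tokens purely illustrative:

```python
IGNORE_INDEX = -100  # PyTorch's default ignore_index for CrossEntropyLoss

def mask_prompt_labels(input_ids, prompt_len):
    """Build labels from input_ids, masking the prompt portion.

    Positions covered by the prompt get IGNORE_INDEX so the loss is
    computed only on the response tokens.
    """
    return [IGNORE_INDEX] * prompt_len + input_ids[prompt_len:]

labels = mask_prompt_labels([5, 6, 7, 8, 9], prompt_len=3)
print(labels)  # [-100, -100, -100, 8, 9]
```

For multi-turn dialogues the same idea applies per turn: mask every token that belongs to a user/prompt segment and leave only assistant tokens as targets.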