Rasmus Toivanen

53 comments of Rasmus Toivanen

Ok, I am now back to trying this. I downloaded the Finnish Fleurs data with the provided download scripts and created the manifest files. Now I am trying to finetune...

Is there a better example of how to do finetuning? Do you have a test confirming that the text-to-speech finetuning works? I tried finetuning with Fleurs data and it threw...

I could take on the task of translating HellaSwag samples for Reasoning. I have some translation credits left for this month in my DeepL subscription, so those could be used.

I am having similar issues: ![image](https://github.com/user-attachments/assets/098462a2-3eec-4aa6-8b9c-52644a71929d) ![image](https://github.com/user-attachments/assets/a3ee5972-832a-4a44-8970-991a115ee861) GGUF files from here: https://huggingface.co/mradermacher/Ahma-3B-Instruct-GGUF/tree/main Original model: https://huggingface.co/Finnish-NLP/Ahma-3B-Instruct

You should find the conversion script for Gemma 2 here: https://github.com/AI-Hypercomputer/maxtext/issues/1324

Hopefully this gets merged soon. I thought it would be easy to implement some custom_metrics to calculate, like MT-Bench scores, at every eval step, as previously something like BLEU/WER were...

What is your eval_batch_size? If it is not defined, Hugging Face probably still defaults to 8.
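A minimal sketch of the fallback behavior being described: the Hugging Face `Trainer` uses 8 for `per_device_eval_batch_size` when it is not set explicitly, which can silently raise eval-time memory. The plain dict below only mimics that lookup (the config dict and key names here stand in for `TrainingArguments`):

```python
# Hypothetical config where only the train batch size was specified.
config = {"per_device_train_batch_size": 4}

# If per_device_eval_batch_size is absent, HF falls back to 8 —
# so eval steps may use a larger batch than training does.
eval_bs = config.get("per_device_eval_batch_size", 8)
print(eval_bs)  # -> 8
```

Setting `per_device_eval_batch_size` explicitly (e.g. to the same value as training) is the usual fix when eval steps run out of memory.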

I've also had issues. Everything seemed to work fine with unsloth 2025.10.1. Now I am doing similar training with Gemma3 (same bs, ga, precision etc.) but memory is not stable...

I have been able to work with 2025.10.1. I will try the latest release again later.

Have you tried reducing the rank? That has a small impact but might still be worth trying.
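Assuming "rank" here means the LoRA rank `r`: adapter parameter count scales linearly with `r`, since each adapted weight gets two low-rank factors. A rough sketch (the formula assumes square `d_model x d_model` target weights, which is a simplification):

```python
def lora_params(r, d_model, n_target_modules):
    # Each adapted weight W gains A (d_model x r) and B (r x d_model),
    # so adapter size grows linearly with the rank r.
    return n_target_modules * 2 * r * d_model

# Hypothetical model: 64 target modules of width 4096.
print(lora_params(16, 4096, 64))
print(lora_params(8, 4096, 64))  # halving r halves the adapter size
```

So halving `r` halves adapter memory and optimizer state for the adapters, which is why the overall saving is real but small relative to activations and the frozen base weights.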