[Bug] os.environ["UNSLOTH_RETURN_LOGITS"] = "1" gets reset to "0" once training starts
Hello!
I have been working on fine-tuning Gemma 3. During training, I wish to validate based on a custom metric. To mitigate the following error, I set os.environ["UNSLOTH_RETURN_LOGITS"] = "1":
TypeError: Unsupported types (<class 'unsloth_compiled_module_gemma3.EmptyLogits'>) passed to `_pad_across_processes`. Only nested list/tuple/dicts of objects that are valid for `is_torch_tensor` should be passed.
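For context, this is how I set the flag (a minimal sketch; the trainer lines are illustrative and commented out):

```python
import os

# Ask Unsloth to return real logits instead of EmptyLogits placeholders;
# this must be set before trainer.train() runs (ideally before importing unsloth)
os.environ["UNSLOTH_RETURN_LOGITS"] = "1"

# The unsloth/TRL parts would then follow as usual, e.g.:
# trainer = SFTTrainer(...)
# trainer.train()

print(os.environ["UNSLOTH_RETURN_LOGITS"])  # 1
```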
I am using the following configuration:
config = SFTConfig(
    per_device_train_batch_size=self.train_args.get("batch_size", 4),
    gradient_accumulation_steps=self.train_args.get("grad_accum", 8),
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    learning_rate=self.train_args.get("lr", 2e-4),
    logging_steps=10,
    save_strategy="steps",
    save_steps=10,
    eval_strategy="steps",
    eval_steps=self.train_args.get("eval_steps", 10),
    load_best_model_at_end=self.train_args.get("load_best_model_at_end", True),
    metric_for_best_model=self.train_args.get("metric_for_best_model", "top1_accuracy"),
    greater_is_better=self.train_args.get("greater_is_better", True),
    optim=self.train_args.get("optim", "adamw_torch_fused"),
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    seed=self.train_args.get("seed", 3407),
    output_dir=self.output_dir,
    report_to="tensorboard",
    run_name="gemma_4b_lora_run_2",
    logging_dir="gemma_4b_lora_run_2",
    # max_seq_length=20000,
    remove_unused_columns=False,
    dataset_text_field="",
    dataset_kwargs={"skip_prepare_dataset": True},
)
trainer = SFTTrainer(
    model=self.model,
    predict_with_generate=True,
    train_dataset=self.train_dataset,
    eval_dataset=self.val_dataset,
    compute_metrics=self.compute_metrics,
    processing_class=self.processor.tokenizer,
    data_collator=self.collator,
    args=config,
)
train_output = trainer.train()
Before running training, I check that the environment variable is set correctly (and it is). However, it seems to change during training and ends up back at "0".
What can I do here? I saw another issue about this, but it seemed like no one found a solution.
I am facing a similar issue
os.environ settings only persist for the lifetime of the Python process; once it exits or crashes, the variable resets. The bug, however, is that even if you set UNSLOTH_RETURN_LOGITS to "1", Unsloth ignores it completely and doesn't return logits anyway! I've had the same issue with both Gemma 3 and Qwen3. I'm linking the full relevant error (although the underlying problem is that an EmptyLogits object is always returned no matter what):
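To illustrate the first point (not the Unsloth bug itself), here is a quick sketch of how an os.environ change lives only as long as the interpreter, though it is inherited by child processes it launches:

```python
import os
import subprocess
import sys

os.environ["UNSLOTH_RETURN_LOGITS"] = "1"

# A child process inherits the modified environment...
child = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ.get('UNSLOTH_RETURN_LOGITS'))"],
    capture_output=True, text=True,
)
print(child.stdout.strip())  # 1
# ...but once this interpreter exits, the setting is gone; a restarted
# script starts from the parent shell's environment again.
```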
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Unsloth: Will smartly offload gradients to save VRAM!
{'loss': 3.8112, 'grad_norm': 5.355050563812256, 'learning_rate': 0.0, 'epoch': 0.0}
{'loss': 3.8743, 'grad_norm': 5.253603935241699, 'learning_rate': 1e-05, 'epoch': 0.0}
{'loss': 4.029, 'grad_norm': 4.6202392578125, 'learning_rate': 2e-05, 'epoch': 0.01}
{'loss': 3.9538, 'grad_norm': 4.3493971824646, 'learning_rate': 1.999585749792875e-05, 'epoch': 0.01}
{'loss': 3.7101, 'grad_norm': 3.653374195098877, 'learning_rate': 1.99917149958575e-05, 'epoch': 0.01}
0%|          | 5/4830 [00:44<7:07:45, 5.32s/it]Unsloth: Not an error, but Qwen3ForCausalLM does not accept `num_items_in_batch`.
Using gradient accumulation will be very slightly less accurate.
Read more on gradient accumulation issues here: https://unsloth.ai/blog/gradient
Traceback (most recent call last):
File "/home/user/neuraltranslate-nahuatl/qlora.py", line 153, in <module>
trainer_stats = trainer.train()
^^^^^^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 2237, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/utils/memory.py", line 174, in decorator
return function(batch_size, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<string>", line 402, in _fast_inner_training_loop
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 3133, in _maybe_log_save_evaluate
metrics = self._evaluate(trial, ignore_keys_for_eval)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 3082, in _evaluate
metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 4249, in evaluate
output = eval_loop(
^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 4466, in evaluation_loop
logits = self.accelerator.pad_across_processes(logits, dim=1, pad_index=-100)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/accelerator.py", line 2938, in pad_across_processes
return pad_across_processes(tensor, dim=dim, pad_index=pad_index, pad_first=pad_first)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 407, in wrapper
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 677, in pad_across_processes
return recursively_apply(
^^^^^^^^^^^^^^^^^^
File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 128, in recursively_apply
raise TypeError(
TypeError: Unsupported types (<class 'unsloth_compiled_module_qwen3.EmptyLogits'>) passed to `_pad_across_processes`. Only nested list/tuple/dicts of objects that are valid for `is_torch_tensor` should be passed.
wandb:
wandb: 🚀 View run trainer_output at: https://wandb.ai/thermostatic/huggingface/runs/704epdbe
wandb: Find logs at: wandb/run-20250731_021242-704epdbe/logs
This was a bug that didn't happen in "unsloth==2025.6.2", as I fine-tuned a model and used a custom metric to evaluate it without problems. If I remember right, setting the environment variable wasn't even a requirement and it worked out of the box.
EDIT1: After some research I found that the unsloth package wasn't the issue; it's an issue with unsloth_zoo (or at least it seems like it: unsloth_zoo==2025.6.2 fixes the bug when using the same unsloth version). I'm trying to find the root cause now.
EDIT2: As a temporary workaround, using unsloth_zoo==2025.7.1 with unsloth<=2025.6.5 fixes the issue (not sure if it causes other issues, though); the breaking change was in 2025.7.2.
EDIT3: Indeed, even having the environment variable UNSLOTH_RETURN_LOGITS = "0" worked.
That's probably because, once you reverted to unsloth_zoo==2025.7.1, the automated detection of compute_metrics started working again here in rl.py (which, by the way, works by setting the environment variable UNSLOTH_RETURN_LOGITS='1' when compute_metrics is detected, so I'm guessing that unsloth_zoo==2025.7.1 allows the environment variable to persist).
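A minimal sketch of the kind of auto-detection described above (the helper and trainer names here are illustrative only, not unsloth_zoo's actual code):

```python
import os

def maybe_return_logits(trainer):
    # Hypothetical helper mirroring the described behavior: if the user
    # registered a compute_metrics function, evaluation will need real
    # logits, so enable the flag before training starts
    if getattr(trainer, "compute_metrics", None) is not None:
        os.environ["UNSLOTH_RETURN_LOGITS"] = "1"

# Illustrative usage with a stand-in trainer object:
class _DummyTrainer:
    compute_metrics = staticmethod(lambda p: {"top1_accuracy": 0.0})

maybe_return_logits(_DummyTrainer())
print(os.environ["UNSLOTH_RETURN_LOGITS"])  # 1
```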
Ah! Unfortunately, that fix does not work for me!
I used the versions of unsloth and unsloth_zoo that you mentioned.
Before running training, I also checked that UNSLOTH_RETURN_LOGITS='1'
However, even with this set up, I get this error: TypeError: Unsupported types (<class 'unsloth_compiled_module_gemma3.EmptyLogits'>) passed to `_pad_across_processes`. Only nested list/tuple/dicts of objects that are valid for `is_torch_tensor` should be passed.
Here's my config now:
eval_config = {
    "eval_strategy": "steps",
    "eval_steps": self.train_args.get("eval_steps", 1),
    "do_eval": True,
}
config = SFTConfig(
    num_train_epochs=self.train_args.get("epochs", 3),
    per_device_train_batch_size=self.train_args.get("batch_size", 4),
    gradient_accumulation_steps=self.train_args.get("grad_accum", 8),
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    learning_rate=self.train_args.get("lr", 2e-4),
    logging_steps=1,
    save_strategy="steps",
    save_steps=10,
    # loading the best model at end
    load_best_model_at_end=self.train_args.get("load_best_model_at_end", True),
    metric_for_best_model=self.train_args.get("metric_for_best_model", "eval_loss"),
    greater_is_better=self.train_args.get("greater_is_better", False),
    optim=self.train_args.get("optim", "adamw_torch_fused"),
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    seed=self.train_args.get("seed", 3407),
    output_dir=self.output_dir,
    report_to="tensorboard",
    run_name="gemma_4b_lora_run_3",
    logging_dir="gemma_4b_lora_run_3",
    # vision-specific args
    remove_unused_columns=False,
    dataset_text_field="",
    dataset_kwargs={"skip_prepare_dataset": True},
    dataset_num_proc=self.train_args.get("dataset_num_proc", 4),
    # max_seq_length=self.train_args.get("max_seq_length", 20000),
    **eval_config,
)
Where compute_metrics is defined as `def compute_metrics(self, p: EvalPrediction) -> Dict[str, float]` and contains custom logic for calculating accuracy.
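For comparison, here is a minimal top-1 accuracy compute_metrics sketch (assumptions: predictions are per-token logits, -100 marks ignored label positions, and a namedtuple stands in for transformers.EvalPrediction; a real causal-LM metric may also need to shift predictions against labels):

```python
from collections import namedtuple
import numpy as np

# Stand-in for transformers.EvalPrediction (same field names)
EvalPrediction = namedtuple("EvalPrediction", ["predictions", "label_ids"])

def compute_metrics(p):
    # predictions: (batch, seq_len, vocab_size) logits
    preds = np.argmax(p.predictions, axis=-1)
    labels = p.label_ids
    mask = labels != -100                      # skip padding / prompt tokens
    correct = (preds == labels) & mask
    return {"top1_accuracy": float(correct.sum() / mask.sum())}

# Tiny example: 3 positions, one masked out, one of the remaining two correct
logits = np.array([[[0.0, 2.0, 0.0, 0.0],    # argmax -> 1 (matches label 1)
                    [0.0, 0.0, 3.0, 0.0],    # argmax -> 2 (label masked)
                    [4.0, 0.0, 0.0, 0.0]]])  # argmax -> 0 (label is 3)
labels = np.array([[1, -100, 3]])
print(compute_metrics(EvalPrediction(logits, labels)))  # {'top1_accuracy': 0.5}
```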
I wonder why it worked for you! What compute_metrics did you use?
@charvishukla-bc you are welcome to try this tweak I made to unsloth-zoo...
It solved the problem for me (but I still had to manually set UNSLOTH_RETURN_LOGITS='1' since the auto-detection of compute_metrics still isn't fixed). I would create a PR, but I'm still not satisfied with this solution... at compile time it hard-codes the user's setting of UNSLOTH_RETURN_LOGITS into the dynamically generated modules, since the environment variable method wasn't working. After that, I also had to uncomment several commented out lines for logit computation (see diff).
Really weird! I'll have to look into it too... if it's of any help here's the repo with all of the settings that have worked for me: https://github.com/Sekinal/neuraltranslate-en-es/tree/master
Currently using full.py with no issues in a B200.
Hello @charvishukla-bc @zaidalyafeai, if possible to provide a script that reproduces the issue that would be greatly appreciated.
For me the flag is set as os.environ["UNSLOTH_RETURN_LOGITS"] = "1", but in unsloth_compiled_cache/unsloth_compiled_module_gpt_oss.py, line 725, logits are not generated and are set to empty logits:
elif self.loss_function.__name__.endswith("ForCausalLMLoss") and labels is not None:
    lm_head_weight = self.lm_head.weight
    lm_head_bias = getattr(self.lm_head, "bias", None)
    # ========= NEW fused =========
    _hidden_states = hidden_states[:, slice_indices, :]
    torch._dynamo.mark_dynamic(_hidden_states, 1)
    torch._dynamo.mark_dynamic(labels, 1)
    loss = unsloth_compiled_fused_ce_loss_function(
        hidden_states = _hidden_states,
        lm_head_weight = lm_head_weight,
        lm_head_bias = lm_head_bias,
        output_labels = labels,
        logit_scale_multiply = () if () != () else 0,
        logit_scale_divide = () if () != () else 0,
        logit_softcapping = () if () not in (None, (),) else 0,
        vocab_size = (self.vocab_size),
        n_items = n_items,
        requires_grad_ = requires_grad_,
    )
    # ========= OLD non fused =========
    # logits = self.lm_head(hidden_states[:, slice_indices, :].to(lm_head_weight.device))
    # torch._dynamo.mark_dynamic(logits, 1)
    # torch._dynamo.mark_dynamic(labels, 1)
    # loss = unsloth_compiled_ce_loss_function(
    #     output_logits = logits,
    #     output_labels = labels,
    #     logit_scale_multiply = () if () != () else 0,
    #     logit_scale_divide = () if () != () else 0,
    #     logit_softcapping = () if () not in (None, (),) else 0,
    #     vocab_size = (self.vocab_size),
    #     n_items = n_items,
    #     requires_grad_ = requires_grad_,
    # )
Facing same issue
Hello, custom evaluation and UNSLOTH_RETURN_LOGITS="1" should work now; please update unsloth again via pip install --upgrade --force-reinstall --no-deps unsloth_zoo unsloth
This doesn't appear to have fixed the issue. Running unsloth==2025.10.9 and unsloth_zoo==2025.10.10.
Hello @jrobis, apologies for that, do you mind attaching a sample script? I ran a compute_metrics function that needed logits, and they seem to have been returned when I enabled the flag. I ran this for "unsloth/gemma-3n-E4B-it", with both multi-modal input and just text. But please let me know, thank you.
I am also getting this error - the flag is being unset both by running trainer.evaluate() and trainer.train(). It happens with the docker image unsloth/unsloth:latest but not with unsloth/unsloth:stable
In the case of unsloth:latest, the pip package versions are:
> pip freeze | grep unsloth
unsloth==2025.10.9
unsloth_zoo==2025.10.10
Here is a debugging log that shows UNSLOTH_RETURN_LOGITS being set to 0.
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(93)train()
-> trainer.evaluate()
(Pdb) os.environ['UNSLOTH_RETURN_LOGITS']
'1'
(Pdb) n
Unsloth: Not an error, but Qwen3ForCausalLM does not accept `num_items_in_batch`.
Using gradient accumulation will be very slightly less accurate.
Read more on gradient accumulation issues here: https://unsloth.ai/blog/gradient
100%|██████████| 5/5 [00:00<00:00, 5.82it/s]
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(94)train()
-> if logp_datasets:
(Pdb) os.environ['UNSLOTH_RETURN_LOGITS']
'0'
(Pdb) n
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(95)train()
-> os.environ['UNSLOTH_RETURN_LOGITS'] = '1'
(Pdb) n
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(96)train()
-> trainer.train()
(Pdb) os.environ['UNSLOTH_RETURN_LOGITS']
'1'
(Pdb) n
The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None}.
==((====))== Unsloth - 2x faster free finetuning | Num GPUs used = 1
\\ /| Num examples = 88 | Num Epochs = 4 | Total steps = 24
O^O/ \_/ \ Batch size per device = 2 | Gradient accumulation steps = 8
\ / Data Parallel GPUs = 1 | Total batch size (2 x 8 x 1) = 16
"-____-" Trainable parameters = 66,060,288 of 4,088,528,384 (1.62% trained)
NotImplementedError: Unsloth: Logits are empty from 2024.11 onwards. To get raw logits again, please set the environment variable `UNSLOTH_RETURN_LOGITS` to `"1"` BEFORE starting to train, i.e. before `trainer.train()`. For example:
import os
os.environ['UNSLOTH_RETURN_LOGITS'] = '1'
trainer.train()
No need to restart your console - just add `os.environ['UNSLOTH_RETURN_LOGITS'] = '1'` before trainer.train() and re-run the cell!
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(96)train()
-> trainer.train()
(Pdb)
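Until the fix lands, one blunt way to keep re-pinning the flag is a callback-style guard that resets it on every step and around evaluation. This is only a sketch: in a real setup it would subclass transformers.TrainerCallback (whose on_step_begin/on_evaluate hooks exist) and be passed via callbacks=[...], and whether re-pinning there runs early enough relative to Unsloth's generated for_training() code is an assumption. Here it is written dependency-free so the pattern is visible on its own:

```python
import os

class ReturnLogitsGuard:
    """Callback-style guard that re-pins UNSLOTH_RETURN_LOGITS to "1"."""

    def on_step_begin(self, args=None, state=None, control=None, **kwargs):
        # Re-assert the flag in case something reset it since the last step
        os.environ["UNSLOTH_RETURN_LOGITS"] = "1"

    # Re-pin around evaluation too
    on_evaluate = on_step_begin

# Simulate the flag being clobbered mid-run, then the hook firing:
os.environ["UNSLOTH_RETURN_LOGITS"] = "0"
ReturnLogitsGuard().on_step_begin()
print(os.environ["UNSLOTH_RETURN_LOGITS"])  # 1
```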
@nielsrolf @pluesclues
yes the unsloth packages in the latest tag of the unsloth container docker image did not yet contain the actual fix. We will be updating the container to use the latest unsloth+unsloth-zoo pypi release which will contain all the necessary fixes.
thanks
@rolandtannous Man please include FSDP mode too
The container image has been updated to 2025.10.13 which should contain the fix for this issue Please make sure to pull the latest image
docker pull unsloth/unsloth:latest
Note: if you have unsloth==2025.10.9 and unsloth_zoo==2025.10.10 in your environment, then you do not have the required fix. If you are in a local Python environment, run:
pip install --force-reinstall --no-deps unsloth-zoo unsloth
Don't do this if you are using the Docker container image (just pull the latest image); the latest unsloth PyPI versions are already built in.
Is there a basic tutorial on how to launch the Unsloth Docker image on Kaggle?
I still seem to be hitting this bug in 2025.11.x.
!pip list | grep unsloth
unsloth 2025.11.2
unsloth_zoo 2025.11.3
Here is the problem
File "/tmp/ipython-input-773422404.py", line 1, in <cell line: 0>
trainer_stats = trainer.train()
File "/content/unsloth_compiled_cache/UnslothSFTTrainer.py", line 53, in wrapper
self.model.for_training()
File "/usr/local/lib/python3.12/dist-packages/unsloth/models/vision.py", line 1234, in for_training
os.environ["UNSLOTH_RETURN_LOGITS"] = "0"
File "/tmp/ipython-input-215650878.py", line 9, in debug_setitem
for_training in vision.py sets it to zero.
EDIT:
I'm not sure how for_training() is getting called; it appears to be part of some generated code. But I was able to use the following workaround:
import os

# Capture the original setter once so the patch can delegate to it
_orig_setitem = os.environ.__class__.__setitem__

def patch_environ():
    def debug_setitem(self, key, value):
        # Refuse any attempt to flip UNSLOTH_RETURN_LOGITS away from "1"
        if key == 'UNSLOTH_RETURN_LOGITS' and value != '1':
            print(f"Disallowing setting UNSLOTH_RETURN_LOGITS to {value}")
            return
        return _orig_setitem(self, key, value)
    os.environ.__class__.__setitem__ = debug_setitem

patch_environ()
Any solution?