
[Bug] os.environ["UNSLOTH_RETURN_LOGITS"] = "1" becomes unset to "0" once I start to train

Open charvishukla-bc opened this issue 8 months ago • 19 comments

Hello!

I have been fine-tuning Gemma 3. During training, I want to validate against a custom metric. To avoid the following error, I set os.environ["UNSLOTH_RETURN_LOGITS"] = "1":

TypeError: Unsupported types (<class 'unsloth_compiled_module_gemma3.EmptyLogits'>) passed to `_pad_across_processes`. Only nested list/tuple/dicts of objects that are valid for `is_torch_tensor` should be passed.
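For reference, this is how I set the flag, at the very top of the script before any unsloth import (minimal sketch; the point is only that it runs before the trainer is built):

```python
import os

# Must be set before unsloth is imported, so the compiled modules pick it up
os.environ["UNSLOTH_RETURN_LOGITS"] = "1"

# ...then the usual imports and setup follow, e.g.:
# from unsloth import FastModel
# model, tokenizer = FastModel.from_pretrained(...)
```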

I am using the following configuration:

    config = SFTConfig(
        per_device_train_batch_size=self.train_args.get("batch_size", 4),
        gradient_accumulation_steps=self.train_args.get("grad_accum", 8),
        gradient_checkpointing=True,
        gradient_checkpointing_kwargs={"use_reentrant": False},
        max_grad_norm=0.3,
        warmup_ratio=0.03,
        learning_rate=self.train_args.get("lr", 2e-4),
        logging_steps=10,

        save_strategy="steps",
        save_steps=10,

        eval_strategy="steps",
        eval_steps=self.train_args.get("eval_steps", 10),
        load_best_model_at_end=self.train_args.get("load_best_model_at_end", True),
        metric_for_best_model=self.train_args.get("metric_for_best_model", "top1_accuracy"),
        greater_is_better=self.train_args.get("greater_is_better", True),

        optim=self.train_args.get("optim", "adamw_torch_fused"),
        weight_decay=0.01,
        lr_scheduler_type="cosine",
        seed=self.train_args.get("seed", 3407),
        output_dir=self.output_dir,
        report_to="tensorboard",
        run_name="gemma_4b_lora_run_2",
        logging_dir="gemma_4b_lora_run_2",
        # max_seq_length=20000,
        remove_unused_columns=False,
        dataset_text_field="",
        dataset_kwargs={"skip_prepare_dataset": True},
    )

    trainer = SFTTrainer(
        model=self.model,
        predict_with_generate=True,
        train_dataset=self.train_dataset,
        eval_dataset=self.val_dataset,
        compute_metrics=self.compute_metrics,
        processing_class=self.processor.tokenizer,
        data_collator=self.collator,
        args=config,
    )
    train_output = trainer.train()

Before running training, I check that the environment variable is set correctly, and it is (see attached screenshot).

However, it seems to have changed during training and is back to "0" (see attached screenshot).

What can I do here? I saw another issue about this, but it seemed like no one found a solution.

charvishukla-bc avatar Jul 30 '25 17:07 charvishukla-bc

I am facing a similar issue

zaidalyafeai avatar Jul 30 '25 19:07 zaidalyafeai

os.environ only lasts for the lifetime of the Python process; once it crashes, it resets. The bug, however, is that even if you set UNSLOTH_RETURN_LOGITS to "1", Unsloth ignores it completely and doesn't return logits anyway! I've had the same issue with both Gemma 3 and Qwen3. I'm linking the full relevant error (though the core problem is that an EmptyLogits object is always returned, no matter what):

`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
Unsloth: Will smartly offload gradients to save VRAM!
{'loss': 3.8112, 'grad_norm': 5.355050563812256, 'learning_rate': 0.0, 'epoch': 0.0}
{'loss': 3.8743, 'grad_norm': 5.253603935241699, 'learning_rate': 1e-05, 'epoch': 0.0}
{'loss': 4.029, 'grad_norm': 4.6202392578125, 'learning_rate': 2e-05, 'epoch': 0.01}
{'loss': 3.9538, 'grad_norm': 4.3493971824646, 'learning_rate': 1.999585749792875e-05, 'epoch': 0.01}
{'loss': 3.7101, 'grad_norm': 3.653374195098877, 'learning_rate': 1.99917149958575e-05, 'epoch': 0.01}
  0%|▏                                                                                                                                   | 5/4830 [00:44<7:07:45,  5.32s/it]Unsloth: Not an error, but Qwen3ForCausalLM does not accept `num_items_in_batch`.
Using gradient accumulation will be very slightly less accurate.
Read more on gradient accumulation issues here: https://unsloth.ai/blog/gradient
Traceback (most recent call last):
  File "/home/user/neuraltranslate-nahuatl/qlora.py", line 153, in <module>
    trainer_stats = trainer.train()
                    ^^^^^^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 2237, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/utils/memory.py", line 174, in decorator
    return function(batch_size, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 402, in _fast_inner_training_loop
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 3133, in _maybe_log_save_evaluate
    metrics = self._evaluate(trial, ignore_keys_for_eval)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 3082, in _evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 4249, in evaluate
    output = eval_loop(
             ^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 4466, in evaluation_loop
    logits = self.accelerator.pad_across_processes(logits, dim=1, pad_index=-100)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/accelerator.py", line 2938, in pad_across_processes
    return pad_across_processes(tensor, dim=dim, pad_index=pad_index, pad_first=pad_first)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 407, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 677, in pad_across_processes
    return recursively_apply(
           ^^^^^^^^^^^^^^^^^^
  File "/home/user/neuraltranslate-nahuatl/.venv/lib/python3.12/site-packages/accelerate/utils/operations.py", line 128, in recursively_apply
    raise TypeError(
TypeError: Unsupported types (<class 'unsloth_compiled_module_qwen3.EmptyLogits'>) passed to `_pad_across_processes`. Only nested list/tuple/dicts of objects that are valid for `is_torch_tensor` should be passed.
wandb:
wandb: :rocket: View run trainer_output at: https://wandb.ai/thermostatic/huggingface/runs/704epdbe
wandb: Find logs at: wandb/run-20250731_021242-704epdbe/logs

This bug didn't happen in unsloth==2025.6.2: I fine-tuned a model and used a custom metric to evaluate it without problems. If I remember right, setting the environment variable wasn't even required; it worked out of the box.

EDIT1: After some research I found that the unsloth package wasn't the issue; it's an issue with unsloth_zoo (or at least it seems so: unsloth_zoo==2025.6.2 fixes the bug with the same unsloth version). I'm trying to find the root cause now.

EDIT2: As a temporary workaround, using unsloth_zoo==2025.7.1 with unsloth<=2025.6.5 fixes the issue (not sure if it causes other problems, though); the breaking change was in 2025.7.2.

EDIT3: Indeed, it even worked with the environment variable UNSLOTH_RETURN_LOGITS = "0".

Sekinal avatar Jul 31 '25 03:07 Sekinal

> EDIT3: Indeed, it even worked with the environment variable UNSLOTH_RETURN_LOGITS = "0".

That's probably because, once you reverted to unsloth_zoo==2025.7.1, the automated detection of compute_metrics started working again here in rl.py (which, by the way, works by setting the environment variable UNSLOTH_RETURN_LOGITS='1' when compute_metrics is detected; so I'm guessing unsloth_zoo==2025.7.1 lets the environment variable persist).

davidsvaughn avatar Jul 31 '25 19:07 davidsvaughn

Ah! Unfortunately, that fix does not work for me!

I used the versions of unsloth and unsloth_zoo that you mentioned (see attached screenshot).

Before running training, I also checked that UNSLOTH_RETURN_LOGITS='1' (see attached screenshot).

However, even with this setup, I get this error: TypeError: Unsupported types (<class 'unsloth_compiled_module_gemma3.EmptyLogits'>) passed to `_pad_across_processes`. Only nested list/tuple/dicts of objects that are valid for `is_torch_tensor` should be passed.

Here's my config now:

      eval_config = {
          "eval_strategy": "steps",
          "eval_steps": self.train_args.get("eval_steps", 1),
          "do_eval": True,
      }

      config = SFTConfig(
          num_train_epochs=self.train_args.get("epochs", 3),
          per_device_train_batch_size=self.train_args.get("batch_size", 4),
          gradient_accumulation_steps=self.train_args.get("grad_accum", 8),
          gradient_checkpointing=True,
          gradient_checkpointing_kwargs={"use_reentrant": False},
          max_grad_norm=0.3,
          warmup_ratio=0.03,
          learning_rate=self.train_args.get("lr", 2e-4),
          logging_steps=1,

          save_strategy="steps",
          save_steps=10,

          # loading the best model at end
          load_best_model_at_end=self.train_args.get("load_best_model_at_end", True),
          metric_for_best_model=self.train_args.get("metric_for_best_model", "eval_loss"),
          greater_is_better=self.train_args.get("greater_is_better", False),

          optim=self.train_args.get("optim", "adamw_torch_fused"),
          weight_decay=0.01,
          lr_scheduler_type="cosine",
          seed=self.train_args.get("seed", 3407),
          output_dir=self.output_dir,
          report_to="tensorboard",
          run_name="gemma_4b_lora_run_3",
          logging_dir="gemma_4b_lora_run_3",

          # vision-specific args
          remove_unused_columns=False,
          dataset_text_field="",
          dataset_kwargs={"skip_prepare_dataset": True},
          dataset_num_proc=self.train_args.get("dataset_num_proc", 4),
          # max_seq_length=self.train_args.get("max_seq_length", 20000),
          **eval_config,
      )

Where `compute_metrics` is defined as `def compute_metrics(self, p: EvalPrediction) -> Dict[str, float]` and contains custom logic for calculating accuracy.
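Simplified, it looks roughly like this (sketch only; the real version has more logic, and I'm assuming predictions arrive as standard (batch, seq, vocab) logits with -100 label padding):

```python
import numpy as np

def compute_metrics(p):
    # p mimics transformers' EvalPrediction: has .predictions and .label_ids
    logits, labels = p.predictions, p.label_ids
    preds = np.argmax(logits, axis=-1)
    mask = labels != -100  # ignore padded label positions
    acc = (preds[mask] == labels[mask]).mean()
    return {"top1_accuracy": float(acc)}
```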

I wonder why it worked for you! What compute_metrics did you use?

charvishukla-bc avatar Jul 31 '25 20:07 charvishukla-bc

@charvishukla-bc you are welcome to try this tweak I made to unsloth-zoo. It solved the problem for me (though I still had to manually set UNSLOTH_RETURN_LOGITS='1', since the auto-detection of compute_metrics still isn't fixed). I would create a PR, but I'm still not satisfied with this solution: at compile time it hard-codes the user's setting of UNSLOTH_RETURN_LOGITS into the dynamically generated modules, since the environment-variable method wasn't working. After that, I also had to uncomment several commented-out lines for logit computation (see diff).

davidsvaughn avatar Jul 31 '25 22:07 davidsvaughn

Really weird! I'll have to look into it too... if it's of any help here's the repo with all of the settings that have worked for me: https://github.com/Sekinal/neuraltranslate-en-es/tree/master

Currently using full.py with no issues in a B200.

Sekinal avatar Aug 01 '25 00:08 Sekinal

Hello @charvishukla-bc @zaidalyafeai, if you could provide a script that reproduces the issue, that would be greatly appreciated.

mmathew23 avatar Aug 01 '25 19:08 mmathew23

For me the flag is set as os.environ["UNSLOTH_RETURN_LOGITS"] = "1", but in unsloth_compiled_cache/unsloth_compiled_module_gpt_oss.py (line 725) logits are not generated and are set to EmptyLogits:

elif self.loss_function.__name__.endswith("ForCausalLMLoss") and labels is not None:
    lm_head_weight = self.lm_head.weight
    lm_head_bias = getattr(self.lm_head, "bias", None)

    # ========= NEW fused =========
    _hidden_states = hidden_states[:, slice_indices, :]
    torch._dynamo.mark_dynamic(_hidden_states, 1)
    torch._dynamo.mark_dynamic(labels, 1)
    loss = unsloth_compiled_fused_ce_loss_function(
        hidden_states        = _hidden_states,
        lm_head_weight       = lm_head_weight,
        lm_head_bias         = lm_head_bias,
        output_labels        = labels,
        logit_scale_multiply = () if () != () else 0,
        logit_scale_divide   = () if () != () else 0,
        logit_softcapping    = () if () not in (None, (),) else 0,
        vocab_size           = (self.vocab_size),
        n_items              = n_items,
        requires_grad_       = requires_grad_,
    )

    # ========= OLD non fused =========
    # logits = self.lm_head(hidden_states[:, slice_indices, :].to(lm_head_weight.device))
    # torch._dynamo.mark_dynamic(logits, 1)
    # torch._dynamo.mark_dynamic(labels, 1)
    # loss = unsloth_compiled_ce_loss_function(
    #     output_logits        = logits,
    #     output_labels        = labels,
    #     logit_scale_multiply = () if () != () else 0,
    #     logit_scale_divide   = () if () != () else 0,
    #     logit_softcapping    = () if () not in (None, (),) else 0,
    #     vocab_size           = (self.vocab_size),
    #     n_items              = n_items,
    #     requires_grad_       = requires_grad_,
    # )
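For anyone following along: the fused path above computes the loss straight from the hidden states and lm_head weights, one logit row at a time, so the full logits tensor is never materialized; that's why downstream code expecting logits only ever sees the EmptyLogits placeholder. A toy pure-Python illustration of the idea (nothing like the real fused kernel, just the concept):

```python
import math

def ce_from_logits(logits, label):
    # ordinary cross-entropy: needs the whole logits vector in memory
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return lse - logits[label]

def ce_fused(hidden, lm_head_rows, label):
    # "fused" style: stream one logit at a time with a running logsumexp,
    # never storing the full vocab-sized logits vector
    m, s, target = -math.inf, 0.0, 0.0
    for i, row in enumerate(lm_head_rows):
        z = sum(h * w for h, w in zip(hidden, row))  # one dot product
        if i == label:
            target = z
        if z > m:
            s = s * math.exp(m - z) + 1.0
            m = z
        else:
            s += math.exp(z - m)
    return (m + math.log(s)) - target
```

Both give the same loss; the fused path simply trades the logits tensor for recomputation, which is why asking it for logits afterwards can't work unless the non-fused path is taken.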

devlup avatar Aug 10 '25 17:08 devlup

Facing same issue

steveepreston avatar Oct 14 '25 22:10 steveepreston

Hello, custom evaluation with UNSLOTH_RETURN_LOGITS="1" should work now; please update unsloth again via pip install --upgrade --force-reinstall --no-deps unsloth_zoo unsloth

pluesclues avatar Oct 22 '25 01:10 pluesclues

This doesn't seem to have fixed the issue. Running unsloth==2025.10.9 and unsloth_zoo==2025.10.10.

jrobis avatar Oct 23 '25 20:10 jrobis

Hello @jrobis, apologies for that. Do you mind attaching a sample script? I ran a compute_metrics function that needed logits, and they were returned when I enabled the flag; I tried this with "unsloth/gemma-3n-E4B-it", both with multi-modal input and text only. Please let me know, thank you.

pluesclues avatar Oct 23 '25 21:10 pluesclues

I am also getting this error: the flag is being unset by both trainer.evaluate() and trainer.train(). It happens with the docker image unsloth/unsloth:latest but not with unsloth/unsloth:stable.

In the case of unsloth:latest, the pip package versions are:

> pip freeze | grep unsloth
unsloth==2025.10.9
unsloth_zoo==2025.10.10

Here is a debugging log that shows UNSLOTH_RETURN_LOGITS being set to 0.

> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(93)train()
-> trainer.evaluate()
(Pdb) os.environ['UNSLOTH_RETURN_LOGITS']
'1'
(Pdb) n
Unsloth: Not an error, but Qwen3ForCausalLM does not accept `num_items_in_batch`.
Using gradient accumulation will be very slightly less accurate.
Read more on gradient accumulation issues here: https://unsloth.ai/blog/gradient
100%|████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00,  5.82it/s]
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(94)train()
-> if logp_datasets:
(Pdb) os.environ['UNSLOTH_RETURN_LOGITS']
'0'
(Pdb) n
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(95)train()
-> os.environ['UNSLOTH_RETURN_LOGITS'] = '1'
(Pdb) n
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(96)train()
-> trainer.train()
(Pdb) os.environ['UNSLOTH_RETURN_LOGITS']
'1'
(Pdb) n
The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None}.
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 88 | Num Epochs = 4 | Total steps = 24
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 8
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 8 x 1) = 16
 "-____-"     Trainable parameters = 66,060,288 of 4,088,528,384 (1.62% trained)
NotImplementedError: Unsloth: Logits are empty from 2024.11 onwards. To get raw logits again, please set the environment variable `UNSLOTH_RETURN_LOGITS` to `"1" BEFORE starting to train ie before `trainer.train()`. For example:

import os
os.environ['UNSLOTH_RETURN_LOGITS'] = '1'
trainer.train()

No need to restart your console - just add `os.environ['UNSLOTH_RETURN_LOGITS'] = '1'` before trainer.train() and re-run the cell!
> /Users/nielswarncke/Documents/openweights/openweights/jobs/unsloth/training.py(96)train()
-> trainer.train()
(Pdb) 
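For now I'm pinning the flag around each trainer call with a small helper (a stopgap of my own, not an unsloth API):

```python
import os

def with_logits_flag(fn, *args, **kwargs):
    # stopgap: something inside train()/evaluate() flips the flag to "0",
    # so force it to "1" before the call and restore it afterwards
    os.environ["UNSLOTH_RETURN_LOGITS"] = "1"
    try:
        return fn(*args, **kwargs)
    finally:
        os.environ["UNSLOTH_RETURN_LOGITS"] = "1"

# usage: with_logits_flag(trainer.evaluate); with_logits_flag(trainer.train)
```

This obviously doesn't help if the flag is flipped mid-call before the logits are needed, but it at least keeps consecutive evaluate/train calls from poisoning each other.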

nielsrolf avatar Oct 31 '25 13:10 nielsrolf

@nielsrolf @pluesclues

Yes, the unsloth packages in the latest tag of the unsloth container docker image did not yet contain the actual fix. We will be updating the container to use the latest unsloth + unsloth_zoo PyPI release, which contains all the necessary fixes.

thanks

rolandtannous avatar Oct 31 '25 15:10 rolandtannous

@rolandtannous Man please include FSDP mode too

steveepreston avatar Oct 31 '25 15:10 steveepreston

The container image has been updated to 2025.10.13, which should contain the fix for this issue. Please make sure to pull the latest image:

docker pull unsloth/unsloth

Note: if you have unsloth==2025.10.9 and unsloth_zoo==2025.10.10 in your environment, you do not have the required fix. If you are in a local Python environment:

pip install --force-reinstall --no-deps unsloth-zoo unsloth

Don't do this if you are using the docker container image (just pull the latest image); the latest unsloth PyPI versions are already built in.

rolandtannous avatar Nov 02 '25 11:11 rolandtannous

Is there a basic tutorial on how to launch the Unsloth docker image on Kaggle?

steveepreston avatar Nov 07 '25 15:11 steveepreston

I still seem to be hitting this bug in 2025.11.x:

!pip list | grep unsloth

unsloth                                  2025.11.2
unsloth_zoo                              2025.11.3

richiejp avatar Nov 10 '25 17:11 richiejp

Here is the problem:

  File "/tmp/ipython-input-773422404.py", line 1, in <cell line: 0>
    trainer_stats = trainer.train()
  File "/content/unsloth_compiled_cache/UnslothSFTTrainer.py", line 53, in wrapper
    self.model.for_training()
  File "/usr/local/lib/python3.12/dist-packages/unsloth/models/vision.py", line 1234, in for_training
    os.environ["UNSLOTH_RETURN_LOGITS"] = "0"
  File "/tmp/ipython-input-215650878.py", line 9, in debug_setitem

`for_training` in vision.py sets it to zero.

EDIT:

I'm not sure how for_training() is getting called; it appears to be part of some generated code. But I was able to use the following workaround:

import os

# capture the real __setitem__ exactly once; re-running this cell must not
# re-capture the patched version, or debug_setitem would call itself
if "_orig_setitem" not in globals():
    _orig_setitem = os.environ.__class__.__setitem__

def patch_environ():
    def debug_setitem(self, key, value):
        if key == "UNSLOTH_RETURN_LOGITS" and value != "1":
            print(f"Disallowing setting UNSLOTH_RETURN_LOGITS to {value}")
            return
        return _orig_setitem(self, key, value)

    os.environ.__class__.__setitem__ = debug_setitem

patch_environ()

richiejp avatar Nov 11 '25 07:11 richiejp

Any solution?

steveepreston avatar Nov 21 '25 16:11 steveepreston