
Question about using quantile normalization in pi0.5 when fine-tuning

emoPointer opened this issue 1 month ago • 5 comments

Hello,

First, thank you for your excellent work!

I am currently fine-tuning the pi0.5 model on a custom dataset collected with an ARX arm. During this process, I've encountered a potential issue with the use_quantile_norm parameter that significantly impacts performance.

📝 Description of the Issue

I have observed two distinct outcomes:

  1. When I manually modify the code to disable use_quantile_norm (set it to False), my fine-tuning process works very well, and the model achieves good performance.
  2. However, when use_quantile_norm is enabled (which appears to be the default for pi0.5), the model's performance during fine-tuning is much worse. The performance is shown in the video below.

https://github.com/user-attachments/assets/825fdfec-f5d3-44e6-abeb-8d6805903de9

| use_quantile_norm = False |

https://github.com/user-attachments/assets/398202c8-49a0-4c82-87ef-229c92abc720

| use_quantile_norm = True |

Evidence in Code

I investigated the codebase and found that this behavior seems to be hard-coded. In the file openpi/src/openpi/training/config.py, inside the DataConfigFactory.create_base_config method, the parameter is set as follows:

# openpi/src/openpi/training/config.py (around line 186)

    use_quantile_norm=model_config.model_type != ModelType.PI0,
)

This line of code automatically sets use_quantile_norm to True for any model that is not ModelType.PI0 (which includes pi0.5). This prevents users from disabling it during fine-tuning without altering the core code.
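For context, here is a rough sketch of how the two normalization schemes differ. This is illustrative plain NumPy, not openpi's actual implementation, which may differ in details such as the exact quantile levels and epsilon handling:

```python
import numpy as np

def zscore_normalize(x, mean, std):
    # Roughly what use_quantile_norm=False implies: standardize with mean/std.
    return (x - mean) / (std + 1e-6)

def quantile_normalize(x, q01, q99):
    # Roughly what use_quantile_norm=True implies: map the
    # [1st, 99th]-percentile range of the data to [-1, 1].
    return 2.0 * (x - q01) / (q99 - q01 + 1e-6) - 1.0

# Dummy batch of 7-DoF actions standing in for real robot data.
actions = np.random.default_rng(0).normal(size=(1000, 7))
q01, q99 = np.quantile(actions, [0.01, 0.99], axis=0)
normed = quantile_normalize(actions, q01, q99)
# By construction, ~98% of values land inside [-1, 1]; the tails fall outside.
```

The key practical difference is that the quantile variant is anchored to percentile statistics of the dataset, so its behavior depends on how well those percentiles are estimated.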

💡 My Hypothesis

My suspicion is that this hard-coded value, while potentially correct for pre-training, may not be suitable for fine-tuning scenarios.

My hypothesis is that for fine-tuning, use_quantile_norm should ideally be disabled (False).

The reason is that fine-tuning datasets are typically much smaller than the large-scale pre-training datasets. Quantile statistics (e.g., the 1st/99th percentiles) estimated from a small dataset are noisy, so the resulting normalization may be too aggressive, clipping or distorting a significant amount of valid data (legitimate actions and observations). This could explain the severe performance degradation I am observing.
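To illustrate the small-sample concern, here is a minimal NumPy sketch (not openpi code) showing that the [1st, 99th]-percentile window fluctuates far more when estimated from a small dataset than from a large one, so the normalization range inferred during fine-tuning can misrepresent the true action distribution:

```python
import numpy as np

rng = np.random.default_rng(42)

def quantile_range(samples):
    # Width of the [1st, 99th]-percentile window used for normalization.
    q01, q99 = np.quantile(samples, [0.01, 0.99])
    return q99 - q01

# Re-estimate the window many times from small vs. large datasets.
small_ranges = [quantile_range(rng.normal(size=200)) for _ in range(200)]
large_ranges = [quantile_range(rng.normal(size=20_000)) for _ in range(200)]

# Small-sample estimates scatter much more, so a small fine-tuning dataset
# can easily yield a window that clips or over-compresses valid actions.
print(f"std of window width, n=200:   {np.std(small_ranges):.3f}")
print(f"std of window width, n=20000: {np.std(large_ranges):.3f}")
```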

🤔 Question and Proposed Solution

My questions for the maintainers are:

  • Is this hard-coded behavior intended, even for fine-tuning?
  • Would you be open to making this parameter configurable, or defaulting it to False specifically for fine-tuning tasks?

If the team agrees that this is a valid concern, I would be happy to submit a Pull Request to address it.

Thank you for your time and consideration!

emoPointer • Oct 31 '25 17:10