
Borders are potentially incorrectly inverted


Describe the bug

In the fit method of the TabPFNRegressor class, the borders in renormalized_criterion_ are initialized by multiplying the pre-trained borders by the target standard deviation and adding the target mean (self.bardist_.borders * self.y_train_std_ + self.y_train_mean_). Before that, y was standardized, effectively applying a target transformer outside of the config.

Afterwards, in the predict method, two cases can happen.

  1. If the config has no target_transform, the original pretrained borders are used as the borders for translate_probs_across_borders.
  2. However, if there is a target_transform in the config, the pre-trained borders are inverted with the help of the target_transform (see the sketch after this list).
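
A rough sketch of that branching, paraphrased from the description above (the helper name select_output_borders, the target_transform argument, and the inverse_transform call are illustrative assumptions, not the actual TabPFN source):

import numpy as np

def select_output_borders(pretrained_borders, target_transform=None):
    # Choose the borders that the predicted bucket probabilities are mapped onto.
    if target_transform is None:
        # Case 1: reuse the pretrained borders, which live in standardized space.
        return pretrained_borders
    # Case 2: push the pretrained borders back through the fitted target
    # transformer so that they live in the original target space.
    return target_transform.inverse_transform(
        np.asarray(pretrained_borders).reshape(-1, 1)
    ).reshape(-1)

The result would then play the role of borders_t for translate_probs_across_borders.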

I see two possible problems:

  1. Shouldn't the borders (the borders_t variable) in the case where target_transform is None be set to self.renormalized_criterion.borders, to account for the range of y?
  2. How is the standardization of y outside of the target transformer taken into account in the second case, where a target transformer is provided in the config?

Steps/Code to Reproduce

No response

Expected Results

No response

Actual Results

No response

Versions


Manuel3567 · Feb 27 '25

Dear Manuel,

The edge cases of the target transformer are tricky and a bit hard to reason about. An example of a case where the code fails would be extremely helpful: it would tell us whether there really is an issue, show when it arises, and make it easy to verify a fix. Do you have a failing example at hand?

Best!

noahho · Mar 13 '25

Hi Manuel,

Thank you for raising this question. After tracing through the code paths, we believe the border handling is consistent.

  1. In fit(), the target y is standardized and the “normalized” bar distribution is stored with its borders mapped back to the original scale of y:
# Store the training-target statistics (the epsilon guards against division by zero).
mean, std = np.mean(y), np.std(y)
self.y_train_mean_ = mean.item()
self.y_train_std_ = std.item() + 1e-20
# Standardize y before it is passed to the model.
y = (y - self.y_train_mean_) / self.y_train_std_
# Keep a bar distribution whose borders are rescaled back to the original scale of y.
self.normalized_bardist_ = FullSupportBarDistribution(
    self.bardist_.borders * self.y_train_std_ + self.y_train_mean_
).float()

  2. In predict(), when target_transform is None, we keep the pretrained borders (std_borders). These already match the model’s standardized logits, so no extra scaling is needed.

  3. The final conversion from logits to predictions uses self.normalized_bardist_, which rescales the results back to the original mean and std of y. This ensures the output remains in the correct space.

So while it may seem that predict() disregards the scaling, the combination of standardized training and the renormalization step actually makes the two paths consistent. If you have a concrete example where the predictions differ from what you expect, feel free to share it so we can dig deeper.
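
For a concrete sanity check of that argument, here is a minimal numeric sketch (plain NumPy with toy values, not the actual FullSupportBarDistribution code) showing that taking the expectation over the standardized borders and rescaling afterwards gives the same point prediction as taking the expectation directly over the renormalized borders:

import numpy as np

borders = np.linspace(-3.0, 3.0, 11)                      # pretrained borders (standardized space)
probs = np.random.default_rng(0).dirichlet(np.ones(10))   # toy per-bucket probabilities
y_mean, y_std = 5.0, 2.0                                   # statistics of the raw training target

def bucket_mean(borders, probs):
    # Expectation of a piecewise-uniform distribution: bucket midpoint times bucket mass.
    mids = (borders[:-1] + borders[1:]) / 2
    return float((probs * mids).sum())

# Path A: predict in standardized space, then rescale the point prediction.
pred_a = bucket_mean(borders, probs) * y_std + y_mean

# Path B: rescale the borders first (the renormalized bar distribution),
# then take the expectation directly in the original space of y.
pred_b = bucket_mean(borders * y_std + y_mean, probs)

assert np.isclose(pred_a, pred_b)

The two paths agree because the bucket expectation is affine in the borders, so multiplying the borders by y_train_std_ and adding y_train_mean_ is interchangeable with applying the same affine map to the prediction afterwards.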

Best regards!

noahho · Jun 25 '25