
Unification of shape checks

Open msluszniak opened this issue 10 months ago • 9 comments

As mentioned here.

msluszniak avatar Apr 06 '24 13:04 msluszniak

Is this really completed?

krstopro avatar May 16 '24 20:05 krstopro

Errr, wrong issue.

josevalim avatar May 16 '24 21:05 josevalim

At the same time, I am not sure if there is something to unify here? I think checking for shapes should be done per module, as they can provide better error messages within their context?

josevalim avatar May 16 '24 21:05 josevalim

> At the same time, I am not sure if there is something to unify here? I think checking for shapes should be done per module, as they can provide better error messages within their context?

Could be the case. My idea was to have several functions for the checks that keep repeating, e.g. that x is of rank 2, that x and y agree on the first axis, etc.

krstopro avatar May 16 '24 21:05 krstopro
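For illustration, the shared helpers proposed above might look something like this (a minimal sketch; the module and function names are hypothetical, not part of Scholar's API):

```elixir
defmodule Scholar.Shared.ShapeChecks do
  # Hypothetical shared shape-check helpers; names are illustrative only.

  # Raises unless `tensor` has exactly the given rank.
  def check_rank!(tensor, rank, name) do
    actual = Nx.rank(tensor)

    if actual != rank do
      raise ArgumentError, "expected #{name} to have rank #{rank}, got: #{actual}"
    end

    tensor
  end

  # Raises unless `x` and `y` agree on the first axis.
  def check_same_leading_axis!(x, y) do
    if Nx.axis_size(x, 0) != Nx.axis_size(y, 0) do
      raise ArgumentError,
            "expected tensors to agree on the first axis, " <>
              "got: #{Nx.axis_size(x, 0)} and #{Nx.axis_size(y, 0)}"
    end

    :ok
  end
end
```

As the next comment points out, the trade-off is that such shared helpers can only raise generic messages, while a per-module check can name the exact inputs involved.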

The problem is that the error message ends up being generic: "expected tensors to have matching leading dimensions", while today we can say "expected the data to have the same dimension as the query", etc. We actually had those helpers inside Nx itself at the beginning, but we removed them for that reason once we introduced case. :)

josevalim avatar May 16 '24 22:05 josevalim

I am not sure whether this comment is fully on-topic, but for linear models I find the behaviour of the API to be inconsistent when it comes to data shapes.

You can fit all linear models with a target shaped {n_samples, 1}, but when you call predict, some models raise an ArgumentError with the following message: "dot/zip expects shapes to be compatible, dimension 1 of left-side (1) does not equal dimension 0 of right-side (10)". The remaining models have a multioutput option, so they can take an {n_samples, 1} shaped target, which leads to this inconsistency.

The models yielding this error are:

  • Scholar.Linear.BayesianRidgeRegression
  • Scholar.Linear.IsotonicRegression
  • Scholar.Linear.LinearRegression
  • Scholar.Linear.SVM

The models not yielding this error are:

  • Scholar.Linear.PolynomialRegression
  • Scholar.Linear.RidgeRegression

A livebook showcasing this behaviour of linear models is available here.

I think all linear models should work fine with {n_samples, 1} and {n_samples} shaped targets. If you all agree with that, I would be happy to work on this issue of linear models.

JoaquinIglesiasTurina avatar Jun 14 '24 12:06 JoaquinIglesiasTurina
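For reference, a minimal script reproducing the behaviour described above might look like this (the data is illustrative, and the exact error text can vary between Scholar versions):

```elixir
x = Nx.iota({10, 3}, type: :f32)
# Target shaped {n_samples, 1}:
y = Nx.iota({10, 1}, type: :f32)

# Fitting succeeds:
model = Scholar.Linear.LinearRegression.fit(x, y)

# Prediction raises for the models listed above, e.g.:
# ** (ArgumentError) dot/zip expects shapes to be compatible, ...
Scholar.Linear.LinearRegression.predict(model, x)

# With a flat {n_samples} target, predict works as expected:
model = Scholar.Linear.LinearRegression.fit(x, Nx.flatten(y))
Scholar.Linear.LinearRegression.predict(model, x)
```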

Sure, that is really strange behaviour and not supposed to happen, so feel free to work on it. I'd appreciate that :)

msluszniak avatar Jun 14 '24 14:06 msluszniak

I've taken a further look at the issue with linear models. I think there is a decision to be made on how to handle the situation:

  • Should the output for {n_samples, 1} and {n_samples} shaped targets be the same, or should it be different?

Meaning, RidgeRegression returns {1, n_features} shaped coefficients for {n_samples, 1} shaped targets, while for {n_samples} targets it returns {n_features} shaped coefficients. This matches the behaviour of scikit-learn's ordinary least squares. It can be achieved by changing some Nx.dot/2 calls to Nx.dot/4.
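To make that last point concrete, here is a small illustration (shapes made up for the example) of how Nx.dot/4, which takes explicit contraction axes, accommodates both target shapes:

```elixir
x = Nx.iota({5, 3}, type: :f32)
y = Nx.iota({5, 1}, type: :f32)

# For matrices, Nx.dot/2 contracts the last axis of the left side with
# the first axis of the right side, so it is sensitive to the shape of y:
Nx.dot(Nx.transpose(x), y)
#=> shape {3, 1}

# Nx.dot/4 names the contraction axes explicitly and works for both
# {n_samples, 1} and {n_samples} targets:
Nx.dot(x, [0], y, [0])
#=> shape {3, 1}

Nx.dot(x, [0], Nx.flatten(y), [0])
#=> shape {3}
```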

A different approach would be to shape-check the target and flatten it prior to fitting the model. This would ensure that {n_samples, 1} and {n_samples} shaped targets yield equally shaped coefficients (see the sketch after this list). This approach has some points of significance:

  • Inconsistency with scikit-learn's API (I don't know if this is a problem).
  • It raises the question: how do we handle linear models with multioutput options?
  • It's potentially a breaking change to RidgeRegression's API.
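A minimal sketch of that shape-check-and-flatten step (the module and function are hypothetical, not Scholar's API):

```elixir
defmodule TargetCheck do
  # Hypothetical helper: accepts {n_samples} or {n_samples, 1} targets
  # and always returns a flat {n_samples} tensor.
  def flatten_target!(y) do
    case Nx.shape(y) do
      {_n_samples} ->
        y

      {_n_samples, 1} ->
        Nx.flatten(y)

      shape ->
        raise ArgumentError,
              "expected target to have shape {n_samples} or {n_samples, 1}, " <>
                "got: #{inspect(shape)}"
    end
  end
end
```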

Mathematically, I favor the second option: when linear models are described on paper, the target is a column vector, and fitting a column vector versus a flat vector should yield the same results. However, I feel that is the riskier approach; it can lead to some inconsistencies, and leaving things as they are is the safer choice.

Looking forward to your comments.

JoaquinIglesiasTurina avatar Jun 22 '24 19:06 JoaquinIglesiasTurina

I think we can try the second direction, and if it introduces some breaking changes to RidgeRegression then we will fix them.

msluszniak avatar Jun 26 '24 17:06 msluszniak