ValueError: [conv] Expect the input channels in the input and weight array to match

Open · arogister opened this issue 10 months ago • 1 comment

The Vision Transformer examples in the README don't work for me. I get this error in both cases:

File ~/Documents/mlx-embeddings/.venv/lib/python3.12/site-packages/mlx/nn/layers/convolution.py:157, in Conv2d.__call__(self, x)
    156 def __call__(self, x):
--> 157     y = mx.conv2d(
    158         x, self.weight, self.stride, self.padding, self.dilation, self.groups
    159     )
    160     if "bias" in self:
    161         y = y + self.bias

ValueError: [conv] Expect the input channels in the input and weight array to match but got shapes - input: (1,384,3,384) and weight: (1152,14,14,3)

ValueError: [conv] Expect the input channels in the input and weight array to match but got shapes - input: (2,384,3,384) and weight: (1152,14,14,3)
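
For anyone reading the shapes in these messages: MLX's `conv2d` takes channels-last input `(N, H, W, C_in)` and a weight of shape `(C_out, kH, kW, C_in)`, so the check compares the last axis of each array. Here that is 384 against 3. The sketch below uses NumPy only to illustrate the shape arithmetic. It is not the fix from the linked comment, and the helper `channels_match` is a hypothetical name. It also shows one way a shape like `(1, 384, 3, 384)` can arise, namely applying an NCHW-to-NHWC transpose to a batch that is already channels-last. Whether that is what happens in the README examples is an assumption.

```python
import numpy as np


def channels_match(input_shape, weight_shape):
    """Hypothetical helper: compare the channel axes under the channels-last
    convention (input: N, H, W, C_in; weight: C_out, kH, kW, C_in).
    Assumes groups == 1."""
    return input_shape[-1] == weight_shape[-1]


# Weight shape copied from the error message.
weight_shape = (1152, 14, 14, 3)

# A channels-first batch (N, C, H, W), as many image pipelines produce.
nchw = np.zeros((1, 3, 384, 384), dtype=np.float32)

# Moving the channel axis to the end gives the layout conv2d expects.
nhwc = np.transpose(nchw, (0, 2, 3, 1))
print(nhwc.shape, channels_match(nhwc.shape, weight_shape))

# Applying the same permutation a second time, to data that is already
# channels-last, reproduces the input shape reported in the error.
double_transposed = np.transpose(nhwc, (0, 2, 3, 1))
print(double_transposed.shape, channels_match(double_transposed.shape, weight_shape))
```

Printing `pixel_values.shape` (or whatever the preprocessed image batch is called in your script) just before the model call shows which of these layouts is actually being passed in.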

Apr 04 '25 17:04 arogister

Hey @arogister

I will fix the README examples later today.

For now, here is the solution: https://github.com/Blaizzy/mlx-embeddings/issues/18#issuecomment-2777111054

Apr 04 '25 17:04 Blaizzy