Georgi Gerganov
I see - should be fixed on `master` now. Thanks for reporting this
> I guess this is the reason why 1D tensors needs to be stored as F32 in the model files. Not exactly - it is a similar problem, but it...
Can't test the `ggml_mul_mat_pad` with CUDA because it would need some extra changes to `to_fp16_cuda`. Likely there will be no benefit in this case because cuBLAS will probably make the...
Sounds great - no rush on this padding problem
Yes, dimension order is reversed compared to Python: https://github.com/ggerganov/ggml/issues/500#issuecomment-1704322898
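For illustration, a minimal sketch of the mapping (the tensor shape here is made up; only the reversed ordering is the point):
```py
import torch

# PyTorch lists the slowest-varying (outermost) dimension first:
x = torch.zeros(2, 3, 4)   # shape == (2, 3, 4)

# ggml lists dimensions the other way around: ne[0] is the innermost,
# fastest-varying dimension, so the same tensor has ne = [4, 3, 2].
ne = list(reversed(x.shape))
print(ne)                  # [4, 3, 2]
```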
Maybe as a sequence of views and 4D permutes:
```py
x_view = x.view(b*c, h, scale, w*scale)
x_view = x_view.permute(0, 2, 1, 3)
x_view = x.view(b*c*scale, h, w, scale)
x_view = ...
```
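For reference, here is a self-contained sketch of the same view/permute idea without ggml's 4D limit, assuming the goal is a pixel-shuffle style rearrangement (a generic illustration, not the exact sequence from the snippet above):
```py
import torch

def pixel_shuffle_manual(x: torch.Tensor, scale: int) -> torch.Tensor:
    # (b, c*scale*scale, h, w) -> (b, c, h*scale, w*scale) using only view/permute/reshape
    b, c_s2, h, w = x.shape
    c = c_s2 // (scale * scale)
    x = x.view(b, c, scale, scale, h, w)
    x = x.permute(0, 1, 4, 2, 5, 3)   # (b, c, h, scale, w, scale)
    return x.reshape(b, c, h * scale, w * scale)

x = torch.randn(1, 4, 3, 5)
assert torch.allclose(pixel_shuffle_manual(x, 2), torch.pixel_shuffle(x, 2))
```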
We should wait for https://github.com/ggerganov/llama.cpp/issues/5356 to see if they come up with a way to re-organize the Vulkan shaders
@abetlen Ok, let's do that
If `ne[0] == 1` and `ne[1] == 2` (i.e. a single-column matrix), `ggml_n_dims()` will return 2, which is correct.
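Roughly, `ggml_n_dims()` looks for the highest axis with more than one element, so trailing size-1 dimensions are not counted while leading ones are. A small Python sketch of that behaviour (an approximation for illustration, not the actual C code):
```py
def n_dims(ne):
    # ne is the ggml-style size array, ne[0] being the innermost dimension.
    # Return the index of the highest dimension with more than one element, plus one.
    for i in range(len(ne) - 1, 0, -1):
        if ne[i] > 1:
            return i + 1
    return 1

print(n_dims([1, 2, 1, 1]))  # 2 -- the single-column matrix from above
print(n_dims([5, 1, 1, 1]))  # 1 -- a plain 1D tensor
print(n_dims([1, 1, 1, 1]))  # 1
```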
Yes, this is correct, but I don't think it will cause any problems. Is there a specific problem that you are running into because of this?