Serge Gotsuliak
Great! I've fixed up a bit of your latest addition with the Ring container, and will wait for interactive mode :) I'm going to investigate some other things like AVX intrinsics and mmap()...
I've started grokking NEON and AVX2: https://github.com/gotzmann/llama.go/tree/avx-neon After looking into the topic, it seems the easiest way to start is to use the **MinIO** tooling advanced by **gorse**:...
> AVX-512 instructions can be used to accelerate operations on INT8 and INT4 data arrays. Unfortunately, AVX-512 support is fragmentary within Intel processors. It was removed recently even from CPUs...
> I don't really understand c++ as much as Go, but I'm at your disposal. Yeah, thanks! The most annoying things here: - Go has no clever vector intrinsics like...
> is this https://github.com/gotzmann/llama.go/blob/main/pkg/ml/ml.go the exact port of this (tensor program that run exactly like ggml in Go) https://github.com/ggerganov/ggml/blob/master/src/ggml.c? @umarrudy - exactly :) > if I could run [other model...
> @BrunoIsaac27 To use AVX2 instructions in Go, you can use assembly language and the go:generate directive. Having lost some days between debugging sessions on my Mac and PC, I've...
Any progress out there? Have the same problem with torch-2.5.0 and liger-0.3.1 when FSDP-training involves **lm_head** layer ``` [rank4]: File "/usr/local/lib/python3.10/dist-packages/liger_kernel/transformers/fused_linear_cross_entropy.py", line 13, in forward [rank4]: return LigerFusedLinearCrossEntropyFunction.apply( [rank4]: File...
Seems this problem arises when **fsdp_use_orig_params: false** `fsdp_use_orig_params: If True, allows non-uniform requires_grad during init, which means support for interspersed frozen and trainable parameters. This setting is useful in cases...
FYI @winglian
> Just typing this out has put a new idea in my head. Looks cool! Good to have the feature, but maybe shorten the tag name? Like `data-templ-let` or...