Serge Gotsuliak

Results 40 comments of Serge Gotsuliak

Great! I've fixed up your latest addition with the Ring container a bit, and will wait for interactive mode :) I'm going to investigate some other things like AVX intrinsics and mmap()...

I've started grokking NEON and AVX2: https://github.com/gotzmann/llama.go/tree/avx-neon After looking into the topic, it seems the easiest way to start is to use the **MinIO** tooling advanced by **gorse**:...

> AVX-512 instructions can be used to accelerate operations on INT8 and INT4 data arrays. Unfortunately, AVX-512 support is fragmentary within Intel processors. It was removed recently even from CPUs...

> I don't really understand c++ as much as Go, but I'm at your disposal.

Yeah, thanks! The most annoying things here:
- Go has no clever vector intrinsics like...
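
To illustrate the point about missing vector intrinsics: without them, a hot loop such as a dot product has to be written as plain scalar Go (or moved into hand-written assembly). The sketch below is only an illustration and is not taken from llama.go; the function name `dotF32` and the 4-way unrolling are assumptions made for the example.

```go
package main

import "fmt"

// dotF32 is a hypothetical helper: a scalar dot product of two
// equal-length float32 slices, unrolled 4 ways by hand. Pure Go
// offers no portable SIMD intrinsics, so independent accumulators
// are about as far as one can go without assembly.
func dotF32(a, b []float32) float32 {
	if len(a) != len(b) {
		panic("dotF32: length mismatch")
	}
	var s0, s1, s2, s3 float32
	i := 0
	// Main loop: four independent partial sums per iteration.
	for ; i+4 <= len(a); i += 4 {
		s0 += a[i] * b[i]
		s1 += a[i+1] * b[i+1]
		s2 += a[i+2] * b[i+2]
		s3 += a[i+3] * b[i+3]
	}
	sum := s0 + s1 + s2 + s3
	// Tail loop: handle the remaining 0..3 elements.
	for ; i < len(a); i++ {
		sum += a[i] * b[i]
	}
	return sum
}

func main() {
	a := []float32{1, 2, 3, 4, 5}
	b := []float32{5, 4, 3, 2, 1}
	fmt.Println(dotF32(a, b)) // 35
}
```

A real AVX2 or NEON version would replace the main loop with an assembly routine and keep a scalar path like this one as the fallback.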

> is this https://github.com/gotzmann/llama.go/blob/main/pkg/ml/ml.go the exact port of this (tensor program that run exactly like ggml in Go) https://github.com/ggerganov/ggml/blob/master/src/ggml.c?

@umarrudy - exactly :)

> if I could run [other model...

> @BrunoIsaac27 To use AVX2 instructions in Go, you can use assembly language and the go:generate directive.

Having lost some days between debugging sessions on my Mac and PC I've...

Any progress out there? I have the same problem with torch-2.5.0 and liger-0.3.1 when FSDP training involves the **lm_head** layer:
```
[rank4]: File "/usr/local/lib/python3.10/dist-packages/liger_kernel/transformers/fused_linear_cross_entropy.py", line 13, in forward
[rank4]: return LigerFusedLinearCrossEntropyFunction.apply(
[rank4]: File...
```

It seems this problem arises when **fsdp_use_orig_params: false**.

`fsdp_use_orig_params: If True, allows non-uniform requires_grad during init, which means support for interspersed frozen and trainable parameters. This setting is useful in cases...

> Just typing this out has put a new idea in my head.

Looks cool! Good to have the feature, but maybe shorten the tag name? Like `data-templ-let` or...