Blealtan
`state[5*i+1]` is the `x` from the previous iteration, see [here](https://github.com/BlinkDL/ChatRWKV/blob/73aa9e04eb98609243f6be20c53d0b64b659aa8d/RWKV_in_150_lines.py#L71). That's exactly what `time_shift` does.
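To make the point concrete, here is a minimal pure-Python sketch of the idea, not the actual ChatRWKV code. The names `time_shift_mix` and `run_sequence` are hypothetical, and starting the stored slot at zero is an assumption. It only shows how keeping the previous `x` in a state slot reproduces a one-step shift in time.

```python
def time_shift_mix(x, prev_x, mix):
    """Blend the current input with the previous step's input, per channel."""
    return [m * cur + (1.0 - m) * prev for cur, prev, m in zip(x, prev_x, mix)]


def run_sequence(xs, mix):
    """Process tokens one by one, carrying the previous x as recurrent state."""
    prev_x = [0.0] * len(mix)  # assumption: the state slot starts at zero
    outputs = []
    for x in xs:
        outputs.append(time_shift_mix(x, prev_x, mix))
        prev_x = x  # plays the role of writing x back into the state slot
    return outputs
```

With `mix` set to all ones the output is just the current input; with all zeros it is the input delayed by one step.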
I just went through the `jsinterop` implementation. It seems that simply implementing another `IJSRuntime`, like the one in `Mono.WebAssembly.Interop`, would be enough?
IMO the LoRA merging should happen before the conversion, since too many things happen while converting the model (including the xy quantizing, etc.). +1 on disallowing LoRA when...
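For reference, a rough sketch of what merging a LoRA delta into a base weight means, using the standard LoRA formulation `W' = W + (alpha / r) * B @ A`. This is not this project's merge code: `merge_lora` and `matmul` are hypothetical helpers, and plain nested lists stand in for tensors. The point is that the merge is a simple operation on the original full-precision weights, which is why it is easiest to do before any quantizing or conversion.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w, lora_a, lora_b, alpha):
    """Return w + (alpha / r) * (lora_b @ lora_a), where r is the LoRA rank."""
    r = len(lora_a)                 # lora_a has shape (r, in_features)
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (out, r) @ (r, in) -> (out, in)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]
```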
Yeah MIT is intended. Will add soon.
That one happens somewhat randomly when reconstructing the splines on the new grid (performing `update_grid`)... Possibly a numerical failure in the least-squares optimization. I also encountered that in some...
Stop spamming. Also thank you for pointing the spamming out.
It's equivalent to activating the same hidden state with multiple activation functions and then using a wider linear transformation to shrink it back. Somewhat like Gated Linear Unit, but somewhat...
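A toy sketch of that equivalence, under the assumption that the activations are ordinary scalar functions. The name `multi_activation_layer` and the ReLU/identity pair are made up for illustration and are not taken from the repo.

```python
def relu(v):
    return max(v, 0.0)


def identity(v):
    return v


def multi_activation_layer(x, activations, weight):
    """Apply k activations to the same vector, then project back with one linear map.

    weight has shape (out_features, k * in_features), i.e. it is k times wider
    than a plain linear layer over x.
    """
    # Concatenate f_1(x), f_2(x), ..., f_k(x) into one expanded vector.
    expanded = [f(v) for f in activations for v in x]
    # The wider linear transformation shrinks the expanded vector back down.
    return [sum(w * e for w, e in zip(row, expanded)) for row in weight]
```

In this sketch the linear layer's parameter count grows linearly with the number of activation functions.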
Yeah, the Fourier variant definitely sounds attractive. I have little time to maintain this right now (working on my PhD viva). Also, this repo mostly serves as a playground to...
The activation is a bit (or more, if your VRAM bandwidth is limited) slower, and the number of parameters is (by default) 5x over nn.Linear with the same input and...
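A normal model file shouldn't run into this problem. Did you put the path of the LoRA delta checkpoint directly into the model path `MODEL_NAME`, so that only the LoRA was loaded and not the base model? Both should be loaded together; the LoRA delta checkpoint path goes in `MODEL_LORA`.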