Wang hl
> https://github.com/BlinkDL/RWKV-v2-RNN-Pile What kind of finetuning methods does this use? I think it tunes all parameters in the model?
I found a good solution to this problem. Since the latest version of transformers supports RWKV, I can now use peft to finetune RWKV. Here is the demo code: ```...
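Since the original snippet is cut off above, here is a minimal sketch of what LoRA finetuning of the HF RWKV model with peft can look like; the checkpoint name and `target_modules` are assumptions based on the public RWKV checkpoints and the transformers RWKV implementation, not the original demo code.

```python
# Minimal LoRA sketch, not the original demo code: the model id and
# target_modules below are assumptions, adjust them for your setup.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

model_name = "RWKV/rwkv-4-169m-pile"  # hypothetical checkpoint choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    # projection layers in the HF RWKV time/channel-mixing blocks
    target_modules=["key", "value", "receptance"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```

From here the wrapped model can be trained with the standard `Trainer` loop, exactly like any other causal LM.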
> assume that I have training data - json or tsv - in the format {"instruction": "THE INSTRUCTION", "input": "THE INPUT", "output": "DESIRED OUTPUT"}. How can I modify your peft code to...
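For the instruction/input/output records asked about above, one common approach is to flatten each record into a single prompt string and tokenize that for causal-LM training. The template below is only an illustration, not part of the original peft demo code.

```python
# Illustrative only: the prompt template and field handling are assumptions,
# not part of the original peft demo code.
import json

def build_prompt(example: dict) -> str:
    """Flatten one {"instruction", "input", "output"} record into one string."""
    if example.get("input"):
        return (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )

def load_dataset_texts(path: str) -> list[str]:
    """Read a JSON-lines file and return one prompt string per record."""
    with open(path, encoding="utf-8") as f:
        return [build_prompt(json.loads(line)) for line in f]

# texts = load_dataset_texts("train.jsonl")
# encodings = tokenizer(texts, truncation=True, max_length=1024)
```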
I think HF llama does not have a static kv cache, since its cache grows dynamically during generation. Here is the relevant code: https://github.com/huggingface/transformers/blob/38611086d293ea4a5809bcd7fadd8081d55cb74e/src/transformers/models/llama/modeling_llama.py#L1014C37-L1014C37 However, I also have the...
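For context, the dynamic behaviour mentioned above comes down to concatenating each step's key/value states onto the cached ones, so the cache tensors grow with the sequence instead of living in a preallocated buffer. The snippet below is a simplified illustration of that pattern, not the actual modeling_llama.py code linked above.

```python
# Simplified illustration of a dynamically growing KV cache,
# not the actual modeling_llama.py code linked above.
import torch

def append_to_cache(past_key_value, key_states, value_states):
    """Concatenate this step's key/value states onto the cached ones.

    Shapes are (batch, num_heads, seq_len, head_dim); the cache length
    grows by one on every decoding step, so no fixed-size buffer exists.
    """
    if past_key_value is not None:
        key_states = torch.cat([past_key_value[0], key_states], dim=2)
        value_states = torch.cat([past_key_value[1], value_states], dim=2)
    return key_states, value_states  # becomes the new past_key_value

cache = None
for _ in range(4):  # four decoding steps of one token each
    k = torch.randn(1, 8, 1, 64)
    v = torch.randn(1, 8, 1, 64)
    k_all, v_all = append_to_cache(cache, k, v)
    cache = (k_all, v_all)
print(cache[0].shape)  # torch.Size([1, 8, 4, 64])
```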
Thanks for your attention. I hope this code helps you with testing :) ``` class StochasticDuelingHead(nn.Module): """ Overview: The ``Stochastic Dueling Network`` proposed in the ACER paper (arXiv 1611.01224). \...
Any new progress on this issue?
Same question. This leads to a strange situation. The final KL loss is computed like: `` kl_penalty = -self.kl_penalty_weight * (logprobs - ref_logprob) `` However, the part ``ref_logprob`` does not...
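For reference, the quoted line is the per-token KL-style penalty used in PPO-based RLHF, `-beta * (log pi_theta - log pi_ref)`. The sketch below only illustrates that computation with `ref_logprob` coming from a frozen reference model; it is not the project's implementation, and whether `ref_logprob` should carry gradients is exactly the point under discussion.

```python
# Illustration of the quoted per-token KL penalty, not the project's code.
# Here ref_logprobs come from a frozen reference policy under no_grad.
import torch

def kl_penalty_reward(logprobs: torch.Tensor,
                      ref_logprobs: torch.Tensor,
                      kl_penalty_weight: float) -> torch.Tensor:
    """Per-token penalty: -beta * (log pi_theta(a|s) - log pi_ref(a|s))."""
    return -kl_penalty_weight * (logprobs - ref_logprobs)

# Example shapes: (batch, seq_len) token log-probabilities from both policies.
logprobs = torch.randn(2, 5, requires_grad=True)
with torch.no_grad():
    ref_logprobs = torch.randn(2, 5)  # frozen reference policy, no gradient
penalty = kl_penalty_reward(logprobs, ref_logprobs, kl_penalty_weight=0.1)
print(penalty.shape)  # torch.Size([2, 5])
```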