nam-drun
@jeromeku From talking to @danielhanchen so far, all I know is that Unsloth only works with LoRA, so only mamba-2.8b-hf on Hugging Face can be used with Unsloth. - Code for...
> Try loading the SFT adapter first. Then merge the adapter into the base model and then load the DPO adapter. You can use the following code: Where do I...
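For anyone reading along, here is a rough sketch of what "merge the SFT adapter, then load the DPO adapter" means at the level of a single weight matrix. It is not the code from the quoted comment (that is cut off above); it is plain Python arithmetic under the usual LoRA assumption that an adapter contributes `(alpha / rank) * B @ A` on top of the frozen weight. The names `matmul`, `merge_lora`, `sft_a`, `dpo_a` and the toy values are made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold one LoRA adapter into a weight: W + (alpha / rank) * (B @ A)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy 2x2 base weight and two rank-1 adapters (hypothetical values).
base = [[1.0, 0.0], [0.0, 1.0]]
sft_a, sft_b = [[1.0, 2.0]], [[1.0], [0.0]]   # stands in for the SFT adapter
dpo_a, dpo_b = [[0.0, 1.0]], [[0.0], [1.0]]   # stands in for the DPO adapter

# Step 1: merge the SFT adapter into the base weight.
merged_sft = merge_lora(base, sft_a, sft_b, alpha=2, rank=1)

# Step 2: the DPO adapter now sits on top of the merged weight.
merged_both = merge_lora(merged_sft, dpo_a, dpo_b, alpha=1, rank=1)
```

With the PEFT library the same order of operations is usually expressed with `PeftModel.from_pretrained` and `merge_and_unload()`, but the adapter paths and model names depend on your setup.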
fair enough, this will be the topic for my thesis then :P
> For Mamba you'll have to use the scan formulation as documented in the [paper](https://arxiv.org/abs/2312.00752). I get what you mean (algorithm 2 in figure 2). I was confused a bit...
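In case it helps with the confusion, this is a minimal sequential sketch of the recurrence behind the scan, for a single channel with a diagonal state matrix. It assumes the simplified discretization `A_bar = exp(delta * A)` and `B_bar = delta * B`, and it is a plain reference loop, not the hardware-aware parallel scan from the paper. The function name `selective_scan` and its argument layout are my own choices for illustration.

```python
import math


def selective_scan(x, delta, a, b, c):
    """Sequential reference for h_t = A_bar * h_{t-1} + B_bar * x_t, y_t = C_t . h_t.

    x, delta: lists of T floats (input and per-step step size)
    a:        list of N floats (diagonal of the state matrix)
    b, c:     lists of T lists of N floats (input-dependent B_t and C_t)
    """
    n = len(a)
    h = [0.0] * n          # hidden state starts at zero
    ys = []
    for t in range(len(x)):
        for i in range(n):
            a_bar = math.exp(delta[t] * a[i])   # discretized state transition
            b_bar = delta[t] * b[t][i]          # simplified discretized input matrix
            h[i] = a_bar * h[i] + b_bar * x[t]
        ys.append(sum(c[t][i] * h[i] for i in range(n)))
    return ys


# Sanity check: with a = 0 the state never decays, so the output is a running sum.
ys = selective_scan(
    x=[1.0, 2.0, 3.0],
    delta=[1.0, 1.0, 1.0],
    a=[0.0],
    b=[[1.0], [1.0], [1.0]],
    c=[[1.0], [1.0], [1.0]],
)
```

Because `delta`, `b` and `c` change per time step, the fixed-kernel convolution view no longer applies, which is why the recurrence (or a parallel scan over it) is needed.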
Just a normal 3x3 conv, and I noticed the paper didn't say anything about really small filter sizes. I plan to use it in a U-Net diffusion model.
> Not yet, but it's something we're very interested in looking into soon. > > To help us target better - what models do you want to use it for?...