Neel Nanda
I think FNR of 42% significantly under-rates lateral flow tests. The thing I care about is not being infected, infectiousness increases with viral load, and lateral flow tests are _way_...
Interesting! What's the use case? Either way, it'd be an easy fix to just add an optional pad_token_id parameter to the generate function, feel free to make a PR
Thanks! I'll admit that those takes were too in depth for me to really get my head around them, but it sounded interesting and I would love to see a...
Ah, yes, if you're imagining a real interactive visualisation, putting it in CircuitsVis seems more natural. It's set up to be easy to integrate Javascript code and Python. On Tue,...
Interesting, I wasn't aware of that parameter, thanks! Looks like [it's new to torch 2.0](https://pytorch.org/docs/1.10.0/generated/torch.nn.Module.html?highlight=load_state_dict#torch.nn.Module.load_state_dict). I _think_ that TransformerLens doesn't require torch 2.0, so can't use it. On the other...
I'm open to pushback, but currently think that hook_normalized is working as intended, because it's invariant between folding layer norm On Sat, 26 Aug 2023, 5:15 pm Haoyan Luo, ***@***.***> wrote:...
Oh rip, that's probably because I explicitly wrote the caching in the get cache fwd and bwd function, and I think it just does every single hook in the model....
Cool! If you can get caching and patching to work, this would be a very exciting addition. It'd be best to support as many models as possible, but even a...
Seems reasonable to me, I'd be happy for someone to add this On Fri, 8 Sept 2023 at 18:44, Ben Thompson ***@***.***> wrote: > It would be nice to have...
Maybe covered by #125