Rusheb Shah

9 comments by Rusheb Shah

Hello, I would like to help with adding unit tests. I may find it hard to make all the changes at once, and it would be good to work...

I think that all the tests for this are actually done as of #218. @jbloomAus, maybe we should close this ticket and spin out a new one for docs.

Hi Tim, I think the work is up for grabs if you are keen. Probably worth checking with @jbloomAus what his priorities are and spinning out a ticket for anything...

Hi @tbenthompson, I've reproduced this on TransformerLens v1.2.2. What version of TransformerLens did you use to produce this? I'm just trying to narrow down how long this has been an...

Have been investigating this with @pranavgade20 today. We haven't got to the bottom of the issue, but leaving some notes here for anybody to pick up.

## Circular reference

The...

I couldn't repro the issue on CPU on OS X M1. Confusingly, the 410m TransformerLens model seems to be using up basically no RAM. **EDIT: I ran it for longer...

OK, after playing around with various tools, I ran the [fil profiler](https://pythonspeed.com/articles/python-server-memory-leaks/). As you can see in the screenshot below, the process is allocating nearly 15GB of memory to load...

Update! I've managed to get the memory usage down from 15GiB to 5GiB in my test. If you compare the diagram below with the one from my previous post, all...

Unfortunately, when I reran @tbenthompson's reproducing example, the GPU usage was unchanged. This seems like a real issue, but it might be a separate one.