Will Stone

7 comments by Will Stone

Same issue, though my dataset isn't nearly as big as @AlexandreLaborde's is.

`tf.GraphDef` was removed from the top-level `tf` namespace in TensorFlow 2.0. I personally use TensorFlow 1.14 with this tool and have had no issues. It may work with other 1.x versions as well but...
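For anyone who can't pin to 1.14, here is a minimal sketch of a version-tolerant lookup. It relies on `tf.compat.v1.GraphDef`, which exists in TensorFlow 2.x. The helper name `resolve_graph_def` is made up for illustration, and I haven't checked whether the tool depends on other 1.x-only APIs, so treat this as a possible workaround rather than a confirmed fix.

```python
def resolve_graph_def(tf_module):
    """Return the GraphDef class from an imported TensorFlow module.

    Hypothetical helper: tries the TF 1.x location first, then falls
    back to the compat namespace that TF 2.x provides.
    """
    graph_def = getattr(tf_module, "GraphDef", None)
    if graph_def is not None:
        # TF 1.x: GraphDef is exposed directly as tf.GraphDef.
        return graph_def
    # TF 2.x: the class is still reachable as tf.compat.v1.GraphDef.
    return tf_module.compat.v1.GraphDef
```

Usage would look like `import tensorflow as tf` followed by `GraphDef = resolve_graph_def(tf)`, then `GraphDef()` wherever the code previously called `tf.GraphDef()`.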

Yes, that makes sense to me. Since we know there is nothing in that frame for the tracker to track, it would make sense to explicitly tell it to stop...

Yeah, my plan was to implement something that (unfortunately) has to call the FFN `num_experts_per_token` times, each with a different adapter. Slow but low memory and... maybe has some benefits?...
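To make the idea concrete, here is a rough NumPy sketch of one way it could look. Everything beyond "call the FFN `num_experts_per_token` times, each with a different adapter" is an assumption on my part: the LoRA-style low-rank adapters, the softmax router, the ReLU, and all function and parameter names other than `num_experts_per_token` are placeholders, not the actual implementation.

```python
import numpy as np

def ffn_with_adapter(x, w_in, w_out, a, b):
    """Shared FFN plus a per-token low-rank delta on the input projection.

    x: (tokens, d_model), w_in: (d_model, d_ff), w_out: (d_ff, d_model)
    a: (tokens, d_model, rank), b: (tokens, rank, d_ff)  [assumed LoRA-style]
    """
    base = x @ w_in                                   # shared weights
    delta = np.einsum("td,tdr,trf->tf", x, a, b)      # adapter contribution
    hidden = np.maximum(base + delta, 0.0)            # ReLU (assumed)
    return hidden @ w_out

def adapter_moe_ffn(x, w_in, w_out, adapters_a, adapters_b, router_w,
                    num_experts_per_token):
    """Run the shared FFN once per routing slot, swapping adapters each time.

    adapters_a: (num_experts, d_model, rank)
    adapters_b: (num_experts, rank, d_ff)
    router_w:   (d_model, num_experts)  [hypothetical linear router]
    """
    logits = x @ router_w
    # Indices of the top-k experts for every token, best first.
    top = np.argsort(-logits, axis=-1)[:, :num_experts_per_token]
    top_logits = np.take_along_axis(logits, top, axis=-1)
    # Softmax over the selected experts only.
    weights = np.exp(top_logits - top_logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)

    out = np.zeros_like(x)
    for slot in range(num_experts_per_token):
        idx = top[:, slot]                            # one expert per token
        y = ffn_with_adapter(x, w_in, w_out, adapters_a[idx], adapters_b[idx])
        out += weights[:, slot:slot + 1] * y          # accumulate, one pass live
    return out
```

The loop is where the slowness comes from: the shared FFN runs `num_experts_per_token` times per layer. The memory saving is that only one set of full FFN weights exists, with small adapters per expert.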

Hey, did you ever end up trying to train this? As soon as I turn on gradients for even a small number of parameters, it OOMs pretty quickly. I tried with my...

Nah, no time wasted. I was already playing with this idea before I saw your initial comments. I'm working on implementing the paper I linked above now. The authors released...