Jonas Rohw

Results: 2 comments by Jonas Rohw

@joelburget I am working on https://github.com/jonasrohw/TransformerLens/tree/OLMo; I think your MoE is very similar. I found the issue you were facing: the tokenizer is called again after `tokenizer_with_bos = utils.get_tokenizer_with_bos(tokenizer)`. Maybe...

@joelburget Exactly. You can also conditionally add the MoE weights import to the OLMo file. You could include your model names, etc., in the preloading step, along with the exact model configurations...
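
To make the "conditional MoE weights import" idea above concrete, here is a minimal sketch in plain Python. It is not taken from the OLMo branch or from TransformerLens: `Config`, `num_experts`, the three `convert_*` functions, the key names, and the model names are all hypothetical placeholders, and real conversion code would load tensors rather than strings.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Config:
    # Hypothetical config object; the field names are assumptions for illustration.
    model_name: str
    num_experts: int = 0  # 0 means a dense (non-MoE) model


def convert_dense_mlp(cfg: Config) -> Dict[str, str]:
    """Placeholder for the dense MLP conversion path."""
    return {"blocks.0.mlp.W_in": f"dense MLP weights for {cfg.model_name}"}


def convert_moe_mlp(cfg: Config) -> Dict[str, str]:
    """Placeholder for the MoE path: one entry per expert."""
    return {
        f"blocks.0.mlp.experts.{i}.W_in": f"expert {i} weights for {cfg.model_name}"
        for i in range(cfg.num_experts)
    }


def convert_olmo_weights(cfg: Config) -> Dict[str, str]:
    """Build a state dict, pulling in MoE weights only when the config asks for them."""
    # Parts shared by both variants (hypothetical key name).
    state_dict = {"embed.W_E": f"embedding weights for {cfg.model_name}"}
    if cfg.num_experts > 0:
        state_dict.update(convert_moe_mlp(cfg))
    else:
        state_dict.update(convert_dense_mlp(cfg))
    return state_dict


# Hypothetical model names, used only to exercise both branches.
dense = convert_olmo_weights(Config("dense-example"))
moe = convert_olmo_weights(Config("moe-example", num_experts=2))
print(sorted(dense))
print(sorted(moe))
```

The point of the sketch is only the shape of the dispatch: one conversion entry point for both variants, with the MoE-specific keys added behind a check on the model configuration.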