Logan Hallee

Results: 40 comments by Logan Hallee

Hi! I have nothing to do with Mistral, but I can answer your questions. Gates or routers are always linear layers, even in Switch Transformers. Regular linear layers, or sets of...
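
Roughly what such a gate looks like in code (a minimal sketch with made-up dimensions, not Mistral's actual implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearRouter(nn.Module):
    """Toy MoE gate: a single linear layer scoring each token against the experts."""
    def __init__(self, hidden_dim: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)  # the "router" is just this
        self.top_k = top_k

    def forward(self, x: torch.Tensor):
        # x: (batch, seq_len, hidden_dim)
        logits = self.gate(x)                                  # (batch, seq_len, num_experts)
        weights, expert_idx = logits.topk(self.top_k, dim=-1)  # pick the top-k experts per token
        weights = F.softmax(weights, dim=-1)                   # normalize over the chosen experts
        return weights, expert_idx

router = LinearRouter(hidden_dim=16, num_experts=8)
w, idx = router(torch.randn(2, 4, 16))
print(w.shape, idx.shape)  # torch.Size([2, 4, 2]) torch.Size([2, 4, 2])
```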

I second this question. The field sorely needs automatic structuring of transformer-like diagrams, especially with the introduction of MoE and state space models. ![image](https://github.com/alexlenail/NN-SVG/assets/72926928/2f11b7ae-7107-4f67-8202-ad2a6825ff29) How to do this effectively?...

Parsing code to generate a diagram would be much better! For example, orienting all the layers correctly from a print of a PyTorch model or something.
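
A rough sketch of that parsing idea: instead of scraping the printed repr, walking `named_modules()` exposes the layer hierarchy directly (the tiny model below is just a stand-in):

```python
import torch.nn as nn

def model_to_layer_list(model: nn.Module):
    """Flatten a PyTorch model into (qualified_name, layer_type, extra_repr) tuples
    that a diagram generator could consume."""
    layers = []
    for name, module in model.named_modules():
        if name == "":  # skip the root container itself
            continue
        layers.append((name, type(module).__name__, module.extra_repr()))
    return layers

# Placeholder model standing in for whatever architecture you want to draw.
toy = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

for name, kind, extra in model_to_layer_list(toy):
    print(f"{name}: {kind}({extra})")
# 0: Linear(in_features=128, out_features=256, bias=True)
# 1: ReLU()
# 2: Linear(in_features=256, out_features=10, bias=True)
```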

Yep, I'll be happy to help write some tests and docs for each algorithm after we finish commenting the tensor shape at each step.
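
As an illustration of what shape-commenting at each step can look like (a toy block, not code from the repository in question):

```python
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        attn_out, _ = self.attn(x, x, x)  # attn_out: (batch, seq_len, d_model)
        x = x + attn_out                  # residual; shape unchanged
        x = x + self.ff(x)                # (batch, seq_len, d_model)
        return x

out = ToyBlock()(torch.randn(2, 10, 64))
print(out.shape)  # torch.Size([2, 10, 64])
```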

@mariosasko, @lhoestq, @albertvillanova Hello! Can anyone help? Or can you suggest who could help with this?

> Hi! Feel free to download the dataset and create a `Dataset` object with it. > > Then you'll be able to use `push_to_hub()` to upload the dataset to...
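
Roughly what that workflow looks like with the `datasets` library (the column names and repo id below are placeholders):

```python
from datasets import Dataset

# Build a Dataset from in-memory data after downloading/preprocessing it yourself.
data = {"sequence": ["MKT...", "GAV..."], "label": [0, 1]}  # placeholder columns
ds = Dataset.from_dict(data)

# Requires being logged in (e.g. `huggingface-cli login`); the repo id is hypothetical.
ds.push_to_hub("your-username/your-dataset-name")
```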

Hi @lhoestq and @albertvillanova, just following up on this.

Sure, that makes sense. However, isn't there a size limit to what typical users can push?

> Yes, there is a limit; simply let us know by email at datasets [at] huggingface.co - this way we can give you a storage grant and also help making sure...

Hi @bj600800 @peiyaoli @lzygitk7 @chaofan520 @tangmeiaoxue1 @sermare @nullland1027, my group has put together an implementation of ESMC called [ESM++](https://huggingface.co/Synthyra/ESMplusplus_small) that is fully Hugging Face compatible. It loads with AutoModel and...
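
A minimal loading sketch (assuming the usual pattern for custom Hub architectures, including `trust_remote_code=True`; the exact tokenizer setup and output fields depend on the ESM++ repo code):

```python
from transformers import AutoModel, AutoTokenizer

model_id = "Synthyra/ESMplusplus_small"
# trust_remote_code is typically needed when the architecture code lives in the model repo.
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", return_tensors="pt")
outputs = model(**inputs)  # inspect `outputs` for embeddings/logits; field names depend on the model code
```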