Shen Zhuoran issues

Results: 11 issues by Shen Zhuoran

Add BeautyNet in Python section

[BeautyNet](https://github.com/cms-flash/beauty-net) is a high-quality, minimalist template for research projects in PyTorch.

What is the number of epochs of the final training?

The [config file](https://github.com/bigscience-workshop/bigscience/blob/b4a4f4651771cb78297abe5074aaf2de1f92d6ce/train/tr11-176B-ml/setup-test-n2.slurm) lists the sample count of the dataset as 220M and a global batch size of 2048, which equates to ~107K steps per epoch. The [main README](https://huggingface.co/bigscience/bloom/blob/main/README.md) says...
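The steps-per-epoch figure quoted above can be checked with a few lines of arithmetic. This is only a sketch based on the two numbers in the excerpt (220M samples, global batch size 2048); the helper `epochs_for_steps` and its argument are hypothetical names introduced here, and no total step count is assumed because the excerpt is cut off before stating one.

```python
# Numbers quoted in the issue excerpt above.
num_samples = 220_000_000   # dataset sample count (220M)
global_batch_size = 2048    # samples consumed per optimizer step

# One epoch is one full pass over the dataset.
steps_per_epoch = num_samples / global_batch_size
print(f"{steps_per_epoch:,.0f} steps per epoch")  # prints "107,422 steps per epoch"


def epochs_for_steps(total_steps: float) -> float:
    """Hypothetical helper: convert a total step count into epochs."""
    return total_steps / steps_per_epoch
```

Given a total training step count from another source, `epochs_for_steps(total_steps)` would return the corresponding number of epochs under these same assumptions.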

Is stereo depth estimation code available now?

Could you make a release when the model usage guide is available?

Could you make a release when the guide to using the pretrained models on our own papers is available?

Is there a TensorFlow/Keras implementation?

Is there a TensorFlow/Keras implementation of Adan? If there is no official version, do you know of any third-party implementation? Alternatively, how many lines would you expect an implementation to have?...

Can you share inference time data for different AFF/iAFF models?

Hi Yimian, I came here from your WACV 2021 presentation. This work looks pretty impressive. As we discussed during your presentation, could you share the inference time data for different...

Add Graphite as extra-Google Piper

Add support for CUHK online dictionary.

Add option to use the [CUHK online dictionary](https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/) instead of the local dictionary. Queries are slower but more up-to-date.

Is KAN 10X slower per training step, or does it need 10X as many steps to converge?

Hi Ziming, In Section 6 of your paper, you mentioned that KANs are practically 10X slower than MLPs. I am curious what you meant by it. Did you mean a...

Have you considered TokenMix in hidden layers?

In addition to TokenMix before the first Transformer block, have you considered or tried TokenMix in the middle of the model?