Shaojie Bai comments

Results: 11 comments of Shaojie Bai

Expected a 'cuda' device type for generator (related to speed issues?)

Hi @polo5, thanks for your interest in our paper! Yes, MDEQ-Large does take a few hours to finish all epochs, so your calculation is correct. However, I should note...

Expected a 'cuda' device type for generator (related to speed issues?)

Also re: error: I haven't encountered this error for this repo before, but I'll check for sure for PyTorch 1.10!

Expected a 'cuda' device type for generator (related to speed issues?)

The issue with PyTorch

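Variable-length sequences with TCN

Right. You can also preprocess the sequences so that those of similar length are grouped together, which reduces the amount of padding.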
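As a rough illustration of the grouping idea, the sketch below sorts sequences by length before batching and compares the padding needed against batching in the original order. The helper names (`pad_batches`, `padding_count`), the batch size, and the toy lengths are assumptions made for this example and are not taken from the TCN repo.

```python
def pad_batches(sequences, batch_size, pad_value=0):
    """Cut sequences into batches (in the given order) and pad each batch
    to the length of its own longest sequence."""
    batches = []
    for start in range(0, len(sequences), batch_size):
        chunk = sequences[start:start + batch_size]
        max_len = max(len(seq) for seq in chunk)
        batches.append([list(seq) + [pad_value] * (max_len - len(seq)) for seq in chunk])
    return batches


def padding_count(batches, sequences):
    """Number of pad elements = padded size minus original size."""
    padded = sum(len(row) for batch in batches for row in batch)
    return padded - sum(len(seq) for seq in sequences)


# Toy data: six sequences with very different lengths (hypothetical).
sequences = [[1] * n for n in (2, 9, 3, 10, 1, 8)]

naive = pad_batches(sequences, batch_size=2)
# Grouping step: sort by length so each batch holds similar-length sequences.
bucketed = pad_batches(sorted(sequences, key=len), batch_size=2)

print(padding_count(naive, sequences), padding_count(bucketed, sequences))
```

In practice the batches would usually be shuffled after grouping so that training does not always see short sequences first.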

Broyden defeats the purpose of DEQs?

Hello @polo5, thanks for your interest in our repo and DEQ! To begin with, we want to caution that "constant memory cost" is constant w.r.t. the number of layers. That...

Broyden defeats the purpose of DEQs?

Also, I want to add that in Anderson we usually keep `m=5` or `m=6`, which is usually significantly smaller than the number of solver iterations (e.g., 25 in DEQ-Transformer).
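To make the storage point concrete, here is a minimal sketch of a history window of size `m` that stays fixed no matter how many solver iterations run. It uses a plain fixed-point iteration on a scalar toy map, not Anderson's mixing step and not the DEQ solver code; the function name and the toy map are assumptions for illustration only.

```python
from collections import deque


def run_with_bounded_history(f, x0, num_iters=25, m=5):
    """Iterate x <- f(x) while keeping only the last m iterates in memory."""
    history = deque(maxlen=m)  # older iterates are dropped automatically
    x = x0
    peak = 0
    for _ in range(num_iters):
        x = f(x)
        history.append(x)
        peak = max(peak, len(history))
    return x, peak


# Toy contraction with fixed point x* = 2 (hypothetical stand-in for a layer).
x_final, peak_stored = run_with_bounded_history(lambda x: 0.5 * x + 1.0, x0=0.0)
print(x_final, peak_stored)
```

With 25 iterations and `m=5`, at most 5 iterates are ever stored, so the extra memory scales with `m` rather than with the iteration count.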

Broyden defeats the purpose of DEQs?

Hi @polo5, 1. Interesting observation on MDEQ...! I didn't know that you can achieve the same accuracy on CIFAR-10 with just 1 iteration but it's likely closely related to the...

Extend this to 1-D

Thanks for the pointers. Where can I find the documentation for `THCudaTensor` in PyTorch? Also, why don't we need `im2col` anymore? While the input tensor is `L x C` now...

Build problem with mi.py

Simply changing it to 0 worked for me!

Mnist classification problem

Oh, we flatten each image to 1D. For example, a 28x28 image is converted to a 784x1 sequence (i.e., length 784). So each "time step" to the TCN is essentially...
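A minimal sketch of that flattening, assuming row-major order (the comment does not say which order is used) and using a hypothetical helper name, `image_to_sequence`:

```python
def image_to_sequence(image):
    """Flatten a 2-D image (list of rows) row by row into a sequence
    where each time step carries a single pixel value."""
    return [[pixel] for row in image for pixel in row]


# Dummy 28x28 "image" whose pixel value equals its row-major index.
image = [[r * 28 + c for c in range(28)] for r in range(28)]
sequence = image_to_sequence(image)

print(len(sequence), len(sequence[0]))
```

Each of the 784 entries is one time step with one feature, which matches the 784x1 shape described above.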