linnan wang

Results: 43 comments of linnan wang

```
get_configs(batch=1, latency="1.25ms-D", gpuType="GV100")
```

```
GPUNet(
  (network): Sequential(
    (stem: 2): Prologue(
      (net): Sequential(
        (0): Conv2d(3, 33, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (1): BatchNorm2d(33, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
        ...
```

```
get_configs(batch=1, latency="2.25ms-D", gpuType="GV100")
```

```
GPUNet(
  (network): Sequential(
    (stem: 2): PrologueLargeD(
      (net): Sequential(
        (0): Conv2d(3, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (1): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
        ...
```

That's correct. A 30000x30000 double-precision matrix should exceed 6GB. I don't recommend testing double precision on the TITAN; we normally use a K40. You can try single precision, and it should deliver...
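As a quick sanity check on those numbers (a sketch; the 6GB figure is the TITAN's device memory):

```python
# Memory footprint of a dense N x N matrix, in GiB.
def matrix_gib(n, bytes_per_element):
    return n * n * bytes_per_element / 1024**3

double_gib = matrix_gib(30000, 8)  # double precision: 8 bytes/element
single_gib = matrix_gib(30000, 4)  # single precision: 4 bytes/element

print(f"30000x30000 double: {double_gib:.2f} GiB")  # ~6.71 GiB, over a 6 GB card
print(f"30000x30000 single: {single_gib:.2f} GiB")  # ~3.35 GiB, fits
```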

It can handle any matrix size as long as you have enough host RAM. (The current memory issue is due to the way I'm marking finished tasks; however, it...

That code has not yet been cleaned, and it involves a lot of manual operations to get it working. It would be a lot easier if you implement RNN...

Can you try a smaller case, e.g., 20000 x 20000?

Okay. It is already very impressive to see the 2*10^4 case working on the GPU. The purpose of this library is to demonstrate a new system design to support large-scale matrix...

Sorry, I just double-checked our experiments. Yes, we used NASBench for the design validation of the meta-DNN. I think the discrepancy in evaluating the val_correlation lies in how you split the dataset....
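To illustrate the split-dependence point on purely synthetic numbers (nothing here comes from NASBench; the accuracies and noise level are made up), the rank correlation you measure between predicted and true accuracies shifts with the evaluation subset:

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(0)
true_acc = rng.uniform(0.85, 0.95, size=1000)            # synthetic "ground-truth" accuracies
pred_acc = true_acc + rng.normal(0, 0.01, size=1000)     # synthetic noisy predictions

# The measured val_correlation depends on which subset you evaluate on:
for seed in (1, 2):
    idx = np.random.default_rng(seed).choice(1000, size=100, replace=False)
    print(f"split seed {seed}: spearman = {spearman(true_acc[idx], pred_acc[idx]):.3f}")
```

Different seeds pick different validation subsets, so two groups splitting the same benchmark differently can honestly report different val_correlation numbers.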

For CPU multi-threading, it depends on which CPU BLAS you link and how you configure it. Please don't pay too much attention to the CPU; this is a multi-GPU BLAS. For...
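For what it's worth, a common way to configure CPU BLAS threading from Python is via environment variables, set before the BLAS-backed library is first imported (a sketch; which variable takes effect depends on the BLAS build you actually link):

```python
import os

# Thread caps must be set before the BLAS-backed library is imported.
os.environ["OMP_NUM_THREADS"] = "4"       # OpenMP-based builds (e.g. OpenBLAS-OpenMP, BLIS)
os.environ["OPENBLAS_NUM_THREADS"] = "4"  # OpenBLAS pthread build
os.environ["MKL_NUM_THREADS"] = "4"       # Intel MKL

import numpy as np  # inherits the thread settings above

a = np.random.rand(2000, 2000)
b = np.random.rand(2000, 2000)
c = a @ b  # DGEMM dispatched to the linked CPU BLAS, capped at 4 threads
```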

Good catch. I just merged ZGEMM a few seconds ago.
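For reference, ZGEMM is the BLAS routine for complex double-precision matrix multiply. A NumPy sketch of the contract any ZGEMM kernel has to satisfy (the alpha/beta values and sizes here are arbitrary, for illustration only):

```python
import numpy as np

# ZGEMM computes C <- alpha*A@B + beta*C in complex double precision (complex128).
rng = np.random.default_rng(0)
m, k, n = 4, 5, 3
A = rng.normal(size=(m, k)) + 1j * rng.normal(size=(m, k))
B = rng.normal(size=(k, n)) + 1j * rng.normal(size=(k, n))
C = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))
alpha, beta = 2.0 + 1.0j, 0.5 - 0.25j

ref = alpha * (A @ B) + beta * C  # the result a ZGEMM implementation must reproduce
print(ref.dtype)  # complex128
```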