hychiang
Hi, I am trying to interpret the depth values in the depth.pgm files after unpacking a .sens file with SensReader. I read the depth.pgm file with a Python script I found online: `def...
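For reference, SensReader typically exports depth as binary (P5) 16-bit PGM, with samples stored big-endian per the PGM spec and values in millimeters (ScanNet's usual depth shift is 1000, with 0 marking missing depth). A minimal stdlib-only sketch of reading such a file, assuming that layout (the sample values below are made up):

```python
import io
import struct

def read_pgm16(f):
    """Parse a binary (P5) 16-bit PGM into (width, height, list of ints).
    The PGM spec mandates big-endian samples when maxval > 255."""
    assert f.readline().strip() == b"P5"
    line = f.readline()
    while line.startswith(b"#"):   # skip comment lines
        line = f.readline()
    width, height = map(int, line.split())
    maxval = int(f.readline())
    assert maxval > 255            # two bytes per sample
    data = f.read(width * height * 2)
    pixels = struct.unpack(">%dH" % (width * height), data)
    return width, height, list(pixels)

# Tiny 2x2 example built in memory; raw depths in millimeters (hypothetical).
raw = [0, 1500, 2000, 65535]
buf = io.BytesIO(b"P5\n2 2\n65535\n" + struct.pack(">4H", *raw))
w, h, depth_mm = read_pgm16(buf)

# With a depth shift of 1000, dividing by 1000.0 gives meters; 0 = no depth.
depth_m = [d / 1000.0 for d in depth_mm]
print(w, h, depth_m)
```

If the values look wrong (e.g. everything tiny or huge), the usual culprit is reading the samples with the wrong endianness or as 8-bit.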
I am wondering why the resolution in the autoencoder (first-stage config) is 256, not 512. Thanks! https://github.com/CompVis/latent-diffusion/blob/main/models/ldm/semantic_synthesis512/config.yaml#L42
Hi, I tried to reproduce your experiment on CIFAR-10, but the training loss became NaN. I am using a machine with four GPUs and tensorflow-gpu 1.12 for the experiment. Here...
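When a loss goes NaN, the first step is usually to find the exact step where it diverged (in TF 1.x, `tf.check_numerics` can also catch this inside the graph), then try a lower learning rate or gradient clipping. A small framework-agnostic sketch with a made-up loss curve:

```python
import math

def first_nan_step(losses):
    """Return the index of the first non-finite loss, or None.
    The loss often explodes a few steps *before* the first NaN, so inspect
    the preceding values too when debugging divergence."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None

# Hypothetical loss curve that blows up and then overflows at step 3.
losses = [2.31, 1.87, 4.2e6, float("inf"), float("nan")]
print(first_nan_step(losses))  # 3
```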
Hi ptrblck, I am playing with PyTorch's BatchNorm2d and your implementation. I tried your implementation in MobileNetV3 and the performance seems similar. However, I found the gradient values...
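One way to check whose gradients are right is a finite-difference test against a manual batchnorm. Below is a minimal NumPy sketch (not ptrblck's code) of training-mode BatchNorm2d with the standard input-gradient formula, verified numerically at one element; all shapes and values are illustrative:

```python
import numpy as np

def bn2d_forward(x, gamma, beta, eps=1e-5):
    # x: (N, C, H, W); normalize over N, H, W per channel (training mode,
    # biased variance, matching what the backward formula below assumes).
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    xhat = (x - mu) / np.sqrt(var + eps)
    return gamma.reshape(1, -1, 1, 1) * xhat + beta.reshape(1, -1, 1, 1), xhat, var

def bn2d_backward(dy, xhat, var, gamma, eps=1e-5):
    # Standard batchnorm input gradient:
    #   dx = gamma/std * (dy - mean(dy) - xhat * mean(dy * xhat))
    g = gamma.reshape(1, -1, 1, 1)
    std = np.sqrt(var + eps)
    m_dy = dy.mean(axis=(0, 2, 3), keepdims=True)
    m_dyx = (dy * xhat).mean(axis=(0, 2, 3), keepdims=True)
    return g / std * (dy - m_dy - xhat * m_dyx)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 2, 2))
gamma, beta = np.ones(3), np.zeros(3)
y, xhat, var = bn2d_forward(x, gamma, beta)

# Finite-difference check of dx for the scalar loss L = sum(y * w).
w = rng.standard_normal(y.shape)
dx = bn2d_backward(w, xhat, var, gamma)
i, h = (0, 1, 0, 0), 1e-5
xp, xm = x.copy(), x.copy()
xp[i] += h
xm[i] -= h
num = ((bn2d_forward(xp, gamma, beta)[0] * w).sum() -
       (bn2d_forward(xm, gamma, beta)[0] * w).sum()) / (2 * h)
print(abs(num - dx[i]))  # ~0: analytic and numerical gradients agree
```

The same finite-difference probe can be pointed at either implementation's `dx` to see which one disagrees with the numerical gradient.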
### Description Hi, I am trying to run MobileNetV2 on the Edge TPU with a Dev Board Mini. I followed the instructions and ran the classification example code on my...
Hello, could I use `FastLinearCombinationClamp` to convert a `half_t` accumulator to `int8_t` output, or does it only support an `int32_t` accumulator to `int8_t` output? Thanks! ```c++ using ElementInputA = cutlass::half_t; ```...
**What is your question?** Reading the documentation and the examples, the configurations I found for int8 x int8 = int8 matrix multiplication are either RowMajor x ColumnMajor = ColumnMajor ([gemm_s8t_s8n_s8n](https://github.com/NVIDIA/cutlass/blob/main/test/unit/gemm/device/gemm_s8t_s8n_s8n_tensor_op_s32_sm80.cu))...
### System Info - `transformers` version: 4.41.2 - Platform: Linux-5.15.0-112-generic-x86_64-with-glibc2.35 - Python version: 3.10.13 -...
Hello, I read this [issue](https://github.com/NVIDIA/cutlass/issues/702#issuecomment-1331414081): * `kernel::GemmUniversal` with mode `GemmUniversalMode::kGemmSplitKParallel` is equivalent to `kernel::GemmSplitKParallel`. The difference comes to the fore for the `device::`-scoped kernels, wherein `device::GemmSplitKParallel` calls a reduction kernel...