Zheng Chen
I found that Mamba2 is much faster than a full self-attention block, but I ran into a memory problem. I used 12 layers of Mamba2 in a vision task. `d_model=128, d_state=16, head_dim =...
What values did you set for the near and far planes when training NeRF & Gaussian Splatting?
I tried flipping the y and z axes of the camera-to-world matrix, but it seems that this is not enough. When I train generalizable Gaussian Splatting on DL3DV....
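For reference, the axis flip mentioned above can be sketched as follows. This is a minimal illustration and not the dataset's documented procedure. It assumes the pose is a 4x4 camera-to-world matrix and that the goal is to switch between a y-up/z-backward camera convention (OpenGL-style) and a y-down/z-forward one (OpenCV-style). The helper name `flip_yz` and the example pose are hypothetical, and the sketch covers only the camera-axis flip, not any further world-frame change that may also be needed.

```python
def flip_yz(c2w):
    """Negate the camera's y and z axes in a 4x4 camera-to-world matrix.

    Columns 0..2 of c2w hold the camera's x, y, z axes expressed in world
    coordinates; column 3 holds the camera position, which is left unchanged.
    """
    return [[row[0], -row[1], -row[2], row[3]] for row in c2w]


# Hypothetical pose: identity rotation, camera located at (1, 2, 3).
c2w_example = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]

flipped = flip_yz(c2w_example)
```

With NumPy arrays the same operation is `c2w @ np.diag([1, -1, -1, 1])`. Applying the flip twice returns the original pose, and the camera position in the last column is never modified.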
Traceback (most recent call last): File "/group/40046/public_datasets/3d_datasets/DL3DV-10K/download.py", line 229, in <module> assert params.subset in ['1K', '2K', '3K', '4K', '5K', '6K', '7K'], 'Only support subset 1K-7K so far' AssertionError: Only support subset...
I tried to use RWKV (e.g., Vision-RWKV) in CV tasks, but I found that RWKV shows GPU memory usage similar to that of a full-attention Transformer (like ViT) during training. I found both RWKV and...
Is the depth for fisheye images defined as the z-depth?
Hi! I want to run monocular depth inference on multiple images in one batch. Do you have any idea how to do this?