Ruicheng Wang
Hi! Thanks for your interest. Here are my suggestions for accelerating the inference: 1. Batchify the input image streams. Single-image inference may not fully utilize the GPU computation resources. Batch...
Hi. We've recently tested its performance with fp16 precision and found that the inference achieves **a 2x speedup on GPU with no loss in evaluation scores and no visible distortion**, though...
Hi. Sorry for the late response. We haven't encountered divergence in training, but we do have a fix to the normal loss in MoGe-2 that improves theoretical stability. MoGe used the...
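The website shows MoGe-1 results. The local code now runs MoGe-2 by default, and because its training data differs, its mask prediction preferences differ from MoGe-1's. This pattern is caused by a mask estimation error: the model cannot tell whether the background should be kept as a solid wall or removed as non-solid background. If you find that the mask has failed, you can set `apply_mask` to `False` and use only the raw depth.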
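Accuracy will not get worse. The only difference `apply_mask` makes is whether the depth of the sky or of a solid-color background is replaced with inf; it does not affect the depth prediction itself. However, for images that really do contain sky, if you don't apply the mask, the sky region keeps meaningless depth values, which will look rather odd.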
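To make the effect of the mask concrete, here is a minimal NumPy sketch of what "replacing masked-out depth with inf" means. The function name `apply_validity_mask` and the toy arrays are made up for illustration; this is not MoGe's actual code, only the behaviour described above.

```python
import numpy as np

def apply_validity_mask(depth: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Return a copy of `depth` where pixels with mask == False become +inf.

    Hypothetical helper: pixels that pass the mask keep their predicted
    depth unchanged, so masking never alters the valid depth values.
    """
    return np.where(mask, depth, np.inf)

# Toy 2x2 example: the bottom-left pixel stands in for sky/background.
depth = np.array([[1.5, 2.0],
                  [37.0, 3.0]], dtype=np.float32)
mask = np.array([[True, True],
                 [False, True]])

masked = apply_validity_mask(depth, mask)
```

The raw `depth` array is left untouched, so skipping the mask simply means using `depth` directly, including the meaningless value in the masked-out pixel.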
Hi! The normalized camera intrinsics are defined such that the top-left corner of the image corresponds to (0, 0) and the bottom-right corner corresponds to (1, 1), whereas pixel-space intrinsics...
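As a rough illustration of that convention, the sketch below converts a normalized intrinsics matrix to pixel space and computes the field of view directly from the normalized focal lengths. The helper names are hypothetical, not part of the MoGe API. It assumes pixel coordinates whose top-left image corner is (0, 0) and bottom-right corner is (W, H), and the FOV formula additionally assumes a centered principal point.

```python
import numpy as np

def normalized_to_pixel_intrinsics(K_norm, width, height):
    """Scale normalized intrinsics (image spans [0, 1] x [0, 1]) to pixels.

    Hypothetical helper: the first row is multiplied by the image width,
    the second row by the image height.
    """
    scale = np.diag([float(width), float(height), 1.0])
    return scale @ np.asarray(K_norm, dtype=np.float64)

def fov_from_normalized(K_norm):
    """Horizontal and vertical FOV in radians from normalized intrinsics.

    The normalized image has width and height 1, so the half-extent is 0.5.
    Assumes the principal point is at the image center.
    """
    K = np.asarray(K_norm, dtype=np.float64)
    fov_x = 2.0 * np.arctan(0.5 / K[0, 0])
    fov_y = 2.0 * np.arctan(0.5 / K[1, 1])
    return fov_x, fov_y

# Example: a 640x480 image with a 320-pixel focal length on both axes.
K_norm = np.array([[0.5, 0.0,       0.5],
                   [0.0, 2.0 / 3.0, 0.5],
                   [0.0, 0.0,       1.0]])
K_pix = normalized_to_pixel_intrinsics(K_norm, width=640, height=480)
fov_x, fov_y = fov_from_normalized(K_norm)
```

Plugging normalized focal lengths into a pixel-space FOV formula (dividing the image width in pixels by a focal length near 1) is what produces implausible values close to 180 degrees.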
> Thank you for the reply! Are (H,W) the size of neural network inputs or the original image size? For the current demo of camera intrinsic prediction, are the output...
Hi. It looks like you're interpreting the predicted intrinsics as if they were in pixel space, which leads to an incorrect FOV computation—nearly 180 degrees. You're currently using: ``` fov_x_rad...
I'm sorry, but the model is not expected to predict an accurate scale for this image. Estimating metric scale requires the model to recognize common objects with known size as...
We use `conv_transpose` for the first three upsamplers and `bilinear` for the last one, as specified in the model configuration from the pretrained checkpoint: ``` "resamplers": ["conv_transpose", "conv_transpose", "conv_transpose", "bilinear"]...