Evaluating the results from a video using ns-eval
I am currently working on a project where I need to reconstruct an object from a video. To start, I extracted frames from the video at three different settings: 100, 300, and 600 frames. I then trained a model with nerfacto (default settings) for each setting and noticed that the quality of the exported .obj file decreased as I increased the number of frames. However, I had no quantitative metrics to compare the experiments, so I ran ns-eval on each one to get PSNR, SSIM, etc. Interestingly, the evaluation metrics improved as I increased the number of frames, while the quality of the 3D object did not: the text on the object became less readable with more frames and eventually started to vanish.
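For context, this is how I collected the numbers side by side. It is a minimal sketch that assumes each run was evaluated with `ns-eval --load-config <config.yml> --output-path <name>.json`, and that the output JSON holds a "results" dict with "psnr"/"ssim" keys; the file names and field names here are assumptions, so check your own output files:

```python
import json
from pathlib import Path

# Hypothetical output files, one per experiment.
for name in ["frames_100", "frames_300", "frames_600"]:
    data = json.loads(Path(f"{name}.json").read_text())
    metrics = data["results"]  # assumed schema of the ns-eval JSON
    print(f"{name}: PSNR={metrics['psnr']:.2f}  SSIM={metrics['ssim']:.3f}")
```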
I believe the reason is that ns-eval compares images rendered by the trained model (the NeRF) against the original footage, and the more frames I fed in, the more similar the renders became to the video (maybe not?). In addition, I did not use a single fixed hold-out set across experiments: in the 100-frame experiment I used 10 images for validation, while in the 600-frame experiment I used 60. That means the ns-eval results are not comparable, since the validation images differ between experiments. And even with a shared hold-out set, I suspect the ns-eval numbers would still not reflect the quality of the 3D model itself.
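To make the numbers at least comparable between experiments, my idea is to hold out the same moments of the source video in every run, regardless of how many frames were extracted. This is just a sketch of that idea, not a nerfstudio API; the function name and the assumption of uniformly sampled frames are mine:

```python
# Pick fixed timestamps in the source video and, for every experiment,
# hold out the extracted frame closest to each timestamp. That way all
# three runs are evaluated on (nearly) the same views.
def holdout_indices(num_frames: int, video_len_s: float,
                    eval_times_s: list[float]) -> list[int]:
    """Map fixed video timestamps to frame indices for this extraction rate."""
    # Frames are assumed to be sampled uniformly over the video.
    frame_times = [i * video_len_s / num_frames for i in range(num_frames)]
    return sorted({min(range(num_frames),
                       key=lambda i: abs(frame_times[i] - t))
                   for t in eval_times_s})

eval_times = [1.0, 3.5, 6.0, 8.2]  # same moments in every experiment
for n in (100, 300, 600):
    print(n, holdout_indices(n, video_len_s=10.0, eval_times_s=eval_times))
```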
My question is: how can I quantify the quality of the exported .obj file in each experiment, rather than the rendered images? Or have I misunderstood something here? Could you please help me clarify these points?
PS: I am looking for an approach to quantitatively evaluate the 3D model.
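To make concrete what I mean by evaluating the 3D model: one approach I have been considering is a symmetric Chamfer distance between points sampled on the exported mesh and on a reference geometry. This sketch assumes a reference mesh or scan exists and that both meshes are in the same coordinate frame (they would need to be aligned first, e.g. with ICP); the file names are placeholders:

```python
import numpy as np
import trimesh
from scipy.spatial import cKDTree

def chamfer(mesh_a: trimesh.Trimesh, mesh_b: trimesh.Trimesh,
            n_points: int = 100_000) -> float:
    """Mean of the two one-sided nearest-neighbour point distances."""
    pts_a = mesh_a.sample(n_points)  # random points on the surface of A
    pts_b = mesh_b.sample(n_points)
    d_ab, _ = cKDTree(pts_b).query(pts_a)  # A -> B distances
    d_ba, _ = cKDTree(pts_a).query(pts_b)  # B -> A distances
    return 0.5 * (d_ab.mean() + d_ba.mean())

recon = trimesh.load("export_100_frames.obj", force="mesh")  # placeholder path
ref   = trimesh.load("reference_scan.ply", force="mesh")     # placeholder path
print(f"Chamfer distance: {chamfer(recon, ref):.6f} (scene units)")
```

A lower Chamfer distance would then indicate a reconstruction closer to the reference, independent of how the renders score on PSNR/SSIM. My problem is that I am not sure this is the right approach, or what to do when no reference scan exists.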