Questions about using a custom dataset
Hello, I'm also trying to train NeRF on a 360° captured scene (the AIST++ dataset, https://aistdancedb.ongaaccel.jp), and I had some questions related to formatting the dataset and setting the configuration:
- For real, 360° captured scenes like the AIST++ dataset, should I be using the "LLFF" or the "blender" dataset format?
- If one uses the "LLFF" dataset format and the --spherify flag (as recommended in the README.md), how can one generate the render_poses for a 360° free-viewpoint video such that they lie in the same plane as the original 8 cameras?
- The render_poses output by the spherify method do not lie in the same plane as my 8 cameras do, as shown below in the visualization of the camera poses with pytransform3d.

- When using the --no_ndc flag, how should we set the near and far bounds?
- I tried using COLMAP for these, but the dataset I'm trying to use (the AIST++ dataset, https://aistdancedb.ongaaccel.jp) only has 8 views and therefore doesn't pick up enough keypoints for a reconstruction. Would it be OK to set near and far to a small enough and a large enough value, respectively (e.g. 0 and a value larger than the distance between opposite cameras)?
- Can near and far be understood as the closest and furthest distance along the camera axis (z-axis) at which there is some scene content? (A rough sketch of this reading follows the list.)
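To make that last question concrete, here is a rough sketch of what I mean, assuming cam_centers holds the 8 camera positions, view_dirs their unit optical-axis directions in world coordinates, and scene_pts a handful of points roughly bounding the content (e.g. the corners of a box around the stage); all of these names are just illustrative, not part of the NeRF code:

```python
import numpy as np

def estimate_near_far(cam_centers, view_dirs, scene_pts, pad=0.1):
    """Depth range of the scene content along each camera's optical axis (illustrative)."""
    depths = []
    for c, d in zip(cam_centers, view_dirs):
        # Signed depth of every scene point along this camera's viewing direction.
        depths.append((scene_pts - c) @ d)
    depths = np.concatenate(depths)
    near = max(depths.min(), 1e-3)   # keep near strictly positive
    far = depths.max()
    return (1.0 - pad) * near, (1.0 + pad) * far
```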
Thank you in advance for reading this long post!
I also encountered the same problem. I don't know how to solve it.
I also had the same issue, and my solution was to change the calculation of the up vector in the spherify_poses function. Instead of averaging the camera-to-center vectors to find the up vector, I calculate the normal of the plane fitted to the camera positions. This works only for poses that lie roughly in a plane, not for poses on a hemisphere as in most 360° scenes. See the images below, where the left plots are of a scene captured along a hemispherical trajectory and the right ones are of a scene captured along a circular trajectory. The rendered poses are in blue and the transformed camera poses are in green.
With original up vector calculation:
With proposed up vector calculation:
I calculated the new up vector as follows:
# SVD of the centered camera positions: the left-singular vector with the
# smallest singular value is the normal of the best-fit plane through the cameras.
svd = np.linalg.svd((poses[:, :3, 3] - center).T)
up = svd[0][:, -1]
You can also switch automatically between the two up-vector calculations by checking the standard deviation of the singular values (svd[1]), which is higher for circular captures and lower for hemispherical ones; a sketch of this switch is below. I found that a threshold of 8 works for the datasets I have tested.
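A minimal sketch of that switch, assuming poses is an (N, 3, 4)-or-wider array of camera-to-world matrices and center is the scene center as in spherify_poses; the helper name compute_up and the default threshold value are just illustrative:

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def compute_up(poses, center, std_threshold=8.0):
    # Camera positions relative to the scene center.
    offsets = poses[:, :3, 3] - center
    u, s, _ = np.linalg.svd(offsets.T)
    if np.std(s) > std_threshold:
        # Spread-out singular values -> roughly circular (planar) capture:
        # use the normal of the plane fitted to the camera positions.
        return u[:, -1]
    # Otherwise (roughly hemispherical capture) fall back to averaging the
    # center-to-camera vectors, as in the original spherify_poses.
    return normalize(offsets.mean(axis=0))
```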
I hope that helps. :)
These plots are very useful for clarification and visualization; it would be nice if you could give more detail on the plotting method.
@WeiLi-THU For the visualization, I used a library from GitHub called extrinsic2pyramid.
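A minimal sketch of using it, assuming the CameraPoseVisualizer class and its extrinsic2pyramid/show methods from that repo's demo script; the import path, plot ranges, and the poses variable below are placeholders to adapt to your setup:

```python
import numpy as np
from util.camera_pose_visualizer import CameraPoseVisualizer  # path as laid out in the repo

# Plot ranges for x, y, z are placeholders; pick them to cover your scene.
visualizer = CameraPoseVisualizer([-5, 5], [-5, 5], [0, 5])

# poses: iterable of 4x4 camera-to-world extrinsic matrices.
for c2w in poses:
    visualizer.extrinsic2pyramid(c2w, 'c', 1)  # draw each camera as a small pyramid

visualizer.show()
```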
Has anybody solved this issue regarding the near and far bounds?