How can I use render.py to generate a set of rendered images along a camera path that I provide myself?
I have found that images.bin and cameras.bin (COLMAP format) hold the camera extrinsics and intrinsics, respectively, and that they can be read with read_extrinsics_binary and read_intrinsics_binary.
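For reference, here is a minimal sketch of loading both files with the repository's scene/colmap_loader.py helpers (the sparse/0 path is just COLMAP's usual output layout and may differ in your setup):

```python
# Minimal sketch: load COLMAP extrinsics/intrinsics with the repo's own helpers.
# The "sparse/0" layout is COLMAP's default output and is only an assumption here.
from scene.colmap_loader import read_extrinsics_binary, read_intrinsics_binary

cam_extrinsics = read_extrinsics_binary("sparse/0/images.bin")   # per-image poses (qvec, tvec, ...)
cam_intrinsics = read_intrinsics_binary("sparse/0/cameras.bin")  # camera models (PINHOLE params, ...)

print(cam_intrinsics[1])  # Camera(id=1, model='PINHOLE', width=..., height=..., params=[...])
print(cam_extrinsics[1])  # Image(id=1, qvec=..., tvec=..., camera_id=..., name=..., xys=..., point3D_ids=...)
```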
By default, cameras.bin (intrinsics) is read into a dict whose entries look like:
id: 1
model: PINHOLE
width: 1912
height: 1075
params: [ 994.91473032 1035.45535543 956. 537.5 ]
By default, images.bin (extrinsics) holds one entry per input image, each looking like:
id: 1
qvec: [ 9.87182098e-01 4.67789801e-04 -1.59166096e-01 -1.17234620e-02]
tvec: [ 0.71418639 -0.12678332 2.64190678]
camera_id: 1
name: frame_0.png
xys: [[1.00696456e+03 4.94545990e-02]
[1.02521076e+03 3.03924482e-01]
[1.14496957e+03 6.49794588e-01]
...
[7.69126962e+02 5.17357767e+02]
[1.19740032e+03 2.12154789e+02]
[1.36614270e+03 6.43028234e+02]]
# xys.shape: (8807, 2)
point3D_ids: [-1 -1 -1 ... -1 -1 -1]
# len(point3D_ids): 8807
If I provide a specific camera path myself, I have to change images.bin. But I don't know how to edit xys and point3D_ids when a viewpoint is not among the viewpoints of my input images.
I did not edit images.bin and cameras.bin, or the .txt files.
But I can tell you how to use your own camera path (a sketch follows this list):
1. Define these three parameters: azimuth, elevation, distance.
2. Use a function to convert them to R, T: (R, t) = LookAt(azimuth, elevation, distance) (e.g. the one in pytorch3d, or write your own).
3. CameraRt_to_WorldView(R, t) and FullProject(WorldView, nearZ, farZ, fovX, fovY).
4. Use the output of step 3 as the GS input.
All the code is in: https://github.com/graphdeco-inria/gaussian-splatting/issues/350
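A minimal sketch of those steps, assuming the repository's getWorld2View2 and getProjectionMatrix helpers from utils/graphics_utils.py stand in for CameraRt_to_WorldView and FullProject (the look_at_RT helper and all numbers below are illustrative, not the exact code from issue #350):

```python
# Sketch only: build (R, t) from azimuth/elevation/distance and turn them into the
# world-view and full-projection matrices that the 3DGS rasterizer consumes.
# getWorld2View2 / getProjectionMatrix exist in utils/graphics_utils.py; the
# look_at_RT helper is an assumption, not part of the repository.
import numpy as np
import torch
from utils.graphics_utils import getWorld2View2, getProjectionMatrix

def look_at_RT(azimuth_deg, elevation_deg, distance, target=np.zeros(3)):
    """World-to-camera rotation R (3x3) and translation t (3,) in the COLMAP/OpenCV convention."""
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    # Camera position on a sphere around the target (world +Y is assumed to be "up").
    cam_pos = target + distance * np.array([np.cos(el) * np.sin(az),
                                            np.sin(el),
                                            np.cos(el) * np.cos(az)])
    forward = target - cam_pos
    forward /= np.linalg.norm(forward)                  # camera +Z looks at the target
    right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
    right /= np.linalg.norm(right)                      # degenerate at elevation = +/-90 deg
    down = np.cross(forward, right)                     # camera +Y points down (COLMAP)
    R_w2c = np.stack([right, down, forward])            # rows are the camera axes
    t = -R_w2c @ cam_pos
    return R_w2c, t

R_w2c, t = look_at_RT(azimuth_deg=30.0, elevation_deg=15.0, distance=3.0)

# 3DGS stores R as the camera-to-world rotation (it is transposed again inside
# getWorld2View2), so pass R_w2c.T here, exactly like scene/dataset_readers.py does.
world_view = torch.tensor(getWorld2View2(R_w2c.T, t), dtype=torch.float32).transpose(0, 1)
projection = getProjectionMatrix(znear=0.01, zfar=100.0, fovX=1.2, fovY=0.9).transpose(0, 1)
full_proj = world_view @ projection
camera_center = world_view.inverse()[3, :3]
```

These quantities (plus FoV and image size) are what the Camera / MiniCam objects in scene/cameras.py carry, so a viewpoint built this way can be passed to the render() call like any loaded camera.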
If your model is already trained, you don't need to care about the xys and point3D_ids; changing tvec and qvec will allow you to render a view from a novel viewpoint :)
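For example (a rough sketch, not code from this thread; the sparse/0 path and the 0.2 offset are placeholders), you can take an existing COLMAP entry, nudge its pose, and convert it to the R / T that the data loader hands to the Camera objects:

```python
# Rough sketch: start from an existing view, perturb its pose, and convert it the
# same way scene/dataset_readers.py does before building Camera objects.
import numpy as np
from scene.colmap_loader import read_extrinsics_binary, qvec2rotmat

images = read_extrinsics_binary("sparse/0/images.bin")
img = images[1]                                  # any existing view as a starting point

qvec = img.qvec                                  # keep the original orientation ...
tvec = img.tvec + np.array([0.2, 0.0, 0.0])      # ... but shift the camera a little

R = np.transpose(qvec2rotmat(qvec))              # 3DGS stores the transposed (C2W) rotation
T = np.array(tvec)
# R and T can now go through getWorld2View2 / getProjectionMatrix, or into
# scene.cameras.Camera, just like the views produced by the standard loader.
```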
Hi, I have a similar question. I am trying to fuse two datasets generated by COLMAP (same scene, but different images), and I am also wondering what the variable point3D_ids is for. Is it used in render and SIBR? I only fuse the points3D.bin files from the two COLMAP outputs, simply interpolating the points from one file into the other.
Hi @yuedajiong,
I'd be interested if you could give more details on this point.
I'm trying to use a ground-truth trajectory (Augmented ICL-NUIM) to reconstruct a scene with 3DGS. Since the transforms shipped with this dataset are C2W and COLMAP expects W2C, I applied the transform used in the NeRF data-loading sequence to convert my data (your 3rd step), and I also converted from the OpenGL camera convention to the COLMAP one in the same step (see the sketch below). However, when rendering my 3DGS scene, I can see that my renders are not aligned with my ground-truth images.
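For reference, the conversion I'm applying is essentially the one from this repository's Blender/NeRF loader (readCamerasFromTransforms in scene/dataset_readers.py); a sketch, with pose_c2w as a placeholder for one of my dataset's 4x4 camera-to-world matrices:

```python
# Sketch of the C2W (OpenGL) -> W2C (COLMAP) conversion applied per frame.
import numpy as np

pose_c2w = np.eye(4)            # placeholder: one 4x4 C2W matrix from the dataset
c2w = np.array(pose_c2w, dtype=np.float64)
c2w[:3, 1:3] *= -1              # OpenGL (Y up, Z back) -> COLMAP (Y down, Z forward)
w2c = np.linalg.inv(c2w)        # camera-to-world -> world-to-camera

R = np.transpose(w2c[:3, :3])   # transposed, as scene/dataset_readers.py stores it
T = w2c[:3, 3]
```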
Is there something I'm missing? Do you know how to successfully use ground-truth data with 3DGS?
Thanks in advance, best
@leblond14u
https://github.com/WU-CVGL/MVControl-threestudio/blob/main/app_stage1.py (lines 55, 156, 208)
You can modify it and extend the 4 views to more.
If the inputs are azimuth, elevation and distance instead of R and t, it is much easier for a human to understand.
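For example (just an illustrative sketch, not taken from app_stage1.py), a simple orbit path in that parameterization:

```python
# Hypothetical sketch: an orbit path written as (azimuth, elevation, distance)
# triplets, which a LookAt-style helper (see the sketch earlier in this thread)
# can turn into R, t for each view.
num_views = 12
camera_path = [(360.0 * i / num_views, 15.0, 3.0) for i in range(num_views)]
```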
Some sample paths: camera_path.py.txt
Is this enough? If it helps, say thanks to me. :-)
Thanks @yuedajiong I'll check this now :)
I discovered something weird with the Gaussian renderer in the 3DGS implementation. I was trying to use the above-mentioned dataset and found that even with the right transforms (C2W to W2C, and the switch from the OpenGL camera convention to the COLMAP one), the render does not match the ground-truth image.
I guess it has something to do with the origin of the coordinate system or the scale of the map. I'm not sure yet what is causing this issue.
Sometimes the official GS produces some distant floating points; you need extra logic to clean these points up.
Hi @yuedajiong, I see that xys and point3D_ids are not used anywhere in the code. Do you know what they should be used for? I only have camera information for "K" multi-view images and would like to create a GS model from these.
Hi @sidsunny: Other implementations have too much garbage. Try my code, which includes:
- clear COLMAP code
- the clearest GS code, including the CUDA code
- even background-removal code
- a simple viewer, very lightweight
- ...
You can see all the attributes with something like the snippet below; not all attributes are used by GS.
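A hypothetical example (the sparse/0 path is just COLMAP's default output layout):

```python
# Inspect everything COLMAP stored per image. The Image records returned by the
# repo's read_extrinsics_binary are namedtuples, so _fields lists every attribute;
# as noted above, xys and point3D_ids are not read by the GS loader.
from scene.colmap_loader import read_extrinsics_binary

images = read_extrinsics_binary("sparse/0/images.bin")
print(next(iter(images.values()))._fields)
# ('id', 'qvec', 'tvec', 'camera_id', 'name', 'xys', 'point3D_ids')
```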
If my code is better, say thanks to me: hahahahaha.
@yuedajiong Thank you for your code! The problem I am facing is that COLMAP is not able to find good feature matches, so it cannot generate the camera information. I plan to use the default cameras provided with the images, but I am not sure whether those cameras are pinhole. Also, I do not have the point cloud required for the images, so I need to generate that too.
Your question/problem:
cannot compute the camera poses.
My answer/suggestion:
1. If the images are fixed, you can adjust the COLMAP parameters. First of all, please make sure you have used COLMAP correctly. (main solution)
2. If that fails, you can use another learning-based pose-estimation method, such as DUSt3R or others.
3. In fact, the ultimate algorithms will be camera-free (not just automatic); wait for me, wait for other researchers, or you can try it yourself.
AND:
Your question/problem: you have no point cloud.
My answer/suggestion: the initialized point cloud (from COLMAP) is just nice-to-have, not necessary; you can directly randomize it with a uniform distribution (see the sketch below).
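For instance, a minimal sketch of such a random initialization, modeled on what the repository's Blender/NeRF-synthetic loader (readNerfSyntheticInfo) does; BasicPointCloud and SH2RGB are the repo's own utilities, while the point count and box size here are placeholders:

```python
# Random point-cloud initialization when no points3D.bin is available; roughly what
# readNerfSyntheticInfo in scene/dataset_readers.py does. All numbers are placeholders.
import numpy as np
from utils.graphics_utils import BasicPointCloud
from utils.sh_utils import SH2RGB

num_pts = 100_000
xyz = np.random.random((num_pts, 3)) * 2.6 - 1.3      # uniform inside a box around the scene
shs = np.random.random((num_pts, 3)) / 255.0          # random SH DC terms -> near-gray colors
pcd = BasicPointCloud(points=xyz, colors=SH2RGB(shs), normals=np.zeros((num_pts, 3)))
# pcd can then be handed to GaussianModel.create_from_pcd just like a COLMAP cloud.
```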