instant-ngp
Reloading a new dataset after training on a first one yields a RuntimeError: illegal memory access
Hello again,
I am trying to train a NeRF model on two datasets that represent the same scene with different images. I would like to load the first dataset, train the NeRF, then load the second dataset (without re-initializing the model) and continue training on that second dataset.
The following code snippet returns a RuntimeError:
import numpy as np
import pyngp as ngp # noqa
# -- create model
testbed = ngp.Testbed(ngp.TestbedMode.Nerf)
network = "configs/nerf/base.json"
testbed.reload_network_from_file(network)
aabb_scale=4
scale=0.33333
fx=100
fy=100
cx=100
cy=100
k1=0.0
k2=0.0
p1=0.0
p2=0.0
rgb = np.ones((200,200,4), dtype=np.uint8)
depth = np.ones((200,200), dtype=np.float32)
c2w = np.eye(4)
testbed.create_empty_nerf_dataset(1, aabb_scale, False)
testbed.nerf.training.set_image(0, rgb, depth, scale)
testbed.nerf.training.set_camera_intrinsics(0, fx, fy, cx, cy, k1, k2, p1, p2)
testbed.nerf.training.set_camera_extrinsics(0, c2w[:3,:], True)
testbed.nerf.training.n_images_for_training = 1
testbed.shall_train = True
batch_size=256000
for i in range(200):
    testbed.train(batch_size)
testbed.clear_training_data()
# load second dataset
testbed.create_empty_nerf_dataset(2, aabb_scale, False)
for i in range(2):
    testbed.nerf.training.set_image(i, rgb, depth, scale)
    testbed.nerf.training.set_camera_intrinsics(i, fx, fy, cx, cy, k1, k2, p1, p2)
    testbed.nerf.training.set_camera_extrinsics(i, c2w[:3,:], True)
testbed.nerf.training.n_images_for_training = 2
testbed.shall_train = True
batch_size=256000
for i in range(200):
    testbed.train(batch_size)
It returns:
Traceback (most recent call last):
File "debug_snippet.py", line 47, in <module>
testbed.train(batch_size)
RuntimeError: ~/instant-ngp/dependencies/tiny-cuda-nn/include/tiny-cuda-nn/gpu_memory.h:285 cudaMemcpy(host_data, data(), num_elements * sizeof(T), cudaMemcpyDeviceToHost)
failed with error an illegal memory access was encountered
Could not free memory: ~/instant-ngp/dependencies/tiny-cuda-nn/include/tiny-cuda-nn/gpu_memory.h:142 cudaFree(rawptr)
failed with error an illegal memory access was encountered
After some debugging, I found that the issue might originate at this line.
I also found that training works if I call reset() before training the second time. However, I do not want to reset my entire NeRF model, since I hope to fine-tune it on the second dataset.
Do you have any pointers on how I can achieve this?
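In case it helps with reproducing this, the two loading phases in my snippet can be condensed into one helper, which makes it easier to compare the first and second load. This is only a sketch: it assumes the same pyngp calls used in the snippet above and nothing else, and `load_dataset`, `frames`, and the dictionary keys are names I made up for illustration.

```python
def load_dataset(testbed, frames, aabb_scale=4, scale=0.33333):
    """Create an empty NeRF dataset on `testbed` and fill it with `frames`.

    `frames` (hypothetical structure) is a list of dicts with the keys:
      "rgb"        - HxWx4 uint8 image
      "depth"      - HxW float32 depth map
      "c2w"        - 3x4 camera-to-world matrix
      "intrinsics" - tuple (fx, fy, cx, cy, k1, k2, p1, p2)
    Only calls that already appear in the snippet above are used.
    """
    n_images = len(frames)
    testbed.create_empty_nerf_dataset(n_images, aabb_scale, False)

    training = testbed.nerf.training
    for i, frame in enumerate(frames):
        training.set_image(i, frame["rgb"], frame["depth"], scale)
        training.set_camera_intrinsics(i, *frame["intrinsics"])
        training.set_camera_extrinsics(i, frame["c2w"], True)

    training.n_images_for_training = n_images
    return n_images


# Intended usage with a real testbed (requires pyngp, so left as comments):
#   load_dataset(testbed, first_frames)
#   ... train ...
#   testbed.clear_training_data()
#   load_dataset(testbed, second_frames)
#   ... train again; this is where the illegal memory access shows up ...
```

The helper does not change the behaviour or work around the crash. It only removes the duplicated setup so that the difference between the two phases is limited to the number of frames.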