How to make the background transparent
Many issues propose ways to make the background transparent, such as adding a mask or passing --pipeline.model.background_color "random", but I have not found any specific guidelines telling me exactly what to do.
I already know that a mask is a single-channel black-and-white image. If I add a mask_path: 1. How do I downscale the masks folder to match the images_2, images_4, and images_8 folders (I saw someone using ns-process-data, but I don't know the specific command)? 2. Do I need to manually add a mask_path for each image in the transforms.json file? With hundreds of images, this looks very cumbersome. Is there a convenient way to do this?
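For the second question, the per-frame entries do not have to be written by hand. Below is a minimal sketch, not an official nerfstudio tool, that adds a mask_path to every frame of transforms.json. It assumes a layout that may differ from yours: masks live in a masks/ folder next to transforms.json, and each mask has the same file name stem as its image with a .png extension. The function name add_mask_paths and its parameters are made up for this illustration.

```python
import json
from pathlib import Path


def add_mask_paths(transforms_file, masks_dir="masks", mask_ext=".png"):
    """Add a mask_path entry to every frame whose mask file exists.

    Assumption: the mask for images/frame_00001.jpg is stored as
    <masks_dir>/frame_00001<mask_ext>, relative to transforms.json.
    Returns the list of expected mask paths that were not found.
    """
    transforms_file = Path(transforms_file)
    meta = json.loads(transforms_file.read_text())

    missing = []
    for frame in meta["frames"]:
        # Derive the mask name from the image name, e.g. frame_00001
        stem = Path(frame["file_path"]).stem
        rel_mask = Path(masks_dir) / f"{stem}{mask_ext}"
        if (transforms_file.parent / rel_mask).exists():
            frame["mask_path"] = rel_mask.as_posix()
        else:
            missing.append(rel_mask.as_posix())

    # Overwrite the file in place; keep a backup copy if unsure
    transforms_file.write_text(json.dumps(meta, indent=4))
    return missing
```

Calling add_mask_paths("workspace/data/transforms.json") would update the file in place and return any masks it could not find, so frames without a mask are easy to spot before training.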
If I use only the command-line option, does that mean there is no need to add mask_path? And does that mean the images in the images folder should already have a transparent background? Following this reasoning, I tried that approach and the output looked very bad. I wonder whether I missed a key point.
If someone could guide me, I would be very grateful. I have just started learning to use nerfstudio, and I have read many issues, but I can't understand them.😥
If someone could share a dataset with transparent backgrounds to help me better understand how such a dataset is composed, I would be very, very grateful.😊
I am a bit confused. If there is any mistake in my thinking, please correct me.🥰
You can use other models to obtain mask images, and then combine them with the RGB images to form 4-channel transparent images that contain only the target object. However, nerfstudio does not support 4-channel image input. How should the source code be modified?
@tree8888 Can you try https://github.com/nerfstudio-project/nerfstudio/pull/2165 first?
Note: ns-train nerfacto --pipeline.model.background_color "random" should be enough. For example data, see e.g. the lego dataset (RGBA images; no need for mask_path in transforms.json for now).
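To illustrate why a random background colour can work with RGBA images, here is a conceptual sketch of alpha compositing for a single pixel. It is not nerfstudio's actual code; the function names composite and random_background are made up for this example, and colour values are assumed to be floats in [0, 1]. The idea is that a transparent pixel takes on whatever background is drawn, so a model can only match it consistently by predicting no density there.

```python
import random


def composite(rgb, alpha, background):
    """Blend a foreground colour over a background using its alpha value."""
    return tuple(alpha * c + (1.0 - alpha) * b for c, b in zip(rgb, background))


def random_background(rng):
    """Draw a fresh random RGB colour, as a 'random' background would."""
    return (rng.random(), rng.random(), rng.random())


rng = random.Random(0)
bg = random_background(rng)

# Fully opaque pixel: the background has no influence on the target colour.
opaque = composite((0.8, 0.2, 0.1), 1.0, bg)

# Fully transparent pixel: the target colour is exactly the random background.
transparent = composite((0.8, 0.2, 0.1), 0.0, bg)
```

Because the background changes from draw to draw, any colour baked into the scene behind the object would be penalised, which is why the images themselves need a real alpha channel for this option to help.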
My training fails to start with the alpha-channel dataset:
root@024b293b337b:/# ns-train nerfacto --data workspace/data/ --output-dir workspace/outputs/ --pipeline.datamanager.train-num-rays-per-batch 4096 --pipeline.datamanager.eval-num-rays-per-batch 4096 --pipeline.model.background-color "random" --pipeline.model.camera-optimizer.mode "off" --pipeline.datamanager.masks-on-gpu True nerfstudio-data --scale-factor 1.0 --orientation-method none --center-method none --auto-scale-poses False --train-split-fraction 1.0
[20:38:07] Using --data alias for --data.pipeline.datamanager.data train.py:230
──────────────────────────────────────────────────────── Config ────────────────────────────────────────────────────────
TrainerConfig(
_target=<class 'nerfstudio.engine.trainer.Trainer'>,
output_dir=PosixPath('workspace/outputs'),
method_name='nerfacto',
experiment_name=None,
project_name='nerfstudio-project',
timestamp='2025-09-01_203807',
machine=MachineConfig(seed=42, num_devices=1, num_machines=1, machine_rank=0, dist_url='auto', device_type='cuda'),
logging=LoggingConfig(
relative_log_dir=PosixPath('.'),
steps_per_log=10,
max_buffer_size=20,
local_writer=LocalWriterConfig(
_target=<class 'nerfstudio.utils.writer.LocalWriter'>,
enable=True,
stats_to_track=(
<EventName.ITER_TRAIN_TIME: 'Train Iter (time)'>,
<EventName.TRAIN_RAYS_PER_SEC: 'Train Rays / Sec'>,
<EventName.CURR_TEST_PSNR: 'Test PSNR'>,
<EventName.VIS_RAYS_PER_SEC: 'Vis Rays / Sec'>,
<EventName.TEST_RAYS_PER_SEC: 'Test Rays / Sec'>,
<EventName.ETA: 'ETA (time)'>
),
max_log_size=10
),
profiler='basic'
),
viewer=ViewerConfig(
relative_log_filename='viewer_log_filename.txt',
websocket_port=None,
websocket_port_default=7007,
websocket_host='0.0.0.0',
num_rays_per_chunk=32768,
max_num_display_images=512,
quit_on_train_completion=False,
image_format='jpeg',
jpeg_quality=75,
make_share_url=False,
camera_frustum_scale=0.1,
default_composite_depth=True
),
pipeline=VanillaPipelineConfig(
_target=<class 'nerfstudio.pipelines.base_pipeline.VanillaPipeline'>,
datamanager=ParallelDataManagerConfig(
_target=<class 'nerfstudio.data.datamanagers.parallel_datamanager.ParallelDataManager'>,
data=PosixPath('workspace/data'),
masks_on_gpu=True,
images_on_gpu=False,
dataparser=NerfstudioDataParserConfig(
_target=<class 'nerfstudio.data.dataparsers.nerfstudio_dataparser.Nerfstudio'>,
data=PosixPath('.'),
scale_factor=1.0,
downscale_factor=None,
scene_scale=1.0,
orientation_method='none',
center_method='none',
auto_scale_poses=False,
eval_mode='fraction',
train_split_fraction=1.0,
eval_interval=8,
depth_unit_scale_factor=0.001,
mask_color=None,
load_3D_points=False
),
train_num_rays_per_batch=4096,
train_num_images_to_sample_from=-1,
train_num_times_to_repeat_images=-1,
eval_num_rays_per_batch=4096,
eval_num_images_to_sample_from=-1,
eval_num_times_to_repeat_images=-1,
eval_image_indices=(0,),
collate_fn=<function nerfstudio_collate at 0x7fd62342a200>,
camera_res_scale_factor=1.0,
patch_size=1,
camera_optimizer=None,
pixel_sampler=PixelSamplerConfig(
_target=<class 'nerfstudio.data.pixel_samplers.PixelSampler'>,
num_rays_per_batch=4096,
keep_full_image=False,
is_equirectangular=False,
ignore_mask=False,
fisheye_crop_radius=None,
rejection_sample_mask=True,
max_num_iterations=100
),
num_processes=1,
queue_size=2,
max_thread_workers=None
),
model=NerfactoModelConfig(
_target=<class 'nerfstudio.models.nerfacto.NerfactoModel'>,
enable_collider=True,
collider_params={'near_plane': 2.0, 'far_plane': 6.0},
loss_coefficients={'rgb_loss_coarse': 1.0, 'rgb_loss_fine': 1.0},
eval_num_rays_per_chunk=32768,
prompt=None,
near_plane=0.05,
far_plane=1000.0,
background_color='random',
hidden_dim=64,
hidden_dim_color=64,
hidden_dim_transient=64,
num_levels=16,
base_res=16,
max_res=2048,
log2_hashmap_size=19,
features_per_level=2,
num_proposal_samples_per_ray=(256, 96),
num_nerf_samples_per_ray=48,
proposal_update_every=5,
proposal_warmup=5000,
num_proposal_iterations=2,
use_same_proposal_network=False,
proposal_net_args_list=[
{'hidden_dim': 16, 'log2_hashmap_size': 17, 'num_levels': 5, 'max_res': 128, 'use_linear': False},
{'hidden_dim': 16, 'log2_hashmap_size': 17, 'num_levels': 5, 'max_res': 256, 'use_linear': False}
],
proposal_initial_sampler='piecewise',
interlevel_loss_mult=1.0,
distortion_loss_mult=0.002,
orientation_loss_mult=0.0001,
pred_normal_loss_mult=0.001,
use_proposal_weight_anneal=True,
use_appearance_embedding=True,
use_average_appearance_embedding=True,
proposal_weights_anneal_slope=10.0,
proposal_weights_anneal_max_num_iters=1000,
use_single_jitter=True,
predict_normals=False,
disable_scene_contraction=False,
use_gradient_scaling=False,
implementation='tcnn',
appearance_embed_dim=32,
average_init_density=0.01,
camera_optimizer=CameraOptimizerConfig(
_target=<class 'nerfstudio.cameras.camera_optimizers.CameraOptimizer'>,
mode='off',
trans_l2_penalty=0.01,
rot_l2_penalty=0.001,
optimizer=None,
scheduler=None
)
)
),
optimizers={
'proposal_networks': {
'optimizer': AdamOptimizerConfig(
_target=<class 'torch.optim.adam.Adam'>,
lr=0.01,
eps=1e-15,
max_norm=None,
weight_decay=0
),
'scheduler': ExponentialDecaySchedulerConfig(
_target=<class 'nerfstudio.engine.schedulers.ExponentialDecayScheduler'>,
lr_pre_warmup=1e-08,
lr_final=0.0001,
warmup_steps=0,
max_steps=200000,
ramp='cosine'
)
},
'fields': {
'optimizer': AdamOptimizerConfig(
_target=<class 'torch.optim.adam.Adam'>,
lr=0.01,
eps=1e-15,
max_norm=None,
weight_decay=0
),
'scheduler': ExponentialDecaySchedulerConfig(
_target=<class 'nerfstudio.engine.schedulers.ExponentialDecayScheduler'>,
lr_pre_warmup=1e-08,
lr_final=0.0001,
warmup_steps=0,
max_steps=200000,
ramp='cosine'
)
},
'camera_opt': {
'optimizer': AdamOptimizerConfig(
_target=<class 'torch.optim.adam.Adam'>,
lr=0.001,
eps=1e-15,
max_norm=None,
weight_decay=0
),
'scheduler': ExponentialDecaySchedulerConfig(
_target=<class 'nerfstudio.engine.schedulers.ExponentialDecayScheduler'>,
lr_pre_warmup=1e-08,
lr_final=0.0001,
warmup_steps=0,
max_steps=5000,
ramp='cosine'
)
}
},
vis='viewer',
data=PosixPath('workspace/data'),
prompt=None,
relative_model_dir=PosixPath('nerfstudio_models'),
load_scheduler=True,
steps_per_save=2000,
steps_per_eval_batch=500,
steps_per_eval_image=500,
steps_per_eval_all_images=25000,
max_num_iterations=30000,
mixed_precision=True,
use_grad_scaler=False,
save_only_latest_checkpoint=True,
load_dir=None,
load_step=None,
load_config=None,
load_checkpoint=None,
log_gradients=False,
gradient_accumulation_steps={},
start_paused=False
)
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Saving config to: workspace/outputs/data/nerfacto/2025-09-01_203807/config.yml experiment_config.py:136
Saving checkpoints to: workspace/outputs/data/nerfacto/2025-09-01_203807/nerfstudio_models trainer.py:142
Auto image downscale factor of 1 nerfstudio_dataparser.py:484
Started threads
Loading data batch ━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18% 0:00:04╭─────────────── viser ───────────────╮
│ ╷ │
│ HTTP │ http://0.0.0.0:7007 │
│ Websocket │ ws://0.0.0.0:7007 │
│ ╵ │
╰─────────────────────────────────────╯
Loading data batch ━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━ 41% 0:00:04[NOTE] Not running eval iterations since only viewer is enabled.
Use --vis {wandb, tensorboard, viewer+wandb, viewer+tensorboard} to run with eval.
No Nerfstudio checkpoint to load, so training from scratch.
Disabled comet/tensorboard/wandb event writers
Loading data batch ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:07
(viser) Connection opened (0, 1 total), 1096 persistent messages
(viser) Connection closed (0, 0 total)
^CProcess ForkProcess-6:
Process ForkProcess-5:
Process ForkProcess-2:
Process ForkProcess-4:
Process ForkProcess-8:
Process ForkProcess-1:
Process ForkProcess-3:
Process ForkProcess-7:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/nerfstudio/scripts/train.py", line 189, in launch
main_func(local_rank=0, world_size=world_size, config=config)
File "/usr/local/lib/python3.10/dist-packages/nerfstudio/scripts/train.py", line 100, in train_loop
trainer.train()
File "/usr/local/lib/python3.10/dist-packages/nerfstudio/engine/trainer.py", line 266, in train
loss, loss_dict, metrics_dict = self.train_iteration(step)
File "/usr/local/lib/python3.10/dist-packages/nerfstudio/utils/profiler.py", line 111, in inner
out = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/nerfstudio/engine/trainer.py", line 502, in train_iteration
_, loss_dict, metrics_dict = self.pipeline.get_train_loss_dict(step=step)
File "/usr/local/lib/python3.10/dist-packages/nerfstudio/utils/profiler.py", line 111, in inner
out = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/nerfstudio/pipelines/base_pipeline.py", line 299, in get_train_loss_dict
ray_bundle, batch = self.datamanager.next_train(step)
File "/usr/local/lib/python3.10/dist-packages/nerfstudio/data/datamanagers/parallel_datamanager.py", line 291, in next_train
bundle, batch = self.data_queue.get()
File "/usr/local/lib/python3.10/dist-packages/multiprocess/queues.py", line 106, in get
res = self._recv_bytes()
File "/usr/local/lib/python3.10/dist-packages/multiprocess/connection.py", line 219, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/local/lib/python3.10/dist-packages/multiprocess/connection.py", line 417, in _recv_bytes
buf = self._recv(4)
File "/usr/local/lib/python3.10/dist-packages/multiprocess/connection.py", line 382, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt
Printing profiling stats, from longest to shortest duration in seconds
Trainer.train_iteration: 345.9687
VanillaPipeline.get_train_loss_dict: 345.9350
Any suggestions, @hoanhle @biyuefeng?