
How to make the background transparent

Open tree8888 opened this issue 2 years ago • 3 comments

Many issues propose ways to make the background transparent, such as adding a mask or passing --pipeline.model.background_color "random", but I couldn't find any specific guidelines telling me exactly what to do.

I already know that a mask is a single-channel black-and-white image. If I add a mask_path: 1. How do I downscale the masks folder to match the images_2, images_4, and images_8 folders? (I saw someone using ns-process-data, but I don't know the specific command.) 2. Do I need to manually add a mask_path for each image in the transforms.json file? With hundreds of images, this looks very cumbersome. Is there a convenient way to do this?
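For the second point, the per-frame entries do not have to be typed by hand. Below is a minimal sketch, assuming each frame in transforms.json has a file_path and that every mask shares its image's file stem (for example images/frame_00001.jpg and masks/frame_00001.png). The function name add_mask_paths, the masks folder name, and the .png extension are assumptions for illustration, not something nerfstudio ships.

```python
import json
from pathlib import Path


def add_mask_paths(transforms_path, mask_dir="masks", mask_ext=".png"):
    """Add a mask_path entry to every frame of a transforms.json file.

    Assumption: masks live in `mask_dir` next to transforms.json and reuse
    the image's file stem, e.g. images/frame_00001.jpg -> masks/frame_00001.png.
    """
    transforms_path = Path(transforms_path)
    meta = json.loads(transforms_path.read_text())

    for frame in meta["frames"]:
        # Derive the mask file name from the image file name.
        stem = Path(frame["file_path"]).stem
        frame["mask_path"] = f"{mask_dir}/{stem}{mask_ext}"

    # Overwrite the file in place; keep a backup copy if that matters to you.
    transforms_path.write_text(json.dumps(meta, indent=4))
    return meta
```

Calling add_mask_paths("transforms.json") once from the dataset folder rewrites the file with a mask_path on every frame, however many images there are.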

If I use the command-line option alone, does that mean there is no need to add mask_path? And does that mean the images in the images folder should already have a transparent background? Following this reasoning, I tried it, and the output looked very bad. I wonder whether I missed a key point. [image]

If someone could guide me, I would be very grateful. I have just started learning to use nerfstudio, and I have read many issues, but I can't understand them.😥

If someone could share a dataset they used to train a model with a transparent background, so that I can better understand how the dataset is organized, I would be very, very grateful.😊

My mind is a bit confused. If there is any mistake in my thinking, please correct me.🥰

tree8888 avatar Oct 31 '23 15:10 tree8888

You can use another model to generate mask images, then combine them with the RGB images to form 4-channel transparent images containing only the target object. However, nerfstudio does not support 4-channel image input. How should the source code be modified?

biyuefeng avatar Nov 22 '23 07:11 biyuefeng

@tree8888 Can you try https://github.com/nerfstudio-project/nerfstudio/pull/2165 first?

Note: ns-train nerfacto --pipeline.model.background_color "random" should be enough. For example data, see e.g. the lego dataset (RGBA images; no need for mask_path in transforms.json for now).
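To illustrate why RGBA images plus a random background can work without masks: as far as I understand it, the ground-truth pixels are alpha-composited over a background color that changes between training batches, so the model cannot explain transparent regions with any fixed color and is pushed to leave them empty. The sketch below shows only that compositing arithmetic on a single pixel in plain Python. The function names are hypothetical and this is not nerfstudio's actual code.

```python
import random


def blend_with_background(rgba, background):
    """Composite one RGBA pixel (floats in [0, 1]) over an RGB background.

    Standard alpha blending: out = color * alpha + background * (1 - alpha).
    """
    r, g, b, a = rgba
    return tuple(c * a + bg * (1.0 - a) for c, bg in zip((r, g, b), background))


def random_background(rng=random):
    """Draw a random RGB background color, one value per channel."""
    return (rng.random(), rng.random(), rng.random())


# A fully transparent pixel takes whatever background was drawn this time,
# while a fully opaque pixel keeps its own color regardless of background.
bg = random_background()
transparent_pixel = blend_with_background((0.2, 0.4, 0.6, 0.0), bg)
opaque_pixel = blend_with_background((0.2, 0.4, 0.6, 1.0), bg)
```

Because the target color of a transparent pixel changes every time a new background is drawn, the only consistent way to match it is to render nothing there and let the background show through.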

hoanhle avatar Aug 10 '24 18:08 hoanhle

transforms.json

My training fails to start with the alpha-channel dataset:
root@024b293b337b:/# ns-train nerfacto --data workspace/data/ --output-dir workspace/outputs/ --pipeline.datamanager.train-num-rays-per-batch 4096 --pipeline.datamanager.eval-num-rays-per-batch 4096 --pipeline.model.background-color "random" --pipeline.model.camera-optimizer.mode "off" --pipeline.datamanager.masks-on-gpu True nerfstudio-data --scale-factor 1.0 --orientation-method none --center-method none --auto-scale-poses False --train-split-fraction 1.0 [20:38:07] Using --data alias for --data.pipeline.datamanager.data train.py:230 ──────────────────────────────────────────────────────── Config ──────────────────────────────────────────────────────── TrainerConfig( _target=<class 'nerfstudio.engine.trainer.Trainer'>, output_dir=PosixPath('workspace/outputs'), method_name='nerfacto', experiment_name=None, project_name='nerfstudio-project', timestamp='2025-09-01_203807', machine=MachineConfig(seed=42, num_devices=1, num_machines=1, machine_rank=0, dist_url='auto', device_type='cuda'), logging=LoggingConfig( relative_log_dir=PosixPath('.'), steps_per_log=10, max_buffer_size=20, local_writer=LocalWriterConfig( _target=<class 'nerfstudio.utils.writer.LocalWriter'>, enable=True, stats_to_track=( <EventName.ITER_TRAIN_TIME: 'Train Iter (time)'>, <EventName.TRAIN_RAYS_PER_SEC: 'Train Rays / Sec'>, <EventName.CURR_TEST_PSNR: 'Test PSNR'>, <EventName.VIS_RAYS_PER_SEC: 'Vis Rays / Sec'>, <EventName.TEST_RAYS_PER_SEC: 'Test Rays / Sec'>, <EventName.ETA: 'ETA (time)'> ), max_log_size=10 ), profiler='basic' ), viewer=ViewerConfig( relative_log_filename='viewer_log_filename.txt', websocket_port=None, websocket_port_default=7007, websocket_host='0.0.0.0', num_rays_per_chunk=32768, max_num_display_images=512, quit_on_train_completion=False, image_format='jpeg', jpeg_quality=75, make_share_url=False, camera_frustum_scale=0.1, default_composite_depth=True ), pipeline=VanillaPipelineConfig( _target=<class 'nerfstudio.pipelines.base_pipeline.VanillaPipeline'>, 
datamanager=ParallelDataManagerConfig( _target=<class 'nerfstudio.data.datamanagers.parallel_datamanager.ParallelDataManager'>, data=PosixPath('workspace/data'), masks_on_gpu=True, images_on_gpu=False, dataparser=NerfstudioDataParserConfig( _target=<class 'nerfstudio.data.dataparsers.nerfstudio_dataparser.Nerfstudio'>, data=PosixPath('.'), scale_factor=1.0, downscale_factor=None, scene_scale=1.0, orientation_method='none', center_method='none', auto_scale_poses=False, eval_mode='fraction', train_split_fraction=1.0, eval_interval=8, depth_unit_scale_factor=0.001, mask_color=None, load_3D_points=False ), train_num_rays_per_batch=4096, train_num_images_to_sample_from=-1, train_num_times_to_repeat_images=-1, eval_num_rays_per_batch=4096, eval_num_images_to_sample_from=-1, eval_num_times_to_repeat_images=-1, eval_image_indices=(0,), collate_fn=<function nerfstudio_collate at 0x7fd62342a200>, camera_res_scale_factor=1.0, patch_size=1, camera_optimizer=None, pixel_sampler=PixelSamplerConfig( _target=<class 'nerfstudio.data.pixel_samplers.PixelSampler'>, num_rays_per_batch=4096, keep_full_image=False, is_equirectangular=False, ignore_mask=False, fisheye_crop_radius=None, rejection_sample_mask=True, max_num_iterations=100 ), num_processes=1, queue_size=2, max_thread_workers=None ), model=NerfactoModelConfig( _target=<class 'nerfstudio.models.nerfacto.NerfactoModel'>, enable_collider=True, collider_params={'near_plane': 2.0, 'far_plane': 6.0}, loss_coefficients={'rgb_loss_coarse': 1.0, 'rgb_loss_fine': 1.0}, eval_num_rays_per_chunk=32768, prompt=None, near_plane=0.05, far_plane=1000.0, background_color='random', hidden_dim=64, hidden_dim_color=64, hidden_dim_transient=64, num_levels=16, base_res=16, max_res=2048, log2_hashmap_size=19, features_per_level=2, num_proposal_samples_per_ray=(256, 96), num_nerf_samples_per_ray=48, proposal_update_every=5, proposal_warmup=5000, num_proposal_iterations=2, use_same_proposal_network=False, proposal_net_args_list=[ {'hidden_dim': 16, 
'log2_hashmap_size': 17, 'num_levels': 5, 'max_res': 128, 'use_linear': False}, {'hidden_dim': 16, 'log2_hashmap_size': 17, 'num_levels': 5, 'max_res': 256, 'use_linear': False} ], proposal_initial_sampler='piecewise', interlevel_loss_mult=1.0, distortion_loss_mult=0.002, orientation_loss_mult=0.0001, pred_normal_loss_mult=0.001, use_proposal_weight_anneal=True, use_appearance_embedding=True, use_average_appearance_embedding=True, proposal_weights_anneal_slope=10.0, proposal_weights_anneal_max_num_iters=1000, use_single_jitter=True, predict_normals=False, disable_scene_contraction=False, use_gradient_scaling=False, implementation='tcnn', appearance_embed_dim=32, average_init_density=0.01, camera_optimizer=CameraOptimizerConfig( _target=<class 'nerfstudio.cameras.camera_optimizers.CameraOptimizer'>, mode='off', trans_l2_penalty=0.01, rot_l2_penalty=0.001, optimizer=None, scheduler=None ) ) ), optimizers={ 'proposal_networks': { 'optimizer': AdamOptimizerConfig( _target=<class 'torch.optim.adam.Adam'>, lr=0.01, eps=1e-15, max_norm=None, weight_decay=0 ), 'scheduler': ExponentialDecaySchedulerConfig( _target=<class 'nerfstudio.engine.schedulers.ExponentialDecayScheduler'>, lr_pre_warmup=1e-08, lr_final=0.0001, warmup_steps=0, max_steps=200000, ramp='cosine' ) }, 'fields': { 'optimizer': AdamOptimizerConfig( _target=<class 'torch.optim.adam.Adam'>, lr=0.01, eps=1e-15, max_norm=None, weight_decay=0 ), 'scheduler': ExponentialDecaySchedulerConfig( _target=<class 'nerfstudio.engine.schedulers.ExponentialDecayScheduler'>, lr_pre_warmup=1e-08, lr_final=0.0001, warmup_steps=0, max_steps=200000, ramp='cosine' ) }, 'camera_opt': { 'optimizer': AdamOptimizerConfig( _target=<class 'torch.optim.adam.Adam'>, lr=0.001, eps=1e-15, max_norm=None, weight_decay=0 ), 'scheduler': ExponentialDecaySchedulerConfig( _target=<class 'nerfstudio.engine.schedulers.ExponentialDecayScheduler'>, lr_pre_warmup=1e-08, lr_final=0.0001, warmup_steps=0, max_steps=5000, ramp='cosine' ) } }, 
vis='viewer', data=PosixPath('workspace/data'), prompt=None, relative_model_dir=PosixPath('nerfstudio_models'), load_scheduler=True, steps_per_save=2000, steps_per_eval_batch=500, steps_per_eval_image=500, steps_per_eval_all_images=25000, max_num_iterations=30000, mixed_precision=True, use_grad_scaler=False, save_only_latest_checkpoint=True, load_dir=None, load_step=None, load_config=None, load_checkpoint=None, log_gradients=False, gradient_accumulation_steps={}, start_paused=False ) ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Saving config to: workspace/outputs/data/nerfacto/2025-09-01_203807/config.yml experiment_config.py:136 Saving checkpoints to: workspace/outputs/data/nerfacto/2025-09-01_203807/nerfstudio_models trainer.py:142 Auto image downscale factor of 1 nerfstudio_dataparser.py:484 Started threads Loading data batch ━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18% 0:00:04╭─────────────── viser ───────────────╮ │ ╷ │ │ HTTP │ http://0.0.0.0:7007 │ │ Websocket │ ws://0.0.0.0:7007 │ │ ╵ │ ╰─────────────────────────────────────╯ Loading data batch ━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━ 41% 0:00:04[NOTE] Not running eval iterations since only viewer is enabled. Use --vis {wandb, tensorboard, viewer+wandb, viewer+tensorboard} to run with eval. No Nerfstudio checkpoint to load, so training from scratch. 
Disabled comet/tensorboard/wandb event writers
Loading data batch ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:07
(viser) Connection opened (0, 1 total), 1096 persistent messages
(viser) Connection closed (0, 0 total)
^CProcess ForkProcess-6:
Process ForkProcess-5:
Process ForkProcess-2:
Process ForkProcess-4:
Process ForkProcess-8:
Process ForkProcess-1:
Process ForkProcess-3:
Process ForkProcess-7:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/nerfstudio/scripts/train.py", line 189, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "/usr/local/lib/python3.10/dist-packages/nerfstudio/scripts/train.py", line 100, in train_loop
    trainer.train()
  File "/usr/local/lib/python3.10/dist-packages/nerfstudio/engine/trainer.py", line 266, in train
    loss, loss_dict, metrics_dict = self.train_iteration(step)
  File "/usr/local/lib/python3.10/dist-packages/nerfstudio/utils/profiler.py", line 111, in inner
    out = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/nerfstudio/engine/trainer.py", line 502, in train_iteration
    _, loss_dict, metrics_dict = self.pipeline.get_train_loss_dict(step=step)
  File "/usr/local/lib/python3.10/dist-packages/nerfstudio/utils/profiler.py", line 111, in inner
    out = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/nerfstudio/pipelines/base_pipeline.py", line 299, in get_train_loss_dict
    ray_bundle, batch = self.datamanager.next_train(step)
  File "/usr/local/lib/python3.10/dist-packages/nerfstudio/data/datamanagers/parallel_datamanager.py", line 291, in next_train
    bundle, batch = self.data_queue.get()
  File "/usr/local/lib/python3.10/dist-packages/multiprocess/queues.py", line 106, in get
    res = self._recv_bytes()
  File "/usr/local/lib/python3.10/dist-packages/multiprocess/connection.py", line 219, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/local/lib/python3.10/dist-packages/multiprocess/connection.py", line 417, in _recv_bytes
    buf = self._recv(4)
  File "/usr/local/lib/python3.10/dist-packages/multiprocess/connection.py", line 382, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt

Printing profiling stats, from longest to shortest duration in seconds
Trainer.train_iteration: 345.9687
VanillaPipeline.get_train_loss_dict: 345.9350

Any suggestions, @hoanhle @biyuefeng?

Sgt-Hashtag avatar Sep 01 '25 21:09 Sgt-Hashtag