LinK
Problems in the Evaluation and Submission of segmentation
Thanks for this excellent piece of work! When I run the evaluation, I get the following error (the first two lines are printed twice):

Warning: could not find environment variable "-x"
mpirun was unable to find the specified executable file, and therefore did not launch the job.
This error was first reported for process rank 0; it may have occurred for other processes as well

I have already run chmod +x evaluate.sh. I would also like to know how to create a symbolic link to the SemanticKITTI dataset if it is stored in a different location. Is the link necessary?
Hi, thanks for your interest. Please check that mpirun is installed correctly according to the instructions. There are several MPI implementations (Open MPI, Intel MPI, MPICH, etc.), and their mpirun commands work in slightly different ways. The symbolic link can be created with ln -s stored/path/of/semantickitti data/semantickitti. Feel free to contact me if you have more questions.
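The two suggestions in the reply above can be sketched as a pair of shell helpers. This is only a sketch under stated assumptions: classify_mpi and link_dataset are hypothetical names that are not part of the repository, and the version strings matched below are the ones these launchers commonly print. For context, -x is the Open MPI option for exporting an environment variable to the launched processes, so one possible reading of the warning is that the mpirun on the PATH comes from a different implementation.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the LinK repository.

# Classify the text printed by `mpirun --version`.
# The matched substrings are assumptions about typical launcher output.
classify_mpi() {
  case "$1" in
    *"Open MPI"*)    echo "openmpi" ;;
    *"Intel"*)       echo "intelmpi" ;;
    *HYDRA*|*MPICH*) echo "mpich" ;;
    *)               echo "unknown" ;;
  esac
}

# Create (or refresh) a symlink DST pointing at the dataset directory SRC.
# SRC is resolved to an absolute path first, because a relative target
# would be interpreted relative to the directory that contains the link.
link_dataset() {
  local src dst="$2"
  src="$(cd "$1" 2>/dev/null && pwd)" || {
    echo "source directory not found: $1" >&2
    return 1
  }
  # Refuse to touch a real file or directory that already sits at DST.
  if [ -e "$dst" ] && [ ! -L "$dst" ]; then
    echo "refusing to replace non-symlink: $dst" >&2
    return 1
  fi
  mkdir -p "$(dirname "$dst")"
  ln -sfn "$src" "$dst"
}
```

A possible use, with the placeholder paths from the reply: classify_mpi "$(mpirun --version 2>&1)" prints which family the launcher appears to belong to, and link_dataset stored/path/of/semantickitti data/semantickitti creates the link.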
Thank you for such a quick reply. I can now run ./evaluate.sh, but it hangs after printing the output below. It does not report an error or exit; it just stops. Have you ever encountered this?

/root/miniconda3/envs/LinK_seg/bin/python evaluate.py --load_path ../checkpoints/max-iou-val.pt
[2024-01-21 08:44:43.634] Experiment started: "runs/run-db770b11".
workers_per_gpu: 2
distributed: True
amp_enabled: False
data:
  num_classes: 20
  ignore_label: 0
  training_size: 19132
train:
  seed: 1588147245
  deterministic: False
dataset:
  name: semantic_kitti
  root: ./data/SemanticKITTI/dataset/sequences
  num_points: 80000
  voxel_size: 0.05
num_epochs: 25
batch_size: 2
model:
  cr: 1.0
  name: linkunet
  base_op: cos_x
  r: 2
  s: 3
  groups: 1
criterion:
  name: lovasz_softmax
  ignore_index: 0
optimizer:
  name: sgd
  lr: 0.24
  weight_decay: 0.0001
  momentum: 0.9
  nesterov: True
scheduler:
  name: cosine_warmup
I have already completed training once, and it did not hang during training.
I have run into this before and found several possible causes. You can check: 1. whether the process is stuck loading data (due to a wrong data path); 2. whether CUDA_VISIBLE_DEVICES is set correctly.
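The two checks suggested above can be run from the shell before launching the evaluation. The following is a minimal sketch, not a definitive diagnostic: check_dataset_root and describe_cuda_devices are hypothetical helper names, and the dataset path in the usage note is the root value from the posted log. It will not detect every cause of a hang.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the LinK repository.

# Check that the dataset root exists and is not empty.
# [ -d ] follows symlinks, so a dangling link is reported as missing.
check_dataset_root() {
  local root="$1"
  if [ ! -d "$root" ]; then
    echo "missing: $root"
    return 1
  fi
  if [ -z "$(ls -A "$root")" ]; then
    echo "empty: $root"
    return 1
  fi
  echo "ok: $root"
}

# Report how CUDA_VISIBLE_DEVICES is currently set.
describe_cuda_devices() {
  if [ -z "${CUDA_VISIBLE_DEVICES+x}" ]; then
    echo "CUDA_VISIBLE_DEVICES is unset (all GPUs visible)"
  elif [ -z "$CUDA_VISIBLE_DEVICES" ]; then
    echo "CUDA_VISIBLE_DEVICES is empty (no GPUs visible)"
  else
    echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
  fi
}
```

For example, check_dataset_root ./data/SemanticKITTI/dataset/sequences followed by describe_cuda_devices, both run from the directory where evaluate.py is started, since the root in the log is a relative path and is resolved against the working directory.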