How to debug CARS if things don't work well
Hi:
This project is awesome. I've encountered some issues and hope you can help.
I am running CARS on a pair of Pleiades stereo images and the result is suboptimal. Here is the CARS result:
[screenshot: CARS result]
Meanwhile, I ran the same Pleiades data through another open-source tool, and its result is as follows:
[screenshot: result from the other tool]
Basically, I would like to pick one piece of software and stick with it, so I would like to understand how to debug CARS in this situation. I am thinking of checking intermediate files, since CARS processes a raster by tiles. Also, I did not run bundle adjustment beforehand; would doing so help?
I am expecting to see some intermediate files created during processing. For example, since a raster is first split into smaller tiles, is it possible to visualize the tiled raster to see what is going on, or to check the SIFT matching points?
Thank you.
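(For readers following along: the tiling the question refers to can be illustrated independently of CARS internals. A raster is split into fixed-size windows and each window is processed as one tile. The sketch below is a generic illustration, not CARS code; the raster dimensions and tile size are hypothetical examples.)

```python
# Generic sketch of raster tiling (not CARS internals): split a raster of
# width x height pixels into fixed-size windows. The last row/column of
# tiles is clipped to the raster boundary.

def tile_grid(width, height, tile_size):
    """Return (col_off, row_off, tile_width, tile_height) windows covering the raster."""
    tiles = []
    for row_off in range(0, height, tile_size):
        for col_off in range(0, width, tile_size):
            tiles.append((
                col_off,
                row_off,
                min(tile_size, width - col_off),
                min(tile_size, height - row_off),
            ))
    return tiles

if __name__ == "__main__":
    grid = tile_grid(1000, 700, 512)
    print(len(grid))   # 4 tiles: 2 columns x 2 rows
    print(grid[-1])    # (512, 512, 488, 188) -- the clipped bottom-right tile
```

Plotting these windows over the input raster is one way to "see" the tiling while debugging.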
Here is the traceback:
2022-08-12 10:27:18,991 - distributed.nanny - WARNING - Worker process still alive after 3.9999990463256836 seconds, killing
2022-08-12 10:27:19,043 - distributed.nanny - WARNING - Worker process still alive after 3.999998474121094 seconds, killing
[... the same warning repeats for each remaining worker ...]
2022-08-12 10:27:20,542 - tornado.application - ERROR - Exception in callback functools.partial(<bound method AsyncProcess._on_exit of <AsyncProcess Dask Worker process (from Nanny)>>, -15)
Traceback (most recent call last):
File "/cars/venv/lib/python3.8/site-packages/tornado/ioloop.py", line 741, in _run_callback
ret = callback()
File "/cars/venv/lib/python3.8/site-packages/distributed/process.py", line 139, in _on_exit
self._exit_callback(self)
File "/cars/venv/lib/python3.8/site-packages/distributed/nanny.py", line 695, in _on_exit
self.mark_stopped()
File "/cars/venv/lib/python3.8/site-packages/distributed/nanny.py", line 733, in mark_stopped
self.on_exit(r)
File "/cars/venv/lib/python3.8/site-packages/distributed/nanny.py", line 501, in _on_exit_sync
self._ongoing_background_tasks.call_soon(self._on_exit, exitcode)
File "/cars/venv/lib/python3.8/site-packages/distributed/core.py", line 190, in call_soon
raise AsyncTaskGroupClosedError(
distributed.core.AsyncTaskGroupClosedError: Cannot schedule a new coroutine function as the group is already closed.
[... the same traceback is emitted a second time for another worker ...]
And in compute_dsm_outdir/22-08-12_10h00m_compute_dsm.log:
22-08-12 10:27:14 :: ERROR :: CARS terminated with following error
Traceback (most recent call last):
File "/cars/cars/cars.py", line 748, in main_cli
run_compute_dsm(args, dry_run)
File "/cars/cars/cars.py", line 680, in run_compute_dsm
compute_dsm.run(
File "/cars/cars/pipelines/compute_dsm.py", line 1172, in run
write_dsm.write_geotiff_dsm(
File "/cars/cars/pipelines/write_dsm.py", line 324, in write_geotiff_dsm
for future, raster_tile in tqdm(
File "/cars/venv/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
for obj in iterable:
File "/cars/venv/lib/python3.8/site-packages/distributed/client.py", line 4883, in __next__
return self._get_and_raise()
File "/cars/venv/lib/python3.8/site-packages/distributed/client.py", line 4872, in _get_and_raise
raise exc.with_traceback(tb)
distributed.scheduler.KilledWorker: ('images_pair_to_3d_points-13efe864-f5ec-4d74-a056-0964cffb9b57', <WorkerState 'tcp://127.0.0.1:38559', name: 40, status: closed, memory: 0, processing: 11>)
Hi @lionlai1989,
I think your issue is due to instabilities in dask mode. I would recommend using the multiprocessing mode of the compute_dsm pipeline:
cars compute_dsm --mode mp --nb_workers 4
Also, we will soon present CARS 0.5.0 (be careful: this version changes the API and configuration) at a FOSS4G workshop, with information you might be interested in: https://talks.osgeo.org/foss4g-2022-workshops/talk/SFMBCR/. The materials will be available on GitHub.
We will keep you informed, and you can also send me an e-mail for more information: [email protected]. Regards, David
Hey @dyoussef:
Thanks for your reply. I am running CARS with the new API, but I found that the new CLI no longer accepts the --mode mp or --nb_workers 4 arguments. How can users provide these arguments with the new API?
The error:
usage: cars [-h] [--loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--version]
conf
cars: error: unrecognized arguments: --nb_workers --mode mp /data/cars_duesseldorf_config.json
So I ran the following command with the config below, without any extra arguments:
docker run -w /data -v /home/ubuntu/data:/data cnes/cars /data/cars_duesseldorf_config.json
{
"inputs": {
"sensors": {
"one": {
"image": "Duesseldorf/IMG_PHR1A_P_001/IMG_PHR1A_P_201807271102180_SEN_3385144101-1_R1C1.JP2",
"geomodel": "Duesseldorf/IMG_PHR1A_P_001/RPC_PHR1A_P_201807271102180_SEN_3385144101-1.XML",
"no_data": 0
},
"two": {
"image": "Duesseldorf/IMG_PHR1A_P_002/IMG_PHR1A_P_201807271101330_SEN_3385144101-1_R1C1.JP2",
"geomodel": "Duesseldorf/IMG_PHR1A_P_002/RPC_PHR1A_P_201807271101330_SEN_3385144101-1.XML",
"no_data": 0
}
},
"pairing": [
[
"one",
"two"
]
],
"initial_elevation": "srtm_dir"
},
"output": {
"out_dir": "outresults"
}
}
Thank you.
Hi @lionlai1989, with the new API you have to provide all the arguments through the config.json file:
{
"inputs": {
"sensors": {
"one": {
"image": "Duesseldorf/IMG_PHR1A_P_001/IMG_PHR1A_P_201807271102180_SEN_3385144101-1_R1C1.JP2",
"geomodel": "Duesseldorf/IMG_PHR1A_P_001/RPC_PHR1A_P_201807271102180_SEN_3385144101-1.XML",
"no_data": 0
},
"two": {
"image": "Duesseldorf/IMG_PHR1A_P_002/IMG_PHR1A_P_201807271101330_SEN_3385144101-1_R1C1.JP2",
"geomodel": "Duesseldorf/IMG_PHR1A_P_002/RPC_PHR1A_P_201807271101330_SEN_3385144101-1.XML",
"no_data": 0
}
},
"pairing": [
[
"one",
"two"
]
],
"initial_elevation": "srtm_dir"
},
"output": {
"out_dir": "outresults"
},
"orchestrator": {
"mode": "mp",
"nb_workers": 4
}
}
(cf. https://cars.readthedocs.io/en/latest/user_guide/configuration.html#orchestrator)
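If you prefer to generate this file from a script rather than edit it by hand, a small sketch like the following keeps the orchestrator section (the replacement for the old --mode/--nb_workers flags) together with the rest of the configuration. Paths are the ones from this thread; adjust them to your layout.

```python
import json

# Sketch: build the CARS 0.5 configuration programmatically. With the new
# API the old --mode/--nb_workers CLI flags move into the "orchestrator"
# section of the JSON configuration file.

config = {
    "inputs": {
        "sensors": {
            "one": {
                "image": "Duesseldorf/IMG_PHR1A_P_001/IMG_PHR1A_P_201807271102180_SEN_3385144101-1_R1C1.JP2",
                "geomodel": "Duesseldorf/IMG_PHR1A_P_001/RPC_PHR1A_P_201807271102180_SEN_3385144101-1.XML",
                "no_data": 0,
            },
            "two": {
                "image": "Duesseldorf/IMG_PHR1A_P_002/IMG_PHR1A_P_201807271101330_SEN_3385144101-1_R1C1.JP2",
                "geomodel": "Duesseldorf/IMG_PHR1A_P_002/RPC_PHR1A_P_201807271101330_SEN_3385144101-1.XML",
                "no_data": 0,
            },
        },
        "pairing": [["one", "two"]],
        "initial_elevation": "srtm_dir",
    },
    "output": {"out_dir": "outresults"},
    "orchestrator": {"mode": "mp", "nb_workers": 4},
}

if __name__ == "__main__":
    # Write the file that is then passed to `cars /data/cars_duesseldorf_config.json`.
    with open("cars_duesseldorf_config.json", "w") as f:
        json.dump(config, f, indent=4)
```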
Regards, David
Hi @dyoussef
I've tried your advice, but there is another error and the program just hangs without doing any work (observed via htop), so eventually I have to press CTRL+C to terminate it. Here are the step-by-step outputs:
- Run the program; it shows the following error:
$ docker run -w /data -v /home/ubuntu/cars/data:/data cnes/cars /data/cars_duesseldorf_config.json
Computing epipolar grids ...: 100% [**************************************************] (5s)
Processing Tiles : [ epi_matches_left ] ...: 0%| | 0/1886 [00:00<?, ?it/s]
2022-08-24 07:31:47 (WARNING) EpipolarSparseMatching: Warning: tile ignored because of the following error itk::ExceptionObject (0x7f989c5b3a80)
Location: "unknown"
File: /usr/include/ITK-4.13/itkEuclideanDistanceMetric.hxx
Line: 64
Description: itk::ERROR: EuclideanDistanceMetric(0x7f989c0b7cc0): The two measurement vectors have unequal size (128 and 2618117160)
2022-08-24 07:51:42 (WARNING) EpipolarSparseMatching: Warning: tile ignored because of the following error
itk::ExceptionObject (0x7f989c40c2b0)
Location: "unknown"
File: /usr/include/ITK-4.13/itkEuclideanDistanceMetric.hxx
Line: 64
Description: itk::ERROR: EuclideanDistanceMetric(0x7f989c637a30): The two measurement vectors have unequal size (128 and 1181730355)
Processing Tiles : [ epi_matches_left ] ...: 100%|██████████| 1886/1886 [29:28<00:00, 1.07it/s]
Processing Tiles : [ color , dsm ] ...: 0%| | 0/420 [00:00<?, ?it/s]
Terminating: fork() called from a process already using GNU OpenMP, this is unsafe.
[... the same message repeats about twenty times ...]
The program hangs at this point. I observe from htop that no CPU or memory resources are being used.
- Press CTRL+C. The output:
Terminating: fork() called from a process already using GNU OpenMP, this is unsafe.
^C <-- (CTRL+C IS HERE)
22-08-24 13:45:05 :: ERROR :: CARS terminated with following error
Traceback (most recent call last):
File "/cars/venv/lib/python3.8/site-packages/cars/cars.py", line 181, in main_cli
used_pipeline.run()
File "/cars/venv/lib/python3.8/site-packages/cars/pipelines/sensor_to_full_resolution_dsm/sensor_to_full_resolution_dsm_pipeline.py", line 558, in run
_ = self.rasterization_application.run(
File "/cars/venv/lib/python3.8/site-packages/cars/orchestrator/orchestrator.py", line 314, in __exit__
self.compute_futures()
File "/cars/venv/lib/python3.8/site-packages/cars/orchestrator/orchestrator.py", line 253, in compute_futures
for future_obj in self.cluster.future_iterator(future_objects):
File "/cars/venv/lib/python3.8/site-packages/cars/orchestrator/cluster/multiprocessing_cluster.py", line 694, in __next__
if item.ready():
KeyboardInterrupt
Processing Tiles : [ color , dsm ] ...: 0%| | 0/420 [5:52:10<?, ?it/s]
Do you know what might be the problem? Thank you.
The config file:
{
"inputs": {
"sensors": {
"one": {
"image": "Duesseldorf/IMG_PHR1A_P_001/IMG_PHR1A_P_201807271102180_SEN_3385144101-1_R1C1.JP2",
"geomodel": "Duesseldorf/IMG_PHR1A_P_001/RPC_PHR1A_P_201807271102180_SEN_3385144101-1.XML",
"no_data": 0
},
"two": {
"image": "Duesseldorf/IMG_PHR1A_P_002/IMG_PHR1A_P_201807271101330_SEN_3385144101-1_R1C1.JP2",
"geomodel": "Duesseldorf/IMG_PHR1A_P_002/RPC_PHR1A_P_201807271101330_SEN_3385144101-1.XML",
"no_data": 0
}
},
"pairing": [
[
"one",
"two"
]
],
"initial_elevation": "srtm_dir"
},
"output": {
"out_dir": "outresults"
},
"orchestrator": {
"mode": "mp",
"nb_workers": 16
}
}
The specs of my laptop:
- 16 CPUs
- 64 GB memory
Docker image: the latest, 0.5.0
Hi @lionlai1989,
Most of the people on the team are on vacation or at FOSS4G this week (me included), so we will analyse your issue next week.
We will keep you informed. Regards, David
Hi @lionlai1989,
https://github.com/CNES/cars/commit/5dd71e60fa8b0006b2a09ce11edc101799be3982 should fix the problem. Can you "docker pull" cnes/cars:0.5.1 and try again?
Thanks in advance, David
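(For background on the error above: "fork() called from a process already using GNU OpenMP, this is unsafe" means worker processes were created with fork() after the parent had already initialised an OpenMP runtime, e.g. through a compiled extension; forking such a process can deadlock. A generic mitigation, sketched below, is to start workers with Python's "spawn" method so each one begins as a fresh interpreter. This is an illustration of the general issue, not necessarily what the linked commit does.)

```python
import multiprocessing as mp

# Generic sketch: avoid fork()-after-OpenMP by using the "spawn" start
# method, which launches each worker as a fresh interpreter instead of
# duplicating the parent's (possibly OpenMP-initialised) process image.

def square(x):
    return x * x

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # fresh interpreter per worker, no fork()
    with ctx.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```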
Hi David:
I've tried it again. The result looks great. Thanks a lot.