Issue with 'prioritize_capture_over_reprocess' option and behavior change
I just noticed that an option "prioritize_capture_over_reprocess" was introduced (in July/2024), and it has default value set to "false". I expected it to be "true" by default, since the behavior used to be "continue capturing if it's time to capture and process unprocessed folder later". Was it intentional for some reason?
Also, if RMS is restarted after 4 hours of capture, it reprocesses those 4 hours of captures when it comes back instead of continuing the previous capture session. As an example, I rebooted a server responsible for 4 cameras around 03:30 UTC, and 45 minutes later the four stations are still reprocessing the first part of the night and have not started capturing. Log from one station:
2025/02/10 03:30:39-INFO-StartCapture-line:1086 - Next start time: True UTC
2025/02/10 03:30:39-DEBUG-StartCapture-line:767 - Checking for folders containing partially-processed data
2025/02/10 03:30:40-DEBUG-StartCapture-line:792 - Checking folder: BR000M_20250126_222641_283847
2025/02/10 03:30:40-DEBUG-StartCapture-line:838 - ... fully processed!
2025/02/10 03:30:40-DEBUG-StartCapture-line:792 - Checking folder: BR000M_20250127_222620_353620
2025/02/10 03:30:40-DEBUG-StartCapture-line:838 - ... fully processed!
2025/02/10 03:30:40-DEBUG-StartCapture-line:792 - Checking folder: BR000M_20250206_032543_408652
2025/02/10 03:30:40-DEBUG-StartCapture-line:838 - ... fully processed!
2025/02/10 03:30:40-DEBUG-StartCapture-line:792 - Checking folder: BR000M_20250207_222050_684633
2025/02/10 03:30:40-DEBUG-StartCapture-line:838 - ... fully processed!
2025/02/10 03:30:40-DEBUG-StartCapture-line:792 - Checking folder: BR000M_20250208_021456_565101
2025/02/10 03:30:40-DEBUG-StartCapture-line:838 - ... fully processed!
2025/02/10 03:30:40-DEBUG-StartCapture-line:792 - Checking folder: BR000M_20250209_234320_627108
2025/02/10 03:30:40-INFO-StartCapture-line:842 - Found partially-processed data in /root/RMS_data/BR000M/CapturedFiles/BR000M_20250209_234320_627108
2025/02/10 03:30:40-INFO-DetectStarsAndMeteors-line:240 - Starting detection...
2025/02/10 03:30:54-INFO-QueuedPool-line:204 - Loaded 1232 backed up results...
2025/02/10 03:30:54-INFO-QueuedPool-line:204 - Using 11 cores
.
.
.
.
.
2025/02/10 04:15:18-INFO-QueuedPool-line:204 - -----
2025/02/10 04:15:18-INFO-QueuedPool-line:204 - Cores in use: 11
2025/02/10 04:15:18-INFO-QueuedPool-line:204 - Active worker threads: 11
2025/02/10 04:15:18-INFO-QueuedPool-line:204 - Idle worker threads: 0
2025/02/10 04:15:18-INFO-QueuedPool-line:204 - Total jobs: 1118
2025/02/10 04:15:18-INFO-QueuedPool-line:204 - Finished jobs: 735
.
.
Even with "prioritize_capture_over_reprocess=false", shouldn't the current night continue after a reboot?
I'm not sure why the default is false - I also think it should be true by default. Can you confirm that the 45 minutes (and counting) of processing is happening with prioritize_capture_over_reprocess: false? If so, that would be the expected behavior. Luc
Yes, it is with "prioritize_capture_over_reprocess: false", but it isn't a previous night's folder, it's the current night's folder. I expect RMS to continue the capture session after a reboot instead of reprocessing a night that hasn't finished yet.
RMS used to have a time limit to decide whether to continue the previous capture session or "start" a new one. After a quick reboot it used to load the previous files, add them to the detection queue and continue capturing right away to minimize the gap in the night's data. As it works today, reprocessing after a reboot on a Raspberry Pi 3 in the middle of the night will take the rest of the night...
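To illustrate the kind of time-limit check I mean, here is a minimal sketch. It is not the actual RMS code: the function name `should_resume_session`, its parameters, and the one-hour `RESUME_WINDOW` are all assumptions made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: a gap shorter than this still counts as the same night.
RESUME_WINDOW = timedelta(hours=1)


def should_resume_session(last_file_time, now, capture_end,
                          resume_window=RESUME_WINDOW):
    """Return True if capture should continue the interrupted session.

    last_file_time: timestamp of the newest file in the latest night folder
    now:            current time
    capture_end:    planned end of tonight's capture
    """
    # The night is already over, so there is nothing left to resume.
    if now >= capture_end:
        return False
    # A short gap means a quick reboot: keep capturing, reprocess later.
    return (now - last_file_time) <= resume_window


# Example: last file written at 03:25 UTC, RMS back up at 03:30 UTC.
last_file = datetime(2025, 2, 10, 3, 25, tzinfo=timezone.utc)
now = datetime(2025, 2, 10, 3, 30, tzinfo=timezone.utc)
capture_end = datetime(2025, 2, 10, 8, 0, tzinfo=timezone.utc)
print(should_resume_session(last_file, now, capture_end))
```

With a check like this the detection of the earlier files can be queued in the background while capture resumes immediately.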
I think it is something around here.
This is the latest version of the code
https://github.com/CroatianMeteorNetwork/RMS/blob/0e75238ff218b382869d24e85393cd437d797ef9/RMS/StartCapture.py#L1184
It used to be wrapped in the following conditional, which prevented running if capture should be in progress.
https://github.com/CroatianMeteorNetwork/RMS/blame/1c30fe4dfebae30cdc54a0bbdddb5e1d26ff9a7e/RMS/StartCapture.py#L1145
@MaadhyamRana, could you take a look at this issue? Thank you.
"I just noticed that an option "prioritize_capture_over_reprocess" was introduced (in July/2024), and it has default value set to "false". I expected it to be "true" by default,"
It was set to default to false so that the new code would not change the behaviour of stations, unless people expressly wanted the change to be made.
Actually, the behavior used to be to always prioritize captures. The automatic reprocessing of unprocessed folders didn't run during capture time or within a few hours before the capture start time. At least, I made sure this was the behavior when I added auto reprocess 5 years ago.
I didn't notice the behavior change until now, but I think it should be revisited. As I explained, with "false", when a Raspberry Pi reboots in the middle of the night it will decline to capture and might spend the remaining night time processing the first part of the night or previously unprocessed folders. I think it should prefer to collect data by default, as those folders can be processed later during the day.
https://github.com/CroatianMeteorNetwork/RMS/issues/171#issue-1708747011
There are two separate pieces of code here.
The prioritize_capture_over_reprocess only applies when broken directories are detected on start and capture should not have started. This option prevents reprocessing of multiple directories taking many hours, or even days to complete, because with this enabled, no more directories will be started if capture should be running. Before, once the reprocess loop started, it would reprocess until all folders had been fixed, preventing capture from starting. I don't think the prioritize_capture_over_reprocess option is causing the reprocessing here.
A recent change has removed the check to prevent reprocessing of a broken directory starting when capture should already be in progress, and I am seeing this problem also, and think it should be revisited. At night, I don't want reprocessing, I want capturing.
https://github.com/CroatianMeteorNetwork/RMS/blob/1c30fe4dfebae30cdc54a0bbdddb5e1d26ff9a7e/RMS/StartCapture.py#L1145
This check seems to have been removed in a recent commit. I'm happy to submit a PR to fix this, probably in the next 36 hours, but I'd like to understand why this check was removed.
The check regarding that conditional was moved in #474. The goal was that when continuous_capture: true, any leftover processing (if config.auto_reprocess is set) would occur before capture start.
Disabling config.auto_reprocess would temporarily resolve this issue, though I'm currently looking into restoring the code so that it works as desired in both continuous and standard (night only) capture modes.
When config.auto_reprocess is set, processIncompleteCaptures is called. This function goes over captured night directories and reprocesses incomplete ones. After each directory that is processed, there is a check to see if capture should have started as shown below. This is where 'prioritize_capture_over_reprocess' comes in - as an exit condition for this function:
https://github.com/CroatianMeteorNetwork/RMS/blob/8726b76fdfa66902f030b63cc0c8c475119e92a7/RMS/StartCapture.py#L871-L878
Note that this check is done after reprocessing at least one incomplete night directory, leading to the behavior we're seeing in this issue here.
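The exit condition described above can be sketched roughly as follows. This is a simplified illustration, not the real processIncompleteCaptures code: the function name `process_incomplete`, its parameters, and both callbacks are hypothetical.

```python
def process_incomplete(dirs, reprocess, capture_should_run, prioritize_capture):
    """Reprocess incomplete night directories in order.

    dirs:               incomplete night directories to work through
    reprocess:          callable that reprocesses one directory
    capture_should_run: callable returning True if capture should be running
    prioritize_capture: stands in for prioritize_capture_over_reprocess
    Returns the list of directories that were actually reprocessed.
    """
    handled = []
    for night_dir in dirs:
        reprocess(night_dir)
        handled.append(night_dir)
        # The exit check only happens AFTER a directory has been reprocessed,
        # so at least one directory is always handled, even at night.
        if prioritize_capture and capture_should_run():
            break
    return handled


dirs = ["night_a", "night_b", "night_c"]
print(process_incomplete(dirs, lambda d: None, lambda: True, True))
print(process_incomplete(dirs, lambda d: None, lambda: True, False))
```

With the flag enabled and capture due, only the first directory is reprocessed before the loop exits; with the flag disabled, all three are reprocessed regardless of capture time.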
@MaadhyamRana, right, so is "prioritize_capture_over_reprocess" a flag meant to be used only with "continuous_capture: true", or designed with that mode in mind? If so, I now understand the reason behind "prioritize_capture_over_reprocess: false" by default: in continuous capture mode there is no stop time and restart during which to reprocess the remaining folders. If that's the case, the flag name is misleading and conflicts with "auto_reprocess" and the previously existing behavior.
I think it could be simplified. I'd suggest removing the flag "prioritize_capture_over_reprocess", restoring the original behavior (@g7gpr pointed out what seems to be a regression) and following this behavior table:
| continuous_capture | auto_reprocess | Expected Behavior |
|---|---|---|
| false | false | never reprocess incomplete folders |
| false | true | reprocess capture folders when RMS is idle, neither capturing nor near capture start (default setting) |
| true | false | never reprocess incomplete folders |
| true | true | reprocess capture folders during startup, at the first chance, before starting a session, no matter if it's day or night (or maybe prioritize captures if it's night, but that looks complex: RMS would need to schedule a stop during daylight to reprocess incomplete folders) |
The "continuous_capture: true" users would achieve the currently non-default "prioritize_capture_over_reprocess: true" behavior by using "auto_reprocess: false".
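As a rough sketch, the table reduces to a small decision function. The function name `should_reprocess_now` and the `capturing_or_near_start` argument are made up for illustration and do not exist in RMS.

```python
def should_reprocess_now(continuous_capture, auto_reprocess,
                         capturing_or_near_start):
    """Decide whether incomplete folders may be reprocessed right now.

    capturing_or_near_start: True if RMS is capturing or close to the
    capture start time (hypothetical input, computed elsewhere).
    """
    # Rows 1 and 3: auto_reprocess disabled means never reprocess.
    if not auto_reprocess:
        return False
    # Row 4: continuous capture reprocesses at startup, day or night.
    if continuous_capture:
        return True
    # Row 2: night-only capture reprocesses only while RMS is idle.
    return not capturing_or_near_start


# Night-only station rebooted in the middle of the night: capture wins.
print(should_reprocess_now(False, True, True))
```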
Does it make sense?
It makes sense, however, there needs to be a check after reprocessing each directory that capture is not about to start, rather than one check before starting to reprocess all directories.
This is to prevent multiple corrupt directories preventing capture from running.
prioritize_capture_over_reprocess: false is the default, because it retains the original behaviour. Once reprocessing of incompletely processed directories starts, all incompletely processed directories will be processed, even if this prevents capture from starting.
Setting prioritize_capture_over_reprocess: true means that after each incompletely processed directory has been processed a check is made to see if capture should have started. If it should have done, then no more incompletely processed directories are processed and capture starts. More incompletely processed directories are processed the next day.
I do think there is scope for making this default to true, but this would have changed the behaviour of the station without any input from the operator, which is why I set the default as false.
Can you please look at 524 and make it true for this case? The -r switch can only be used intentionally and because the capture resume is needed for whatever reason. Therefore the prior reprocess would negate the resume.
@satmonkey, I agree, but this is a different piece of code to prioritize_capture_over_reprocess. I can look at it, but I think it's already been worked on. However if someone wants to assign this to me, then I'll gladly take it on.
I think this has been resolved, could we close this issue?