reduce risk of corrupt radio.yml #1412
Fixes #1412
Summary of changes: Adds CRC16 checksum to radio.yml when saved. Adds save to temporary file, folowed by rename to radio.yml Adds check on load, with fall-back to temporary file
I guess the test will also fail on the coming updated PR : https://github.com/EdgeTX/edgetx/issues/1412
Should I modify the radio.config used for the tests? Or is that provided by the test action setup? In any case, if neither checksum nor "manuallyEdited" is present in the file, loading will result in an alert - that causes the test to fail. That needs to be addressed somehow :-)
Changed to use an enum class for the checksum return values.
Just out of curiosity.
How does it prevent file corruption? AS far as i unterstand it should error out only on missing Checksum, right? If checksum is present but changed, how do we know that the user didn´t change anything? And how do we know that the content of the yml and therefore the new checksum is correct before we overwrite the old probably correct file?.
i´m probably missing something, really just curious
It tries to detect if radio.yml was not fully written to flash, and then reverts to the alternative file. When radio.yml is written it is first saved under the alternative name, then renamed. So the checksum allows to detect corruption, and gives the SW one shot in reverting to a valid file.
For the user editing manually, there is a YAML variable (manuellyEdited) that makes the SW skip checksum check at load.
yeah that´s the part i get. but the checksum would also change if you change something in radio configuration e.g not manually editing the file. Wouldn´t the checksum check fail at that point? or how do we check if the file is corrupted this time. I mean if you change somethin in the radio config and a write is triggered, And unlucky as we are the write is corrupted. How does the checksum prevent us from overwriting the old file. SInce as far as i understand it can´t know that the new checksum is not correct ?
It writes the new configuration to a different name and only after completely writing, the old file is replaced. So that, at all times, a good working config file should be there.
yeah but how do we know it´s a good working file.
If we change the radio config the checksum changes aswell if it writes the temp file and the write is corrupted it won´t realise the file is corrupted because its valid that the checksum missmatches because we changed some config entrys. And will probably overwrite the non corrupted file.
The checksum is regenerated with every config change and written as part of the file. It is not static. A config file must either contain the manually edited flag or a correct checksum, that matches the file contents.
so the checksum is generated before we write and we would know if the write corrupted if we check the file content checksum before we rename.
ok get it. thanks for the clarification 👍
We don't do a finishing read-back before rename at this point, but that could easily be added. However, do we want to make save take double the time? On all radios?
I do not think, that we need this. If someone wnats to have additional safety here, I would prefer to reneame the old file and then the new file after each write, so that we have the old one as a backup. But the cases where the config is corrupted should be really rare, except when debugging and constantly resetting the radio. In my opinion we are good.
I will test this PR at the weekend
-rw-r--r-- 1 xxx xxx 2900 Jan 28 2023 radio_error.yml
-rw-r--r-- 1 xxx xxx 2933 Jan 29 2022 radio_new.yml
-rw-r--r-- 1 xxx xxx 2329 Jan 29 2022 radio.yml
I get this after calibration. The old config is still there and the new one is written, but not renamed. Do I have to do something speacial?
Ignore the strange dates, please
Did you test this on a reald radio? Maybe renaming when there already is a file does not work.
Nice find - thanks.
I am currently down with Corona - hopefully I am back in cation within a few days :-/
I am currently down with Corona - hopefully I am back in cation within a few days :-/
I hope you wil recover quick and without any permanent problems. There is no time preasure, so dion't worry about this.
I am currently down with Corona - hopefully I am back in cation within a few days :-/
Take care mate!
I will take a look and test it within the next days. Thank you very much for your work.
This works much better now. One thing I found is, that an old config file will trigger an unrecoverable error and reset the configuration. It probably make sense to check, if there is a checksum at the beginning, and if not, but the file has a reasonable size, load it anyways. Or we need a new storage file version to detect this. @pfeerick, @raphaelcoeffic what do you think?
also the companion needs to be checksum aware
@stephenaa I rebased the PR
Should I add checksum support to companion?
I see the issue with old config files, but I am afraid it is non-trivial to set up rules to detect if the config file has been corrupted and not just an old file.If we have different version (and do a conversion?) the checksum of course should just be added when that is performed. Anyhow, this is just a problem just after update - is there another way of detecting that it is the first load of radio settings after a SW upgrade?
It would be great if you could add this to companion. In the first step I forgot, because I don't really use companion.
I think the storage version is the only way to detect if the file needs to be updated
It would be great if you could add this to companion. In the first step I forgot, because I don't really use companion.
I think the storage version is the only way to detect if the file needs to be updated
I'am on it :-) I was wondering, what should companion behaviour be if a radio.yml is opened, but checksum is wrong? Can we rely on companion to create a valid settings file from scratch? This overlaps a bit with #1521 IMHO.
For the other issue I have been thinking a bit more. Absence of a value "checksum" is only a problem if it is not on the first line, that is - if there is no 'checksum' value, and there is something sensible in the file (and the length is realistic as you suggested earlier) the file is probably (nearly 100%) an old file from before the checksum check was added.
I found a way around the newline issue. In reality it is only needed when ignoring the checksum field added as the first line of the file. By rearranging the logic I managed to minimize the amount of code exposed to the variability.
Also changed the radio firmware to accept files with no checksum field if their size is >25 bytes. I can't really see a better way right now to qualify an "old file", that is a file from before the checksum field was added. Maybe it should be kept in mind for a future file version version bump?
Also changed the radio firmware to accept files with no checksum field if their size is >25 bytes. I can't really see a better way right now to qualify an "old file", that is a file from before the checksum field was added. Maybe it should be kept in mind for a future file version version bump?
I think the limit could be a lot higher than 25 bytes. Other than that, the code looks good, I will test it tomorrow.
Keep in mind that the same code reads model YAMl, so the minimum size could have to be the minimum model file if we are to extend checksum to models.
Sendt fra min iPhone
Den 19. feb. 2022 kl. 20.08 skrev Malte Langermann @.***>:
Also changed the radio firmware to accept files with no checksum field if their size is >25 bytes. I can't really see a better way right now to qualify an "old file", that is a file from before the checksum field was added. Maybe it should be kept in mind for a future file version version bump?
I think the limit could be a lot higher than 25 bytes. Other than that, the code looks good, I will test it tomorrow.
— Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. You are receiving this because you were mentioned.
I think the minimum model ist larger, but that as a question for @raphaelcoeffic . Reagrading checksums on models, I am not sure. Models are not written as often. On the other hand a warning, when a model might be corrupted doesnÄt sound too bad. But this would be a task for another PR.
I think the minimum model ist larger, but that as a question for @raphaelcoeffic . Reagrading checksums on models, I am not sure. Models are not written as often. On the other hand a warning, when a model might be corrupted doesnÄt sound too bad. But this would be a task for another PR.
Let's leave the model part for now. Since the function is not called with a pointer for the CRC in case of model files it is a non-issue for now.
I found an issue. When there is no radio.yml, but a radio.bin and a radio_new.yml a new bin->yaml conversion is started
So that conversion is triggered before reading of radio.yml of course. The check should probarbly include “radio_new.yml” as Well?
Sendt fra min iPhone
Den 20. feb. 2022 kl. 17.06 skrev Malte Langermann @.***>:
I found an issue. When there is no radio.yml, but a radio.bin and a radio_new.yml a new bin->yaml conversion is started
— Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. You are receiving this because you were mentioned.
So that conversion is triggered before reading of radio.yml of course. The check should probarbly include “radio_new.yml” as Well?
Yes, that should fix that problem.