Starting Address record line is always the first line, but in the original file it was the last line
I am testing this module for some production/testing firmware automation and I am super pleased! I am noticing that when I encode a hex file to a dict with this module, then fetch and re-save it to a hex file (the data is stored in a database as a JSON record), one line gets moved in the file, which causes my MD5 checksums not to match. While I am fairly certain the structure of an Intel HEX file prevents this from being an issue, it would be nice to understand how this happens.
Encode to database:
file_name = file_path.split('/')[-1]
with open(file_path, 'rb') as file:
    md5 = hashlib.md5(
        file.read()
    ).hexdigest()
intel_hex = IntelHex()
intel_hex.loadhex(file_path)
if image_type.lower() in ['mfg', 'manufacturing']:
    self.manufacturing_fw = intel_hex.todict()
    self.manufacturing_fw_name = file_name
    self.manufacturing_md5 = md5
Decode From database:
if image_type.lower() in ['mfg', 'manufacturing']:
    output_file_name = f'{out_path}/{self.part_number}-{self.manufacturing_fw_name}'
    firmware_data = self.manufacturing_fw
data = {}
for key, value in firmware_data.items():
    if key != 'start_addr':
        key = int(key)
    data[key] = value
intel_hex = IntelHex()
intel_hex.fromdict(data)
intel_hex.write_hex_file(output_file_name)
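For readers wondering why the decode step converts keys with int(): JSON object keys are always strings, so the integer addresses in the dict come back as text after a round trip through a JSON record. The sketch below shows this with a small hand-made dict. The dict contents (including the 'EIP' entry) and the restore_keys helper are illustrative assumptions, not output from a real firmware image.

```python
import json

# Hypothetical stand-in for a dict produced by todict():
# integer addresses map to byte values, plus a 'start_addr' entry.
original = {0: 0x01, 1: 0x02, 'start_addr': {'EIP': 0x08000131}}

# Simulate storing the dict as a JSON record and reading it back.
stored = json.loads(json.dumps(original))


def restore_keys(record):
    """Convert address keys back to int, leaving 'start_addr' untouched."""
    return {(key if key == 'start_addr' else int(key)): value
            for key, value in record.items()}


restored = restore_keys(stored)
print(sorted(stored))        # address keys are now strings
print(restored == original)  # integer keys restored
```

This only affects dictionary keys. It has nothing to do with the order in which records are written to the exported file.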
I need to add these lines upon export to re-arrange the lines to get the exported file to match the original exactly. Not efficient or the minimum solution, but it works with everything I have tested so far.
file = open(output_file_name, 'r')
lines = file.readlines()
file.close()
output = open(output_file_name, 'w')
total_lines = len(lines)
last_line = lines[-1]
lines.append(lines[-1])
lines[0], lines[total_lines-1] = lines[total_lines-1], lines[0]
lines.append(last_line)
for line in lines[1:-1]:
    output.write(line)
output.close()
I appreciate any help with this!
Can you provide an example of a file before encode and after decode?
If I understood correctly (from the "fix" code), to get your image written the way you want, you must move the first line to the second-to-last position. I'm guessing that this is a Starting Address record. The Intel HEX standard doesn't strictly specify its position. It can be placed at the beginning, or it can be placed at the end (before the End of File record). It could be placed between data segments as well, but that is just my theory; honestly, I have never seen such a HEX file. Nevertheless, this IntelHex implementation always places the Starting Address record at the beginning of the file.
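To check where a start address record actually sits in a given file, the record type can be read straight from each line: it is the fourth byte after the colon (':LLAAAATT...'), with 03 meaning Start Segment Address and 05 meaning Start Linear Address. The following is a minimal sketch. The helper names and the four-line sample are made up for illustration and are not part of the IntelHex module.

```python
START_TYPES = {0x03: 'start segment address', 0x05: 'start linear address'}


def record_type(line):
    """Return the record type byte of one Intel HEX line (':LLAAAATT...')."""
    line = line.strip()
    if not line.startswith(':') or len(line) < 11:
        raise ValueError(f'not an Intel HEX record: {line!r}')
    return int(line[7:9], 16)


def find_start_records(lines):
    """Return (line index, description) for every start address record."""
    found = []
    for index, line in enumerate(lines):
        if not line.strip():
            continue  # tolerate blank lines
        rtype = record_type(line)
        if rtype in START_TYPES:
            found.append((index, START_TYPES[rtype]))
    return found


# Hypothetical sample: start record placed just before the EOF record.
sample = [
    ':020000040800F2',      # extended linear address 0x0800
    ':0400000001020304F2',  # four data bytes at offset 0x0000
    ':0400000508000131BD',  # start linear address 0x08000131
    ':00000001FF',          # end of file
]
print(find_start_records(sample))
```

Running this on the first and last few lines of both the source file and the reconstructed file should show whether the record that moved is of type 03 or 05.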
By the way, Your "fix" code can be simplified:
file = open(output_file_name, 'r')
lines = file.readlines()
file.close()
output = open(output_file_name, 'w')
lines.insert(-1, lines[0])
for line in lines[1:]:
    output.write(line)
output.close()
That is, of course, if I read your intentions correctly. Either way, it produces the same result as running your code.
I looked at the IntelHex standard again, and I realized that Starting Address record types are numbered higher than Data and Extended Address records. So maybe this implementation isn't that good after all. But first let us check the root cause of your problem.
The IntelHex standard allows separate records to be freely swapped as long as they are all within one 64K page, IIRC. Additional starting records can appear anywhere in the file. Anyway, in your approach I would not rely on MD5 hashing of the file itself. With IntelHex files this is not a stable approach. You can use the hexdiff.py utility to actually compare the content of 2 files, if you wish. Or maybe you can provide a minimal example of your files with different MD5 before and after processing. It's possible that the starting address record is to blame here. I don't really need your binary files; you can inspect the first and last lines with a text editor and show the first 8 bytes here. BTW the very last line must always be the EOF record, so putting it at the start sounds very wrong. Without real data it's hard to tell what's actually wrong.
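To illustrate why a hash of the file text is fragile while a content comparison is not, here is a rough sketch that decodes records into an address-to-byte map plus the start address and compares those instead of the raw lines. It is not hexdiff.py; the function names and sample records are assumptions made for this example. It does not verify checksums, and it ignores the 64K wrap-around rule of segment addressing.

```python
def parse_hex(lines):
    """Return (memory, start): memory maps absolute address -> byte value."""
    memory, start, base = {}, None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        raw = bytes.fromhex(line[1:])  # skip the leading ':'
        count, offset, rtype = raw[0], int.from_bytes(raw[1:3], 'big'), raw[3]
        payload = raw[4:4 + count]
        if rtype == 0x00:    # data
            for i, value in enumerate(payload):
                memory[base + offset + i] = value
        elif rtype == 0x01:  # end of file
            break
        elif rtype == 0x02:  # extended segment address
            base = int.from_bytes(payload, 'big') << 4
        elif rtype == 0x04:  # extended linear address
            base = int.from_bytes(payload, 'big') << 16
        elif rtype in (0x03, 0x05):  # start segment / linear address
            start = (rtype, payload)
    return memory, start


def same_content(lines_a, lines_b):
    """True if both files decode to the same bytes and start address."""
    return parse_hex(lines_a) == parse_hex(lines_b)


# Hypothetical files: identical content, start record in different places.
source = [':020000040800F2', ':0400000001020304F2',
          ':0400000508000131BD', ':00000001FF']
rewritten = [':0400000508000131BD', ':020000040800F2',
             ':0400000001020304F2', ':00000001FF']
print(source == rewritten)              # the text differs, so MD5 would too
print(same_content(source, rewritten))  # the decoded content is the same
```

If a digest is still required, one option is to hash a canonical form of the decoded content (for example, the sorted address/byte pairs) rather than the file text.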
Thanks for looking into this. Our images are over that size boundary (about 100K). Unfortunately I cannot share the hex files here, but I have used hex files with swapped lines in the past when merging bootloaders into the image. The reason for using MD5 is that my firmware guy wants it to be used (not my choice). When I diff the files I can see exactly the line that gets swapped. I will put together a small snippet of the beginning and the end of the files to show you what is happening.
Source file on the right, reconstructed file on the left.
[Screenshot: beginning of file]
[Screenshot: end of file]
Yes, that is the Start Linear Address record all right. As I already wrote, the current implementation always places Starting Address records at the beginning. So no matter where the record is placed when importing a file (e.g. using 'loadhex'), you will always get the Starting Address record at the beginning of the file when exporting to a file (e.g. using 'write_hex_file').
"the data is stored in a database as a JSON record" - why don't you save actual hex file content to database then?
I will change the title of this issue. It can be fixed with another option of write_hex_file, oh my. That's already too complex.
"the data is stored in a database as a JSON record" - why don't you save actual hex file content to database then?
I'm starting to think I should just do that... This module is very nice though. There is talk of changing bootloaders, so being able to merge the files with this module without manually creating new releases seems like nice future proofing
Yes, that is the Start Linear Address record all right. As I already wrote, the current implementation always places Starting Address records at the beginning. So no matter where the record is placed when importing a file (e.g. using 'loadhex'), you will always get the Starting Address record at the beginning of the file when exporting to a file (e.g. using 'write_hex_file').
I'm not sure why this is happening; this was compiled and linked in IAR EWARM.
Why does your HEX file have the Start record at the end? Probably by design of the software that produced it. @bialix already wrote that there is no single required place for that record. The IntelHex documentation doesn't describe this precisely.
It can be fixed with another option of write_hex_file, oh my. That's already too complex.
Yeah. This function is really evolving very dynamically ;) I am thinking about the solution. I see 2 ways:
- We only care about importing: when loading, we only check where the record is placed, at the beginning or at the end. If it is somewhere else, I suggest just forcing the 'at the beginning' option in that case. When writing, we place it as-is (as resolved while loading).
- Option 1 plus a new parameter that would replace 'write_start_addr' (in the 'write_hex_file' function) with, for example, 'start_addr_mode', which would have 3 options:
  - 'sof' - start of file, the old behavior when 'write_start_addr'=True
  - 'eof' - end of file, naturally before the EOF record
  - 'none' - the old behavior when 'write_start_addr'=False
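To make the three proposed modes concrete, here is a rough sketch of their effect, written as a post-processing step on already-exported record lines. Nothing here exists in the library: place_start_record and its mode argument are hypothetical names that only mirror the 'start_addr_mode' idea above.

```python
START_RECORD_TYPES = ('03', '05')  # start segment / start linear address
EOF_RECORD_TYPE = '01'


def place_start_record(lines, mode='sof'):
    """Reorder Intel HEX record lines per a hypothetical start_addr_mode."""
    if mode not in ('sof', 'eof', 'none'):
        raise ValueError(f'unknown mode: {mode!r}')
    # The record type is the two hex digits at positions 7-8 of each line.
    start = [l for l in lines if l[7:9] in START_RECORD_TYPES]
    rest = [l for l in lines if l[7:9] not in START_RECORD_TYPES]
    if mode == 'none':
        return rest          # drop the start record entirely
    if mode == 'sof':
        return start + rest  # start record first
    # 'eof': insert just before the End of File record, if there is one.
    eof_index = next((i for i, l in enumerate(rest)
                      if l[7:9] == EOF_RECORD_TYPE), len(rest))
    return rest[:eof_index] + start + rest[eof_index:]


# Hypothetical export with the start record at the beginning of the file.
exported = [':0400000508000131BD', ':020000040800F2',
            ':0400000001020304F2', ':00000001FF']
for mode in ('sof', 'eof', 'none'):
    print(mode, place_start_record(exported, mode))
```

Unlike the line-swapping workaround earlier in the thread, this picks the record by its type instead of assuming it is the first line, so it also leaves files without a start record unchanged.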