
Rewrite `split_off_address` to fix template symbols

Open 1superchip opened this issue 9 months ago • 0 comments

`split_off_address` splits on commas, which breaks on the template symbols that MWCC generates, since those symbols can themselves contain commas. `split_off_address` is generally called on instructions in `instructions_with_address_immediates`, and that set includes instructions that don't follow the format the function expects.

The docstring of `split_off_address` is `"""Split e.g. 'beqz $r0,1f0' into 'beqz $r0,' and '1f0'."""`. The function expects an instruction line in which a single comma separates some data (a register) on the left from the branch offset on the right.

Some instructions passed to `split_off_address` do not follow this format because they take only one argument. The one-argument instructions in `instructions_with_address_immediates` are:

MIPS

  • jal
  • j
  • bal
  • b

PPC

  • b
  • bl

Most of these instructions can carry relocations, and the relocation's symbol name may itself contain `,`. That breaks the assumption that the line can be split on `,`.
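To make the failure concrete, here is a minimal sketch of comma-based splitting. `naive_split_off_address` is a simplified stand-in written for this illustration, not the actual asm-differ code, and the symbol `make<i,c>__Fv` is a made-up example of a template-style name.

```python
from typing import Tuple


def naive_split_off_address(line: str) -> Tuple[str, str]:
    """Hypothetical stand-in: split after the last comma, else after the mnemonic."""
    head, comma, tail = line.rpartition(",")
    if comma:
        # Everything up to and including the last comma is treated as the prefix.
        return head + comma, tail
    # No comma at all: fall back to splitting after the first space.
    mnemonic, space, rest = line.partition(" ")
    return mnemonic + space, rest


# The expected two-operand form splits cleanly.
print(naive_split_off_address("beqz $r0,1f0"))       # ('beqz $r0,', '1f0')
# A one-argument instruction whose symbol contains a comma is cut mid-symbol.
print(naive_split_off_address("jal make<i,c>__Fv"))  # ('jal make<i,', 'c>__Fv')
```

In the second call the "address" part comes out as `c>__Fv`, so half of the symbol name ends up in the instruction prefix.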

A potential solution discussed on Discord is to define, per architecture, a set named `instructions_with_address_immediates_one_arg` containing the instructions that take only one argument. `split_off_address` could consult that set and split such lines as `mnemonic only_arg`. This would require passing the `arch` variable to `split_off_address`.
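A rough sketch of the per-architecture set approach follows. It makes several assumptions: the architecture is passed as a plain string rather than whatever object asm-differ actually uses, the dictionary and the function name `split_off_address_one_arg` are hypothetical, and the set contents are just the lists from this issue.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-architecture sets, filled in from the lists in this issue.
INSTRUCTIONS_WITH_ADDRESS_IMMEDIATES_ONE_ARG: Dict[str, Set[str]] = {
    "mips": {"jal", "j", "bal", "b"},
    "ppc": {"b", "bl"},
}


def split_off_address_one_arg(line: str, arch: str) -> Tuple[str, str]:
    """Split 'mnemonic only_arg' for one-argument instructions, else on the last comma."""
    parts = line.split(None, 1)
    mnemonic = parts[0] if parts else ""
    one_arg = INSTRUCTIONS_WITH_ADDRESS_IMMEDIATES_ONE_ARG.get(arch, set())
    if mnemonic in one_arg:
        if len(parts) < 2:
            return line, ""
        # parts[1] is a suffix of line, so this offset marks where the argument starts.
        off = len(line) - len(parts[1])
        return line[:off], line[off:]
    # Multi-operand form: keep the comma-based behaviour.
    head, comma, tail = line.rpartition(",")
    if comma:
        return head + comma, tail
    return line, ""


print(split_off_address_one_arg("jal make<i,c>__Fv", "mips"))  # ('jal ', 'make<i,c>__Fv')
print(split_off_address_one_arg("beqz $r0,1f0", "mips"))       # ('beqz $r0,', '1f0')
```

The trade-off is that each architecture has to maintain its own set, and commas in symbols on multi-operand instructions are still not handled.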

An idea brought up by simon is to ignore commas after a `<`. All the instructions listed above have the single format `mnemonic arg`, so there should be no problem with those.
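The "ignore commas after a `<`" idea could look roughly like the sketch below. The function name `split_off_address_ignore_templates` is hypothetical, and the sketch assumes that the first `<` in the line always belongs to the symbol name, which may not hold for every line that reaches the real function.

```python
from typing import Tuple


def split_off_address_ignore_templates(line: str) -> Tuple[str, str]:
    """Split after the last comma that appears before the first '<', if any."""
    angle = line.find("<")
    # Only look for commas in the part of the line before the first '<'.
    searchable = line if angle == -1 else line[:angle]
    comma = searchable.rfind(",")
    if comma != -1:
        return line[: comma + 1], line[comma + 1 :]
    # No usable comma: treat the line as 'mnemonic arg'.
    parts = line.split(None, 1)
    if len(parts) < 2:
        return line, ""
    off = len(line) - len(parts[1])
    return line[:off], line[off:]


print(split_off_address_ignore_templates("jal make<i,c>__Fv"))    # ('jal ', 'make<i,c>__Fv')
print(split_off_address_ignore_templates("beqz $r0,1f0"))         # ('beqz $r0,', '1f0')
print(split_off_address_ignore_templates("beq $r0,$r1,f<a,b>"))   # ('beq $r0,$r1,', 'f<a,b>')
```

This variant does not need the `arch` argument and also covers multi-operand instructions whose target symbol contains commas.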

[Image: MIPS `jal` instruction containing a relocation that has a comma in the name]

1superchip · Mar 19 '25 23:03