Disassembler option?

Open ProxyPlayerHD opened this issue 5 years ago • 4 comments

One thing I would really like is the ability to disassemble existing binary files. It could be invoked from the command prompt like assembling, but would require more parameters, for example:

customasm -d <bin file to disassemble> <CPU file to describe the Instruction set> <starting address of the Program> <starting address inside the file where it starts to disassemble from (optional)> <end address in the file where it stops disassembling (optional)>

example: customasm -d test.bin 6502.cpu 0xE000

and the format for the file generated could be similar to other Disassemblers:

<Address> <Data> <some space> <Labels> <Instructions>, for example as in this 6502 disassembler:

E001   A2 FF      LDX #$FF
E003   9A         TXS
E004   A9 00      LDA #$00
E006   AA         TAX
E007   A8         TAY
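A minimal sketch of how one row of such a listing could be formatted, in Python. The function name `format_line` and the `label_width` parameter are made up for illustration and are not part of customasm; the column widths are simply copied from the example above.

```python
def format_line(address, data, instruction, label="", label_width=0):
    """Format one listing row: address, raw bytes, optional label, instruction.

    Hypothetical helper, not a customasm API. Hex is always uppercase.
    """
    # Raw bytes as space-separated uppercase hex pairs, e.g. "A2 FF".
    data_col = " ".join(f"{b:02X}" for b in data)
    # Optional label column; empty (zero width) when labels are not shown.
    label_col = (label + ":" if label else "").ljust(label_width)
    # Pad the byte column to 11 characters so instructions line up.
    return f"{address:04X}   {data_col:<11}{label_col}{instruction}"


print(format_line(0xE001, bytes([0xA2, 0xFF]), "LDX #$FF"))
print(format_line(0xE003, bytes([0x9A]), "TXS"))
```

Passing a non-zero `label_width` reserves a fixed column for labels, which is one way to show them without breaking the alignment.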

I know this format already kinda exists for assembling, but without showing labels (which is kinda a shame), and the formatting sometimes breaks if you have a lot on a single line... plus it doesn't use uppercase-only hex, which is just heresy.

Obviously it would be a lot of work and I'm asking for a lot, but hey, maybe it could be a long-term goal.

And one last thing: the current version added some kind of type specifications for values. It would be awesome if those were explained more in the documentation... I didn't see them there, at least.

ProxyPlayerHD, Mar 21 '20 08:03

This would be incredibly cool, indeed! But with instructions being as free-form as they are now, I imagine it wouldn't be possible to do a reverse mapping so easily in general... Maybe only for a subset of instructions with simple representations?

About the parameter type specifications, please see here while I haven't got the proper documentation done!

hlorenzi, Apr 07 '20 14:04

I think the most difficult part would be dealing with tokens, as they are the only things that affect the actual instruction mnemonic. Maybe this could be done with an intermediate step that takes the CPU file and gets rid of all tokens before it disassembles a file?

So basically a de-tokenizer. For example, this could be in the input CPU file:

#tokendef FLAGS
    {
        Z	= 0
        C	= 1
        N	= 2
    }

JR {fl:FLAGS} {src}		-> 0xE[3:0] @ fl[3:0] @ (src - pc)[15:0]

and it would take that and tear it apart into its separate possibilities:

JR Z {src}			-> 0xE0[7:0] @ (src - pc)[15:0]
JR C {src}			-> 0xE1[7:0] @ (src - pc)[15:0]
JR N {src}			-> 0xE2[7:0] @ (src - pc)[15:0]
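A rough sketch of such a de-tokenizer in Python, assuming rules are already split into a pattern and a production string and that token definitions are available as a plain dictionary. `expand_tokens` and the data layout are hypothetical, not how customasm represents rules internally. Unlike the hand-written expansion above, it only substitutes the token value and does not merge `0xE[3:0] @ 0x0[3:0]` into a single `0xE0[7:0]`.

```python
import re

# Matches a token parameter such as {fl:FLAGS}; plain {src} does not match.
TOKEN_PARAM = re.compile(r"\{(\w+):(\w+)\}")


def expand_tokens(pattern, production, tokendefs):
    """Return one (pattern, production) pair per combination of token values."""
    for match in TOKEN_PARAM.finditer(pattern):
        name, kind = match.groups()
        if kind in tokendefs:
            break
    else:
        # No token parameter left: the rule is already fully expanded.
        return [(pattern, production)]

    expanded = []
    for token, value in tokendefs[kind].items():
        # Put the literal token text into the mnemonic pattern...
        new_pattern = pattern[:match.start()] + token + pattern[match.end():]
        # ...and its numeric value wherever the parameter name is used.
        new_production = re.sub(rf"\b{re.escape(name)}\b", f"0x{value:X}", production)
        # Recurse in case the rule has further token parameters.
        expanded.extend(expand_tokens(new_pattern, new_production, tokendefs))
    return expanded


flags = {"FLAGS": {"Z": 0, "C": 1, "N": 2}}
for pat, prod in expand_tokens("JR {fl:FLAGS} {src}",
                               "0xE[3:0] @ fl[3:0] @ (src - pc)[15:0]", flags):
    print(pat, "->", prod)
```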

which should make mapping a lot easier (I assume). Values shouldn't be as difficult, since you're taking a number from the binary file and displaying it... as a number, likely in hex.
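To illustrate the reverse mapping for the expanded `JR` rules, here is a small Python sketch. It assumes that the 16-bit operand is stored big-endian and that `pc` is the address of the instruction itself; both are assumptions for this example, and `decode_jr` and `OPCODES` are made-up names.

```python
# Opcode byte -> mnemonic, taken from the expanded JR rules (0xE0, 0xE1, 0xE2).
OPCODES = {0xE0: "JR Z", 0xE1: "JR C", 0xE2: "JR N"}


def decode_jr(data, offset, pc):
    """Decode one JR instruction at data[offset]; return (text, size) or None.

    Assumes a big-endian 16-bit operand holding (src - pc), with pc being
    the address of the instruction itself.
    """
    if offset + 3 > len(data):
        return None
    mnemonic = OPCODES.get(data[offset])
    if mnemonic is None:
        return None
    rel = int.from_bytes(data[offset + 1:offset + 3], "big")
    # Undo the (src - pc) encoding, wrapping around at 16 bits.
    src = (rel + pc) & 0xFFFF
    return f"{mnemonic} 0x{src:04X}", 3


print(decode_jr(bytes([0xE1, 0x00, 0x10]), 0, 0xE000))
```

The operand is not just printed as read from the file: because the rule encodes `src - pc`, the disassembler has to invert that expression, which is the part that gets hard for free-form rules in general.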

Another thing that I would see as difficult would be labels: since the assembler itself has no idea what each instruction does, it would be impossible to figure out where to put labels...

unless the CPU file somehow allowed you to specify which instructions are conditional branches, subroutine calls, and returns, since those 3 are the only ones required to implement such an automatic labeling system.
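If the CPU file did carry that classification, a first labeling pass could look roughly like this Python sketch. The `(address, kind, target)` tuples, the `sub_`/`loc_` naming scheme, and `collect_labels` are all hypothetical; nothing like this exists in customasm today.

```python
def collect_labels(instructions):
    """Map target addresses to generated label names.

    `instructions` is an iterable of (address, kind, target) tuples, where
    kind is "branch", "call", "return" or "other" and target is an address
    or None. Returns carry no target, so they create no labels here; they
    would only be useful for marking where a subroutine ends.
    """
    labels = {}
    for _address, kind, target in instructions:
        if target is None:
            continue
        if kind == "call":
            # A call target is a subroutine entry; this name takes priority.
            labels[target] = f"sub_{target:04X}"
        elif kind == "branch":
            # Keep an existing name (e.g. from a call) if there is one.
            labels.setdefault(target, f"loc_{target:04X}")
    return labels


program = [
    (0xE000, "call", 0xE010),
    (0xE003, "branch", 0xE000),
    (0xE005, "branch", 0xE010),
    (0xE010, "return", None),
]
print(collect_labels(program))
```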

Anyways, it seems like a fun "little" challenge to reverse this assembler somehow.

ProxyPlayerHD, Apr 08 '20 20:04

Both issues closed? Nobody working on this? I'm considering implementing this as a part of my custom CPU project... Anybody interested? (By the way, brilliant project :_)

virtimus, Oct 02 '21 11:10

No, this issue isn't closed yet. It's just kind of a difficult-to-implement feature, so not really a priority. 😅

hlorenzi, Oct 06 '21 22:10