stella icon indicating copy to clipboard operation
stella copied to clipboard

Multi-bank ROM disassemblies

Open thrust26 opened this issue 4 years ago • 5 comments

Currently Stella only supports disassembling and debugging single banked ROMs. For standard Atari bankswitching the code has been enhanced already, but for a general solution, it needs refactoring. This also affects the CartXXX and CartXXXWidget classes, which do use only little inheritance and a lot of copy and paste.

See also: https://github.com/stella-emu/stella/issues/586

thrust26 avatar Mar 31 '20 16:03 thrust26

Things to do:

  • ~the Device::AccessFlags have to be set per bank for ROM and extended RAM access; this requires a lot of code changes; since only the Cart class knows the current bank, it should update the ROM and extended RAM accesses (instead of System).~ (already there)
  • ~CartDebug continues to call DiStella per bank, all bank related stuff has to be kept away from DiStella.~ (already there)
  • ~DiStella continues to disassemble per bank~ (already there)
  • ~either DiStella has to be refactored to make use of the enhanced access tracking; or this info stays encapsulated in CartDebug (the latter is better)~ (already there)
  • optional: DiStella's tentative code detection has to be limited to the current bank; this means the code has to check for bankswitching and stop there like it does for e.g. RTS
  • optional: DiStella's tentative code detection should be used multiple times per bank (this might improve non-bankswitched ROMs too)
  • analyze existing bugs which prevent ROM recreation from disassembly
    • data detected as code
    • when using symbol files, fix/remove listing of address labels (the may change value during disassembly)
    • when using symbol files, local labels can be duplicate and usually will have different values
    • ...
  • ~consider extra RAM in ROM space for savedis disassembly (works in debugger)~ (done)
  • for multi segmented ROMs (e.g. E0, 3E+), track the segment each bank is used in for better savedis.
  • ...?

thrust26 avatar Jan 01 '21 09:01 thrust26

Good multi bank disassemblies:

  • H.E.R.O. (F8)
  • Robot Tank (FE)

Bad multi bank disassemblies:

thrust26 avatar Jan 17 '21 08:01 thrust26

The attached simple program shows some problems with local labels in the disassembly:

The created .sym file contains everything, local labels are prefixed with a (global) counter:

--- Symbol List (sorted by symbol)
1.label                  f003              (R )
1.var                    0088              (R )
2.label                  f00a              (R )
2.var                    00aa              (R )
Test1                    f000              (R )
Test2                    f008              (R )
Var1                     0088              (R )
Var2                     00aa              (R )
--- End of Symbol List.

The disassembly created by DiStella has the following issues:

  • the first .var is missing and replaced with a global label
  • the first label is missing and replaced with a generated global label
  • the local labels .var and .label have the value of the 2nd subroutine
Var1            = $88
.var            = $aa
Test1           = $f000
Test2           = $f008
.label          = $f00a
...

Test1
    clc                             ;2        
    bcc     Lf003                   ;2/3 =   4
Lf003
    lda     Var1                    ;3        
    jmp     Test2                   ;3   =   6
    
Test2
    bcc     .label                  ;2/3 =   2
.label
    lda     .var                    ;3        
    jmp     Test1                   ;3   =   6

I also found that sometimes local labels are used by the disassembly, but not defined. And there is a problem with ORG. When using e.g. $f800 instead of $f000, the disassembly still uses $f000.

TestDiStella_00.zip

thrust26 avatar Jan 17 '21 11:01 thrust26

For ROMs without .sym/lst files the multi-bank disassembly is now close to 100% for standard bankswitchings. I tested with H.E.R.O. (100% perfect) and Fatal Run (1 byte different).

The difference in Fatal Run is due to an ambiguous opcode (nop zp,x) which can be assembled to 0x14, 0x34, 0x54, 0x74, 0xd4 and 0xf4. I suppose we could change the mnemonics to become distinct. But DASM would not assemble that.

Probably we should create byte sequences instead of opcodes here.

thrust26 avatar Jan 22 '21 11:01 thrust26

Found another problem with when testing Montezuma's Revenge (E0). There are are inter-bank branches, but the target addresses are from the current bank. This causes 'Branch out of range` errors.

L5_d000 
    sta     COLUP0                  ;3        
    lda     L5_d2b9,x               ;4        
    sta     PF0                     ;3        
    lda     L5_d2bd,x               ;4        
    sta     PF1                     ;3        
    txa                             ;2        
    tay                             ;2        
    lda     (ram_E1),y              ;5        
    sta     PF2                     ;3        
    dex                             ;2        
    bpl     L5_dfed                 ;2/3!     <- HERE!
    ldx     #$06                    ;2        
    dec     ram_BB                  ;5        
    bpl     L5_dfed                 ;2/3!     <- and HERE!

Obviously the two branches should go to the previous bank (e.g. L4_bfed). But since DiStella assembles each bank on its own, it cannot know previous bank's labels. And in this case, the code in the previous bank is not even detected as CODE. Probably the code should become e.g. bpl $cfed or bpl . - $26.

BTW: The relevant code was already touched by @sa666666 in 2010. :smile:

case AddressingMode::RELATIVE:
        {
          // SA - 04-06-2010: there seemed to be a bug in distella,
          // where wraparound occurred on a 32-bit int, and subsequent
          // indexing into the labels array caused a crash
          d1 = Debugger::debugger().peek(myPC + myOffset);  ++myPC;
          ad = ((myPC + Int8(d1)) & 0xfff) + myOffset;

thrust26 avatar Jan 23 '21 10:01 thrust26