radare2 misinterprets instructions as a string in a DOS binary
Hello,
I've been comparing the output of radare2 with the DOSBox debugger.
I'm working on fire.exe from the DOS video game Fire & Ice (Renegade, 1993).
% sha256sum FIRE.EXE
ca0e8e19074351bf3ba6ca69c537220ec4a75077b376ecd458a92e4e4255e469 FIRE.EXE
Starting at the entry point, DOSBox displays the following:
11A8:000E 0E push cs
11A8:000F 1F pop ds
11A8:0010 06 push es
11A8:0011 8B0E0C00 mov cx,[000C] ds:[000C]=F000
11A8:0015 8BF1 mov si,cx
11A8:0017 4E dec si
11A8:0018 89F7 mov di,si
11A8:001A 8CDB mov bx,ds
11A8:001C 031E0A00 add bx,[000A] ds:[000A]=20C8
11A8:0020 8EC3 mov es,bx
11A8:0022 FD std
11A8:0023 F3A4 repe movsb
11A8:0025 53 push bx
11A8:0026 B82B00 mov ax,002B
11A8:0029 50 push ax
11A8:002A CB retf
11A8:002B 2E8B2E0800 mov bp,cs:[0008] cs:[0008]=0FBB
11A8:0030 8CDA mov dx,ds
11A8:0032 89E8 mov ax,bp
11A8:0034 3D0010 cmp ax,1000
11A8:0037 7603 jbe 0000003C ($+3) (down)
11B9:0039 B80010 mov ax,1000
11B9:003C 29C5 sub bp,ax
11B9:003E 29C2 sub dx,ax
11B9:0040 29C3 sub bx,ax
11B9:0042 8EDA mov ds,dx
11B9:0044 8EC3 mov es,bx
11B9:0046 B103 mov cl,03
11B9:0048 D3E0 shl ax,cl
11B9:004A 89C1 mov cx,ax
11B9:004C D1E0 shl ax,1
11B9:004E 48 dec ax
11B9:004F 48 dec ax
11B9:0050 8BF0 mov si,ax
11B9:0052 8BF8 mov di,ax
11B9:0054 F3A5 repe movsw
11B9:0056 09ED or bp,bp
11B9:0058 75D8 jne 00000032 ($-28) (no jmp)
...
while radare2 displays the following:
/ (fcn) entry0 29
| 0000:fdbe 0e push cs
| 0000:fdbf 1f pop ds
| 0000:fdc0 06 push es
| 0000:fdc1 8b0e0c00 mov cx, word [0xc] ; [0xc:2]=0xffff loc.0000ffff
| 0000:fdc5 8bf1 mov si, cx
| ; DATA XREF from 0x0000230b (unk)
| 0000:fdc7 4e dec si
| 0000:fdc8 89f7 mov di, si
| ; DATA XREF from 0x000210ce (unk)
| 0000:fdca 8cdb mov bx, ds
| ; DATA XREF from 0x000dc8a0 (int.000dc897)
| 0000:fdcc 031e0a00 add bx, word [0xa]
| 0000:fdd0 8ec3 mov es, bx
| 0000:fdd2 fd std
| 0000:fdd3 f3a4 rep movsb byte es:[di], byte ptr [si]
| 0000:fdd5 53 push bx
| 0000:fdd6 b82b00 mov ax, 0x2b ; '+'
| 0000:fdd9 50 push ax
\ 0000:fdda cb retf
0000:fddb 2e8b2e0800 mov bp, word cs:[8] ; [0x8:2]=32
0000:fde0 8cda mov dx, ds
| ; JMP XREF from 0x0000665f (fcn.00006658)
| .-> 0000:fde2 89e8 mov ax, bp
| | ; DATA XREF from 0x00015cd5 (fcn.00015c42)
| | ; DATA XREF from 0x00009b8e (fcn.00009b8a)
| | 0000:fde4 3d0010 cmp ax, 0x1000
| ,==< 0000:fde7 7603 jbe 0xfdec
| || 0000:fde9 b80010 mov ax, 0x1000
| `--> 0000:fdec 29c5 sub bp, ax
| | 0000:fdee 29c2 sub dx, ax
| | 0000:fdf0 .string ")\x,3\x(>\x-:\x(>\x,3\x+1" ; len=8
| | 0000:fdf8 d3e0 shl ax, cl
| | 0000:fdfa 89c1 mov cx, ax
| | 0000:fdfc d1e0 shl ax, 1
| | 0000:fdfe 48 dec ax
| | ; DATA XREF from 0x00005a15 (unk)
| | ; DATA XREF from 0x00007fd4 (fcn.00007fc5)
| | 0000:fdff 48 dec ax
| | 0000:fe00 8bf0 mov si, ax
| 0000:fe02 8bf8 mov di, ax
| 0000:fe04 f3a5 rep movsd dword es:[di], dword ptr [si]
| 0000:fe06 09ed or bp, bp
`=< 0000:fe08 75d8 jne 0xfde2 ; fcn.00006658+0x978a
...
You can see that, for DOSBox, at position 11B9:0040, the bytes 29C38EDA8EC3B103 correspond to the instructions sub bx,ax; mov ds,dx; mov es,bx; mov cl,03, while radare2, at position 0000:fdf0, interprets the same bytes as a string (len=8). This bothers me because these instructions are actually executed by DOSBox when the game is launched and traced in the debugger.
I would like to know if it's a radare2 bug or if I made a mistake when analysing the binary.
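To double-check the claim above independently of either tool, here is a minimal sketch in Python. It assumes the disputed span is the eight bytes 29 C3 8E DA 8E C3 B1 03, read off the address column of the DOSBox listing (0040 to 0047, two bytes per instruction), which matches the len=8 that radare2 reports. The decode helper is hypothetical and handles only the opcodes that occur in these bytes; it is not a general 8086 disassembler.

```python
# Register tables indexed by the 3-bit fields of the ModRM byte / opcode.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
SREG = ["es", "cs", "ss", "ds"]


def decode(code):
    """Decode a byte string made only of 0x29, 0x8E and 0xB0..0xB7 opcodes."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if op in (0x29, 0x8E):
            modrm = code[i + 1]
            mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
            if mod != 3:
                raise ValueError("only register-to-register forms are handled")
            if op == 0x29:  # SUB r/m16, r16
                out.append(f"sub {REG16[rm]}, {REG16[reg]}")
            else:  # 0x8E: MOV Sreg, r/m16
                out.append(f"mov {SREG[reg]}, {REG16[rm]}")
            i += 2
        elif 0xB0 <= op <= 0xB7:  # MOV r8, imm8
            out.append(f"mov {REG8[op - 0xB0]}, 0x{code[i + 1]:02x}")
            i += 2
        else:
            raise ValueError(f"opcode 0x{op:02x} not handled in this sketch")
    return out


# Assumption: the eight bytes radare2 shows as a .string at 0000:fdf0.
disputed = bytes.fromhex("29C38EDA8EC3B103")
instructions = decode(disputed)
for ins in instructions:
    print(ins)

# Both listings should place the disputed bytes at the same distance
# from the entry point (addresses copied from the two listings).
R2_ENTRY, R2_STRING = 0xFDBE, 0xFDF0
DOSBOX_ENTRY, DOSBOX_SUB = 0x000E, 0x0040
same_distance = (R2_STRING - R2_ENTRY) == (DOSBOX_SUB - DOSBOX_ENTRY)
print("same distance from entry:", same_distance)
```

The bytes decode cleanly as four register-to-register instructions, and both tools place them 0x32 bytes past the entry point, so the two listings do describe the same location.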
Hello,
Make sure you are using radare2 from git. If you're unsure, paste the output of r2 -v here.
To install radare2 from git, first uninstall your current version of radare2 and clean your distro. Then run git clone https://github.com/radare/radare2 && cd radare2 && ./sys/install.sh, and use r2 -v to verify the version and check that there are no errors.
Hello @Maijin,
I'm using the following version:
% r2 -v
radare2 0.10.5 9999999 @ linux-x86-64 git.0.10.5
commit: HEAD build: 2016-08-23
I'm installing radare2 from git right away.
@Maijin, I've tried with the latest version of radare2, but the result is still the same.
% r2 -v
radare2 0.10.6-git 12465 @ linux-x86-64 git.0.10.5-343-g49cab15
commit: 49cab152000a2cf4c59eb58e5c41e6415d497529 build: 2016-09-23
I see it's abandonware, right? Can you link the binary here? (You can drop a .zip containing the executable on GitHub.)
<3 Your avatar, I love Goblins <3
Here is the full game, zipped. fire.zip @Maijin Thank you for the nice comment. I know some of the people who worked at Coktel.
The Goblins guys are true geniuses!
@marespiaut I found some easter eggs in Adibou using r2 :P but haven't yet found how to trigger them in the game :P
@Maijin We need to talk via email, or IRC, I think we have some interest in common.
You can join #radare on irc or use telegram https://telegram.me/joinchat/ACR-FgWyg1bbu9YUzT_5pg
Try -e bin.strings=false.
There's some code in rbin that defines the rules for deciding whether something is a string or not. We should make those checks stricter for MS-DOS binaries. I will check when I have some time.
Thanks for reporting.
On 23 Sep 2016, at 19:18, Marc-Alexandre Espiaut wrote:
> I would like to know if it's a radare2 bug or if I made a mistake when analysing the binary.
This happens because this binary contains 0 sections.
Just use -e bin.strings=false as a workaround until I come up with an idea to solve this problem.
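For reference, the workaround can be passed on the command line when opening the file. The sketch below only builds and prints such a command line in Python; the helper name r2_disasm_command and the instruction count of 40 are illustrative choices, and it assumes r2 still defines the entry0 flag for this binary once the string scan is disabled.

```python
import shlex


def r2_disasm_command(binary, count=40):
    """Build an r2 command line that disables the string scan before loading."""
    return [
        "r2",
        "-e", "bin.strings=false",      # workaround suggested in this thread
        "-q",                           # quiet mode, quit after the -c command
        "-c", f"pd {count} @ entry0",   # disassemble from the entry point flag
        binary,
    ]


cmd = r2_disasm_command("FIRE.EXE")
# Print a copy-pasteable shell command; running it requires r2 to be installed.
print(shlex.join(cmd))
```

If the workaround behaves as described, the bytes at 0000:fdf0 should then be shown as instructions rather than as a .string line.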
I think this bug affects BIOS images too. Is there any other restriction we could use to handle that?
https://github.com/radare/radare2-regressions/pull/597
@marespiaut Can you add tests for that?
@Maijin I can't, because the binary isn't freeware.
And you can't find a similar behavior in a freeware one?
@Maijin Not yet.
Even by stripping all the proprietary content from the DOS binary and somehow keeping only the faulty code?
Until I get a reproducer to fix it, you can just use -e bin.strings=false.