wla-dx
wla-dx copied to clipboard
65816 code outside bank 0
obj/lz4.o: asm/lz4.s:40: COMPUTE_PENDING_CALCULATIONS: Result (170791/$29b27) of a computation is out of 16-bit range. My expectation is that the assembler / linker will just take the lowest 16 bits of the address without complaining.
This is some self modifying code.
lda pDest,s
sta.w LZ4_Dst+1
LZ4_Dst LDY #$0000 ; Init Target unpacked Data offset
Maybe this isn't a bug, maybe I'm not using the assembler correctly. Any thoughts appreciated.
My expectation is that the assembler / linker will just take the lowest 16 bits of the address without complaining.
Overflows give always an error. AND the bits you need e.g. using "& $FFFF". Sorry. :) Which of the lines is asm/lz4.s:40?
The sta.w LZ4_Dst+1, is line 40
Thanks so much for taking the time to consider this input.
If I don't have code in bank 0, then I need to actually AND my address field? I'm going to include the entire source file. "ASMCODE" address space is set to $20000, and has a size of $10000 (configured as a slot in the memory map)
Since the code does PHK/PLB, the bank is the same as the code. So the code that accesses addresses, can, and should use absolute addressing.
Having to add an &$FFFF to the actual address seems inconvenient. IORCA does not have this requirement, neither does Merlin, or x65, or the APW assembler, or the WDC assembler. So it does seem like a very non-standard requirement.
I'm hoping that I either misunderstand how to use your assembler, or that I did not adequately explain my issue.
`*
- LZ4 Decompress by Brutal Deluxe!!!
.include common.i .include dp.s
mx %00
; .export unpacklz4 ; .export LZ4_Unpack .slot "ASMCODE"
; .base $20 ; .org $200000
.section "ASMCODE"
- int LZ4_Unpack(u8* pDest, u8* pPackedSource);
LZ4_Unpack:
; This syntax is not ideal .equ pDest 5 .equ pPackedSource 9
phb
phk
plb
sep #$20
lda pPackedSource+2,s ; Pull out the src/dst banks
xba
lda pDest+2,s ; Pull out the src/dst banks
rep #$31
tax ; Temp save in X
lda pDest,s
sta.w LZ4_Dst+1
lda pPackedSource+1,s ; address of packed source + 4, is the unpacked len
sta.w upl+2
lda pPackedSource,s
adc #12
sta.w upl+1
upl lda.l >0 ; packed length adc #16 ; 16 bytes for packed buffer header adc pPackedSource,s ; start of packed buffer tay ; y has the pack data stop address
; 1st packed Byte offset
lda pPackedSource,s ; skip 16 byte header on the source
adc #16
pha
txa
plx
jsr ASM_LZ4_Unpack
tay
; Copy the Return address
lda 1,s
sta pPackedSource,s
lda 3,s
sta pPackedSource+2,s
tsc
sec
sbc #-8
tcs
tya ; return length
plb
rtl
*-------------------------------------------------------------------------------
ASM_LZ4_Unpack STA.w LZ4_Literal_3+1 ; Uncompress a LZ4 Packed Data buffer (64 KB max)
SEP #$20 ; A = Bank Src,Bank Dst
STA.w LZ4_Match_5+1 ; X = Header Size = 1st Packed Byte offset
STA.w LZ4_Match_5+2 ; Y = Pack Data Size
XBA ; => Return in A the length of unpacked Data
STA.w LZ4_ReadToken+3
STA.w LZ4_Match_1+3
STA.w LZ4_GetLength_1+3
REP #$30
STY.w LZ4_Limit+1
*--
LZ4_Dst LDY #$0000 ; Init Target unpacked Data offset
LZ4_ReadToken LDA.l >$AA0000,X ; Read Token Byte
INX
STA.w LZ4_Match_2+1
*----------------
LZ4_Literal AND #$00F0 ; >>> Process Literal Bytes <<<
BEQ LZ4_Limit ; No Literal
CMP #$00F0
BNE LZ4_Literal_1
JSR LZ4_GetLengthLit ; Compute Literal Length with next bytes
BRA LZ4_Literal_2
LZ4_Literal_1 LSR A ; Literal Length use the 4 bit
LSR A
LSR A
LSR A
*--
LZ4_Literal_2 DEC A ; Copy A+1 Bytes
LZ4_Literal_3 MVN $AA,$BB ; Copy Literal Bytes from packed data buffer
PHK ; X and Y are auto incremented
PLB
*----------------
LZ4_Limit CPX #$AAAA ; End Of Packed Data buffer ?
BEQ LZ4_End
*----------------
LZ4_Match TYA ; >>> Process Match Bytes <<<
SEC
LZ4_Match_1 SBC.l >$AA0000,X ; Match Offset
INX
INX
STA.w LZ4_Match_4+1
*--
LZ4_Match_2 LDA #$0000 ; Current Token Value
AND #$000F
CMP #$000F
BNE LZ4_Match_3
JSR LZ4_GetLengthMat ; Compute Match Length with next bytes
LZ4_Match_3 CLC
ADC #$0003 ; Minimum Match Length is 4 (-1 for the MVN)
*--
PHX
LZ4_Match_4 LDX #$AAAA ; Match Byte Offset
LZ4_Match_5 MVN $BB,$BB ; Copy Match Bytes from unpacked data buffer
PHK ; X and Y are auto incremented
PLB
PLX
*----------------
BRA LZ4_ReadToken
*----------------
LZ4_GetLengthLit LDA #$000F ; Compute Variable Length (Literal or Match)
LZ4_GetLengthMat STA.w LZ4_GetLength_2+1
LZ4_GetLength_1 LDA.l >$AA0000,X ; Read Length Byte
INX
AND #$00FF
CMP #$00FF
BNE LZ4_GetLength_3
CLC
LZ4_GetLength_2 ADC #$000F
STA.w LZ4_GetLength_2+1
BRA LZ4_GetLength_1
LZ4_GetLength_3 ADC.w LZ4_GetLength_2+1
RTS
*----------------
LZ4_End TYA ; A = Length of Unpack Data
RTS
*-------------------------------------------------------------------------------
.ends ; Gotta close the section
`
Hard to say anything about this. It's missing the includes
.include common.i .include dp.s
Without them I cannot assemble it... Could you try to simplify the case to something smaller? But by default, if WLA expects a 16-bit value, and you give it something bigger, you'll get an error. That's something I'd expect the assembler I use to do as silent ANDs sound dangerous and probable sources of errors. IORCA, Merlin, x65, APW or the WDC assembler must be more forgiving than WLA DX.
I mean, if you try to do something like
.db 256+100
and expect that to produce 100 and the assembler is silent about the overflow, what are you doing? Does that make any sense at all? :D
I don't try to do
char c = 10000;
in ANSI C and hope it works. :D
@Ramsis-SNES is the OP using WLA here somehow wrong, or what do you think? Should I allow truncation of 24bit values -> 16bit values perhaps via a command line switch?
I'm going to look into this later today. :-)
@Ramsis-SNES is the OP using WLA here somehow wrong, or what do you think? Should I allow truncation of 24bit values -> 16bit values perhaps via a command line switch?
I'm leaning towards a "no". Let me elaborate.
The LZ4 decompression routine provided by the OP (original source, intended for use on the Apple IIgs) uses self-modifying code. This means that the bank number will likely change depending on where in RAM the code is copied for execution. Because of this, it would make sense to write the code in such a way that it doesn't take the bank into account in the first place. So yeah, adding & $FFFF
, or using indirect or even relative addressing modes, would be the most sensible approach. Alternatively, one could pre-define the offsets and use the defined labels instead of those within the code itself, which would become necessary anyway if the code was run from a different offset in RAM than where it sits in ROM.
Adding a switch to truncate 24-bit addresses to 16-bit addresses would possibly create more traps for young players (e.g., happily coding away and not knowing or losing track of which bank is accessed at any given time) than it would provide benefits for everyone. Also, where do we stop? Should we add another switch to truncate to 8-bit addresses next? ;-)
By the way:
Since the code does PHK/PLB, the bank is the same as the code. So the code that accesses addresses, can, and should use absolute addressing.
Having to add an &$FFFF to the actual address seems inconvenient. IORCA does not have this requirement, neither does Merlin, or x65, or the APW assembler, or the WDC assembler. So it does seem like a very non-standard requirement.
On page 22 of WDC's assembler/linker documentation, where it talks about absolute addresses like the OP specified, it is stated:
If the address is greater than FFFFH, then it assumes that it is a long absolute address.
So, while I don't know about the other assemblers mentioned, it seems at least doubtful that the WDC assembler would silently truncate absolute addresses to 16-bits. ;-)
Thank you for your insight! Based on this I think I'll close this issue. I don't understand everything that is going on here so for me to fix an issue related to this, if there is an issue, should be something I can undestand. :)
It seems the OP wrote
sta.w LZ4_Dst+1
when LZ4_Dst was a 24-bit address/value? Maybe here we should have used
sta.l LZ4_Dst+1
instead?
I have to add that I'm no 65816 programmer and 65816 is the platform I have least idea of what is going on there. I've added 65816 support to WLA DX with the help of people who actually know about 65816 programming. So all help is appreciated. :)
It seems the OP wrote
sta.w LZ4_Dst+1
when LZ4_Dst was a 24-bit address/value? Maybe here we should have used
sta.l LZ4_Dst+1
instead?
Yes, this might work if a) the implied/expected bank number ($02 in this case) stays the same after the code is copied to and run from RAM (which in turn might indeed be the case in a LoROM game), and b) the code isn't timing-sensitive (24-bit addressing takes more cycles than 16-bit addressing).
I have to add that I'm no 65816 programmer and 65816 is the platform I have least idea of what is going on there. I've added 65816 support to WLA DX with the help of people who actually know about 65816 programming.
Wow, I didn't know that. Regardless, I think WLA's 65816 assembler is as good as it gets! There simply is no alternative. And best of all, it's actively being developed! :D
I have to add that I'm no 65816 programmer and 65816 is the platform I have least idea of what is going on there. I've added 65816 support to WLA DX with the help of people who actually know about 65816 programming.
Wow, I didn't know that. Regardless, I think WLA's 65816 assembler is as good as it gets! There simply is no alternative. And best of all, it's actively being developed! :D
LOL, thank you for this. :) I try to do my best, but I do need help once in a while from you guys. :)