wla65816 doesn't accept a 16bit absolute properly

Open oziphantom opened this issue 5 years ago • 27 comments

Say you write

LDA myLabel,x

You will be told that the address of myLabel doesn't fit into 8 bits, and that you have to write

LDA.w myLabel,x

instead. This also happens with STA, CMP, LDX, ADC, etc.

An absolute operand can take an 8-bit or a 16-bit address, and the assembler should detect automatically which version is needed. Some forms can also take a 24-bit address for an absolute reference.
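To make the request concrete, here is a minimal C sketch of the width selection being asked for. It is not WLA DX code: `smallest_operand_width` is a hypothetical helper, and it assumes the label's address is already resolved and that the direct page sits at $0000.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of WLA DX: returns the smallest operand
   width in bytes (1, 2 or 3) that can hold an already resolved address.
   Assumes the direct page register is 0. */
int smallest_operand_width(uint32_t address)
{
    assert(address <= 0xFFFFFFu);  /* 65816 addresses are at most 24 bits */
    if (address <= 0xFFu)
        return 1;                  /* direct page form, e.g. LDA $12,x */
    if (address <= 0xFFFFu)
        return 2;                  /* absolute form, e.g. LDA $1234,x */
    return 3;                      /* long form, e.g. LDA $7E1234,x */
}
```

The catch, as the rest of the thread discusses, is that the address has to be known at the moment the opcode is chosen.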

oziphantom avatar Feb 05 '20 12:02 oziphantom

This is an architectural flaw in WLA DX: the size of a label's address is known only to the linker, and only after it has placed the labels into ROM/RAM. It's a long-standing issue. To fix this we'd need to merge the linker and the assembler... which will not happen any time soon. :P

vhelin avatar Feb 05 '20 13:02 vhelin

so basically the assembler always inserts DP opcodes unless you tell it otherwise?

oziphantom avatar Feb 05 '20 13:02 oziphantom

If DP means opcodes that use 8-bit arguments, then the answer is yes. If I remember correctly, you can set the default behavior to 16-bit with the .16BIT directive.

vhelin avatar Feb 05 '20 13:02 vhelin

Also related: #282; This may be somewhat solvable with better heuristics that can determine the instances where 8/16-bit operands are certain?

Kroc avatar Feb 05 '20 15:02 Kroc

I think one way "to make it better" is to make 16-bit the default. Then, if the linker detects that the address is <$100, it emits a warning: "this could be more optimal with 8bit addressing mode". This way the 95% of my code that deals with $0100-$FFFF is faster and easier to type, and when I forget the .b for Zero Page (6502) or Direct Page (65816), the assembler notifies me that it could be better with a .b. If I have a .w, the warning is not emitted, which lets me silence it when I want to keep the operand at 16 bits for self-modifying code or for timing reasons. Then on the 65816, if it detects that the labels' banks don't match and I don't have a "data page" or "program bank" set to match, it emits either an error or a warning, "address does not fit in 16bits, banks don't match", telling me to use the .l form.
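The proposed policy can be sketched in C roughly as follows. This is only an illustration of the idea, not anything WLA DX implements: the `Diagnostic` enum, `check_16bit_operand`, and the `expected_bank` parameter are all hypothetical names, and the direct page is assumed to sit at $0000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical diagnostics for an operand that was assembled as 16-bit. */
typedef enum {
    DIAG_NONE,
    DIAG_WARN_COULD_BE_8BIT,   /* "this could be more optimal with 8bit addressing mode" */
    DIAG_ERROR_BANK_MISMATCH   /* "address does not fit in 16bits, banks don't match" */
} Diagnostic;

/* address:       resolved 24-bit address of the label
   explicit_w:    true if the user wrote .w, which silences the 8-bit hint
   expected_bank: the bank the 16-bit operand will be combined with */
Diagnostic check_16bit_operand(uint32_t address, bool explicit_w,
                               uint8_t expected_bank)
{
    assert(address <= 0xFFFFFFu);

    /* Hint about the shorter form first, unless the user asked for 16 bits. */
    if (address < 0x100u && !explicit_w)
        return DIAG_WARN_COULD_BE_8BIT;

    /* A 16-bit operand cannot reach a label in a different bank. */
    if ((uint8_t)(address >> 16) != expected_bank)
        return DIAG_ERROR_BANK_MISMATCH;

    return DIAG_NONE;
}
```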

oziphantom avatar Feb 05 '20 16:02 oziphantom

I think in a way "to make it better" is make 16 bit the default. Then in the linker if it detects that the

I could turn the current v9.11 branch into v10.0 and make .16BIT the default mode. Changing the major version to signal that something bigger has changed...

address is <$100, emit a warning "this could be more optimal with 8bit addressing mode". So this

This is much easier and faster said than done. We'd need to let the linker know about all the places where an 8-bit value could have been used instead of a 16-bit value. Quite a big task: lots of opcode lists to go through, changes to the assembler, changes to the file formats, and new code in the linker as well... But a good idea!

way the 95% of my code that deals with $0100-$FFFF is faster and easier to type. And when I forget

As a quick fix, just put ".16BIT" at the beginning of your source code file. :)

to .b for Zero Page ( 6502) and Direct Page (65816) the assembler notifies me that it could be better with a .b. If I have a .w then the warning is not emitted, allowing me to silence it if I wanted to keep it 16 bits, due to self mod or for timing reasons. Then in 65816 if it detects the labels banks don't match and I don't have a "data page" or "program bank" set to match, it emits either an error or warning "address does not fit in 16bits, banks don't match" telling me to use .l form.

Good ideas!

vhelin avatar Feb 06 '20 20:02 vhelin

So the linker must already be doing something along those lines anyway. Otherwise, how does it know that for lda (MyLabel),y MyLabel needs to be 8 bits, and throw an error? If it doesn't do any address look-ups at assembly time, then the linker must already parse a list of instructions to check for exceptions?? Also branches: it would have to do custom assembly logic to work out the branch value to a label as well, right? You might find this very helpful on your quest: http://nparker.llx.com/a2/opcodes.html The instructions have a "formula", so you can easily work out the addressing mode by looking at the opcode byte. This lets you cut down a lot of possibilities. Basically anything that is abs or abs,x has a ZP form. abs,y is the rare case in that only 2 instructions have a zp form and then it only has a ZP form.

oziphantom avatar Feb 07 '20 07:02 oziphantom

So the linker must already be doing something along those lines anyway. Otherwise, how does it know that for lda (MyLabel),y MyLabel needs to be 8 bits, and throw an error? If it doesn't do any address look-ups at assembly time, then the linker must already parse a list of instructions to check for exceptions??

The assembler parses the instructions and emits the opcodes which are just plain bytes for the linker. The linker only gets pure bytes and then things like postponed 8-bit/16-bit/24-bit calculations, and references to labels.

Also branches, it would have to do custom assembly logic to work out the branch value to a label as well right? You might find this very helpful upon your quest http://nparker.llx.com/a2/opcodes.html the instructions have a "formula" that you can easily work out the addressing mode from looking at the opcode byte. This lets you cut down a lot of possibilities. Basically anything that is abs abs,x has a ZP form. abs,y is the rare case in that only 2 instructions have a zp form and then it only has a ZP form.

Thanks, that's a good info page, but I think the assembler already knows about different addressing modes... ?

vhelin avatar Feb 07 '20 07:02 vhelin

It's for the linker, so it can work out the addressing mode and the opcode type of the byte it has. For example, you can easily determine that the opcode is a branch, etc., rather than needing a large table mapping "opcodes" to rules. It can just look up the addressing mode from the byte before the label position, see if it is abs or abs,x, and then do an "is this <$100" check.
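As an illustration of that "formula", here is a minimal C sketch for the original 6502 instruction group whose opcodes end in the bits 01 (ORA, AND, EOR, ADC, STA, LDA, CMP, SBC), where bits 2-4 select the addressing mode. The enum and function names are hypothetical, and the sketch does not cover the other opcode groups or the extra 65816 modes.

```c
#include <assert.h>
#include <stdint.h>

/* Addressing modes of the 6502 "cc = 01" group, in the order of the
   3-bit mode field (bits 2-4 of the opcode). */
typedef enum {
    MODE_ZP_X_INDIRECT,  /* 000: (zp,X) */
    MODE_ZP,             /* 001: zp     */
    MODE_IMMEDIATE,      /* 010: #imm   */
    MODE_ABS,            /* 011: abs    */
    MODE_ZP_INDIRECT_Y,  /* 100: (zp),Y */
    MODE_ZP_X,           /* 101: zp,X   */
    MODE_ABS_Y,          /* 110: abs,Y  */
    MODE_ABS_X           /* 111: abs,X  */
} AddrMode;

/* Extracts the addressing mode from an opcode of the cc = 01 group. */
AddrMode decode_group1_mode(uint8_t opcode)
{
    assert((opcode & 0x03u) == 0x01u);  /* only valid for this group */
    return (AddrMode)((opcode >> 2) & 0x07u);
}

/* True if the opcode uses abs or abs,X, i.e. a shorter zp form exists
   and an "is this <$100" check on the operand would make sense. */
int has_narrower_zp_form(uint8_t opcode)
{
    AddrMode mode = decode_group1_mode(opcode);
    return mode == MODE_ABS || mode == MODE_ABS_X;
}
```

With the ADC opcodes that appear later in this thread, 0x6D decodes to abs, 0x7D to abs,X and 0x79 to abs,Y, and only the first two have a shorter form in this group.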

oziphantom avatar Feb 07 '20 08:02 oziphantom

So you mean that the linker should do assembling? Anyway, if you have the solution to our problem, please feel free to create a pull request that fixes it. :)

vhelin avatar Feb 07 '20 14:02 vhelin

No, the linker doesn't assemble. I was thinking it would know that it needs to put a label in the stream at X. It could then inspect the opcode before it and check its addressing mode to see if it was an abs or abs,x, and then check whether the label fits within 8 bits and error/warn.

But I guess the assembler tells the linker "Put value from label of size X here", since the assembler doesn't know the address of any label, even when the entire asm file is given to it in one shot (all asm code is in the one file). I guess the simpler way is adding the concept of "put 16bit address here but it can be 8bits", so the assembler knows that the opcode has a ZP/DP mode. Then the linker sees that it has a 16-bit address that can be 8 bits and errors/warns if it fits into 8.

oziphantom avatar Feb 08 '20 04:02 oziphantom

No the linker doesn't assemble. I was thinking it would know that it needs to put a Label in stream at X. To which it could inspect the opcode before it and check its address mode to see if it was an abs or abs,x to which it could then check to see if the label fits within 8bits and error/warn.

But I guess the assembler tells the linker "Put value from label of size X here", since the assembler doesn't know the address of any label, even when the entire asm file is given to it in 1 shot (all asm

That's a special case. Not everybody will code like that. Some people split code into multiple files and use sections that move in the address space that the linker relocates and gives addresses to the labels inside the section...

code is in the 1 file ). I guess the simpler way is adding the concept of "put 16bit address here but it can be 8bits" so the assembler knows that the opcode has a ZP/DP mode. Then the linker sees that it has a 16 bit address that can be 8 bits and errors/warns if it can fit into 8.

And that's the bigger job I was talking about earlier. Do you have the spare time to do that? :) I cannot spend all my free time on coding WLA DX, unfortunately...

vhelin avatar Feb 08 '20 14:02 vhelin

I wasn't saying that because I code like this, this will work. I was using it as evidence for my hypothesis, which I was asking you to confirm... but it doesn't matter.

Where in the code does the 65816 assembler output the data that tells the linker to insert resolved label X here? And where does the linker do the insertion?

oziphantom avatar Feb 09 '20 04:02 oziphantom

Where in the code does the 65816 assembler output the data to tell the linker insert resolved label X here? and where does the linker do the insertion?

decode_65816.c turns the code into bytes and puts references and pending calculations into the internal data format, which is later written into an object file at pass_4.c:1480.

wlalink/write.c:fix_references() writes the resolved label references.
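For readers unfamiliar with this split, the following C sketch shows the general shape of such a patching step: the assembler leaves placeholder bytes and a reference record, and the linker later writes the resolved address over them. It is not the actual fix_references() code; the `Reference` struct and `patch_reference` are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference record: "put the address of a label, using
   `width` bytes, at `offset` in the output". */
typedef struct {
    size_t   offset;   /* where the operand bytes start */
    int      width;    /* operand size in bytes: 1, 2 or 3 */
    uint32_t address;  /* address of the label after placement */
} Reference;

/* Writes the resolved address over the placeholder bytes in little-endian
   order, as the 65xx family expects. Returns 0 on success, or -1 if the
   operand would run past the buffer or the address does not fit. */
int patch_reference(uint8_t *rom, size_t rom_size, const Reference *ref)
{
    assert(ref->width >= 1 && ref->width <= 3);

    if (ref->offset + (size_t)ref->width > rom_size)
        return -1;
    if ((ref->address >> (8 * ref->width)) != 0)
        return -1;  /* e.g. $1234 into an 8-bit operand */

    for (int i = 0; i < ref->width; i++)
        rom[ref->offset + (size_t)i] = (uint8_t)(ref->address >> (8 * i));
    return 0;
}
```

Note that by this point the opcode byte in front of the operand is already fixed, which is why the width has to be decided before the address is known.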

vhelin avatar Feb 09 '20 14:02 vhelin

it would be great if this would follow the WDC standard

lda <dp ; Direct page
lda |addr ; Absolute
lda >addr ; Long

dwsJason avatar Feb 28 '20 18:02 dwsJason

it would be great if this would follow the WDC standard

lda <dp ; Direct page
lda |addr ; Absolute
lda >addr ; Long

I think these can be added as alternatives to opcodes_65816.c. Currently we have this there:

  { "ADC #x", 0x69, 4, 0 },
  { "ADC (x)", 0x72, 0xA, 0 },
  { "ADC [x]", 0x67, 0xA, 0 },
  { "ADC (x,X)", 0x61, 0xA, 0 },
  { "ADC (x),Y", 0x71, 0xA, 0 },
  { "ADC (x,S),Y", 0x73, 0xA, 0 },
  { "ADC [x],Y", 0x77, 0xA, 0 },
  { "ADC x", 0x65, 0xA, 2 },
  { "ADC ?", 0x6D, 2, 1 },
  { "ADC &", 0x6F, 3, 0 },
  { "ADC x,X", 0x75, 0xA, 2 },
  { "ADC ?,X", 0x7D, 2, 1 },
  { "ADC &,X", 0x7F, 3, 0 },
  { "ADC ?,Y", 0x79, 2, 0 },
  { "ADC x,S", 0x63, 0xA, 0 },

'x' is 8-bit, '?' is 16-bit and '&' is a 24-bit value/reference. Would

{ "ADC #<x", 0x69, 4, 0 },

be the same as the current

{ "ADC #x", 0x69, 4, 0 },

? What about this?

{ "ADC (x)", 0x72, 0xA, 0 },

Is it

{ "ADC (<x)", 0x72, 0xA, 0 },

using the WDC standard?

If someone could give me examples of all the ADC opcodes, I think I can then do the rest in the same fashion...
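To illustrate how the placeholder characters in the table above map to operand sizes, here is a small C sketch. `operand_bytes` is a hypothetical helper, not a function from WLA DX, and it only looks at the pattern string. It ignores the numeric fields of each table entry, so it does not model cases such as a 16-bit immediate.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the operand size in bytes implied by the placeholder in an
   opcode pattern such as "ADC ?,X": 'x' = 1, '?' = 2, '&' = 3, and
   0 if the pattern has no placeholder. The check is case sensitive, so
   the index register in "ADC x,X" is not mistaken for a placeholder. */
int operand_bytes(const char *pattern)
{
    assert(pattern != NULL);

    const char *p = pattern;
    while (*p != '\0' && *p != ' ')
        p++;                        /* skip the mnemonic */

    for (; *p != '\0'; p++) {
        if (*p == 'x') return 1;    /* 8-bit value/reference */
        if (*p == '?') return 2;    /* 16-bit value/reference */
        if (*p == '&') return 3;    /* 24-bit value/reference */
    }
    return 0;
}
```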

vhelin avatar Mar 01 '20 15:03 vhelin

Or does the WDC standard only touch these:

  { "ADC x", 0x65, 0xA, 2 },
  { "ADC ?", 0x6D, 2, 1 },
  { "ADC &", 0x6F, 3, 0 },
  { "ADC x,X", 0x75, 0xA, 2 },
  { "ADC ?,X", 0x7D, 2, 1 },
  { "ADC &,X", 0x7F, 3, 0 },
  { "ADC ?,Y", 0x79, 2, 0 },
  { "ADC x,S", 0x63, 0xA, 0 },

Having them in this format:

  { "ADC <x", 0x65, 0xA, 2 },
  { "ADC |?", 0x6D, 2, 1 },
  { "ADC >&", 0x6F, 3, 0 },
  { "ADC <x,X", 0x75, 0xA, 2 },
  { "ADC |?,X", 0x7D, 2, 1 },
  { "ADC >&,X", 0x7F, 3, 0 },
  { "ADC |?,Y", 0x79, 2, 0 },
  { "ADC <x,S", 0x63, 0xA, 0 },

... and keeping these as they are:

  { "ADC #x", 0x69, 4, 0 },
  { "ADC (x)", 0x72, 0xA, 0 },
  { "ADC [x]", 0x67, 0xA, 0 },
  { "ADC (x,X)", 0x61, 0xA, 0 },
  { "ADC (x),Y", 0x71, 0xA, 0 },
  { "ADC (x,S),Y", 0x73, 0xA, 0 },
  { "ADC [x],Y", 0x77, 0xA, 0 },

?

vhelin avatar Mar 01 '20 15:03 vhelin

Basically what he is asking is:

everywhere you have a .b format, use opcode <Thing (this is not the same as #<)
everywhere you have a .w format, use opcode |Thing
everywhere you have a .l format, use opcode >Thing (this is not the same as #>)

oziphantom avatar Mar 01 '20 15:03 oziphantom

Basically what he is asking is:

everywhere you have a .b format, use opcode <Thing (this is not the same as #<)
everywhere you have a .w format, use opcode |Thing
everywhere you have a .l format, use opcode >Thing (this is not the same as #>)

Ok, so this would mean that

"ADC #<x" (opcode 0x69)

would be an alias for

"ADC #x" (opcode 0x69)

among other things.

  { "ADC #<x", 0x69, 4, 0 },
  { "ADC (<x)", 0x72, 0xA, 0 },
  { "ADC [<x]", 0x67, 0xA, 0 },
  { "ADC (<x,X)", 0x61, 0xA, 0 },
  { "ADC (<x),Y", 0x71, 0xA, 0 },
  { "ADC (<x,S),Y", 0x73, 0xA, 0 },
  { "ADC [<x],Y", 0x77, 0xA, 0 },
  { "ADC <x", 0x65, 0xA, 2 },
  { "ADC |?", 0x6D, 2, 1 },
  { "ADC >&", 0x6F, 3, 0 },
  { "ADC <x,X", 0x75, 0xA, 2 },
  { "ADC |?,X", 0x7D, 2, 1 },
  { "ADC >&,X", 0x7F, 3, 0 },
  { "ADC |?,Y", 0x79, 2, 0 },
  { "ADC <x,S", 0x63, 0xA, 0 },

Would those be the ADCs using the WDC standard?

vhelin avatar Mar 01 '20 16:03 vhelin

Ok, a question: If WLA-65816 spots this:

ADC >10

how does it know it's using WDC standard ("ADC (24-bit 10)") and doesn't mean "ADC (get high byte of 10)"?

vhelin avatar Mar 01 '20 17:03 vhelin

Ok, a question: If WLA-65816 spots this:

ADC >10

how does it know it's using WDC standard ("ADC (24-bit 10)") and doesn't mean "ADC (get high byte of 10)"?

I don't think there's any way to find the correct answer to that question, but I can always add a directive like .WDC which will make the parser use WDC standard and then the user can deal with low and high byte getters ('<' and '>')... Taking that route would not break backwards compatibility.
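A mode switch like that could look roughly like the following C sketch. It is only an illustration under assumptions: `classify_prefix`, the `OperandPrefix` enum and the `wdc_mode` flag are hypothetical names, and treating '|' as an ordinary operand outside WDC mode is a guess rather than documented WLA DX behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical meanings of a leading operand character. */
typedef enum {
    OP_PLAIN,       /* no special prefix */
    OP_LOW_BYTE,    /* '<' as "get low byte" */
    OP_HIGH_BYTE,   /* '>' as "get high byte" */
    OP_FORCE_DP,    /* WDC '<': direct page operand */
    OP_FORCE_ABS,   /* WDC '|': absolute operand */
    OP_FORCE_LONG   /* WDC '>': long operand */
} OperandPrefix;

/* Interprets the first character of an operand. The same text, e.g. ">10",
   means different things depending on whether WDC parsing was enabled. */
OperandPrefix classify_prefix(const char *operand, bool wdc_mode)
{
    assert(operand != NULL);

    switch (operand[0]) {
    case '<': return wdc_mode ? OP_FORCE_DP   : OP_LOW_BYTE;
    case '>': return wdc_mode ? OP_FORCE_LONG : OP_HIGH_BYTE;
    case '|': return wdc_mode ? OP_FORCE_ABS  : OP_PLAIN;  /* assumption */
    default:  return OP_PLAIN;
    }
}
```

Because the mode is an explicit flag, existing sources that never enable it keep their current meaning, which is the backwards compatibility point made above.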

vhelin avatar Mar 01 '20 17:03 vhelin

Ok, use .WDC to change the parser to parse WDC standard assembly. I hope it works 100%.

vhelin avatar Mar 01 '20 20:03 vhelin

BTW, does the same WDC standard syntax apply to 6502 and 65C02?

vhelin avatar Mar 01 '20 22:03 vhelin

#<x is not <x. That is, < modifies an address param to get the low byte of it, while #<x takes the low byte of the value of x. So:

lda <x = A5 xx
lda #<x = A9 xx
adc <x = 65 xx
adc #<x = 69 xx
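The difference can be shown with a tiny C sketch that uses only the four opcode bytes listed above. `emit_with_low_byte` is a hypothetical helper, and the sketch assumes an 8-bit accumulator so that the immediate forms take a single operand byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode bytes as listed in the comment above. */
enum {
    LDA_DP  = 0xA5,  /* lda <x  */
    LDA_IMM = 0xA9,  /* lda #<x */
    ADC_DP  = 0x65,  /* adc <x  */
    ADC_IMM = 0x69   /* adc #<x */
};

/* Writes the opcode followed by the low byte of `value`. Both spellings
   produce the same operand byte; what differs is the opcode, i.e. whether
   that byte is used as a direct page address or as the data itself. */
void emit_with_low_byte(uint8_t opcode, uint16_t value, uint8_t out[2])
{
    assert(out != NULL);
    out[0] = opcode;
    out[1] = (uint8_t)(value & 0xFFu);
}
```

For x = $1234, `lda <x` would come out as A5 34 (read from direct page address $34) and `lda #<x` as A9 34 (load the constant $34).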

Its conflicting with < > in params is a problem. It's a relic of programming on machines that don't have as many keys as we do today, like the digraphs and trigraphs of C: what happens if you are on an IBM 360 mainframe and can't type #? You use ??=

6502 would always be MOS standard, but there is no reason you couldn't use WDC standard. 65C02 is, err, complicated:

R65C02 is probably MOS standard
CSG65C02 is MOS standard
W65C02 is WDC standard
65CE02 is MOS standard

But again, if one wanted to, one could.

In other news: I think I've come up with a solid way to get what my original post is about. When you inject a label, you put in "filler" bytes. Those filler bytes are just a waste, so I pack extra info into them:

if it is CD then don't do anything
if it is 08 then check to see if the label can be only 8 bits and throw a warning
if it is 16 then check to see if the label can be only 16 bits and throw a warning

But threading it through all the passes and code passes, I'm not sure if it works yet... I need to get back to testing it some more.
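A minimal C sketch of that filler-byte idea follows. It is a guess at the shape of the check, not the actual experiment: the names are hypothetical, and the marker values are assumed to be the hex bytes $CD, $08 and $16 (the comment does not say whether "16" is hex or decimal).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed marker values for the placeholder byte left by the assembler. */
enum {
    FILL_NO_CHECK    = 0xCD,  /* plain padding: patch without any check */
    FILL_CHECK_8BIT  = 0x08,  /* warn if the label would fit in 8 bits  */
    FILL_CHECK_16BIT = 0x16   /* warn if the label would fit in 16 bits */
};

typedef enum { HINT_NONE, HINT_FITS_8BIT, HINT_FITS_16BIT } WidthHint;

/* Called by the linker just before it overwrites the placeholder with the
   resolved address: reads the marker and decides whether to warn. */
WidthHint inspect_filler(uint8_t filler, uint32_t address)
{
    assert(address <= 0xFFFFFFu);

    switch (filler) {
    case FILL_CHECK_8BIT:
        return address <= 0xFFu ? HINT_FITS_8BIT : HINT_NONE;
    case FILL_CHECK_16BIT:
        return address <= 0xFFFFu ? HINT_FITS_16BIT : HINT_NONE;
    default:
        return HINT_NONE;  /* $CD or anything else: no check */
    }
}
```

The appeal of the approach is that the object file format does not need a new field, since the hint travels in bytes that get overwritten anyway.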

oziphantom avatar Mar 02 '20 05:03 oziphantom

#<x is not <x

That's a true statement, but do you mean WDC standard doesn't have '<' in "ADC #x"? So should we have the opcodes listed as follows?

  { "ADC #x", 0x69, 4, 0 },
  { "ADC (<x)", 0x72, 0xA, 0 },
  { "ADC [<x]", 0x67, 0xA, 0 },
  { "ADC (<x,X)", 0x61, 0xA, 0 },
  { "ADC (<x),Y", 0x71, 0xA, 0 },
  { "ADC (<x,S),Y", 0x73, 0xA, 0 },
  { "ADC [<x],Y", 0x77, 0xA, 0 },
  { "ADC <x", 0x65, 0xA, 2 },
  { "ADC |?", 0x6D, 2, 1 },
  { "ADC >&", 0x6F, 3, 0 },
  { "ADC <x,X", 0x75, 0xA, 2 },
  { "ADC |?,X", 0x7D, 2, 1 },
  { "ADC >&,X", 0x7F, 3, 0 },
  { "ADC |?,Y", 0x79, 2, 0 },
  { "ADC <x,S", 0x63, 0xA, 0 },

In other news: I think I've come up with a solid way to get what my original post is about when you inject a label, you put in "filler" bytes, those filler bytes are just a waste, so I pack extra info into them. if it is CD then don't do anything. if it is 08 then check to see if the label can be only 8 bits and throw a warning if it is 16 then check to see if the label can be only 16 bits and throw a warning but threading it through all the passes and code passes, I'm not sure if it works yet.. need to get back to testing it some more.

I hope you can get it to work! How do you differentiate those filler bytes from user given data?

vhelin avatar Mar 02 '20 07:03 vhelin

#<x is not <x

That's a true statement, but do you mean WDC standard doesn't have '<' in "ADC #x"? So we should have the opcodes listed as follows: ?

  { "ADC #x", 0x69, 4, 0 },
  { "ADC (<x)", 0x72, 0xA, 0 },
  { "ADC [<x]", 0x67, 0xA, 0 },
  { "ADC (<x,X)", 0x61, 0xA, 0 },
  { "ADC (<x),Y", 0x71, 0xA, 0 },
  { "ADC (<x,S),Y", 0x73, 0xA, 0 },
  { "ADC [<x],Y", 0x77, 0xA, 0 },
  { "ADC <x", 0x65, 0xA, 2 },
  { "ADC |?", 0x6D, 2, 1 },
  { "ADC >&", 0x6F, 3, 0 },
  { "ADC <x,X", 0x75, 0xA, 2 },
  { "ADC |?,X", 0x7D, 2, 1 },
  { "ADC >&,X", 0x7F, 3, 0 },
  { "ADC |?,Y", 0x79, 2, 0 },
  { "ADC <x,S", 0x63, 0xA, 0 },

Looks good.

In other news: I think I've come up with a solid way to get what my original post is about when you inject a label, you put in "filler" bytes, those filler bytes are just a waste, so I pack extra info into them. if it is CD then don't do anything. if it is 08 then check to see if the label can be only 8 bits and throw a warning if it is 16 then check to see if the label can be only 16 bits and throw a warning but threading it through all the passes and code passes, I'm not sure if it works yet.. need to get back to testing it some more.

I hope you can get it to work! How do you differentiate those filler bytes from user given data?

The same way the system knows now. If the linker is patching in a label, then it is "padding" data, right? If there is user data in the bytes that the linker is to replace with an address, then the system is broken? This is just not keeping the padding data as $CD, and when the linker goes to inject a label, it checks what the padding data is first.

oziphantom avatar Mar 02 '20 07:03 oziphantom

#<x is not <x

That's a true statement, but do you mean WDC standard doesn't have '<' in "ADC #x"? So we should have the opcodes listed as follows: ?

  { "ADC #x", 0x69, 4, 0 },
  { "ADC (<x)", 0x72, 0xA, 0 },
  { "ADC [<x]", 0x67, 0xA, 0 },
  { "ADC (<x,X)", 0x61, 0xA, 0 },
  { "ADC (<x),Y", 0x71, 0xA, 0 },
  { "ADC (<x,S),Y", 0x73, 0xA, 0 },
  { "ADC [<x],Y", 0x77, 0xA, 0 },
  { "ADC <x", 0x65, 0xA, 2 },
  { "ADC |?", 0x6D, 2, 1 },
  { "ADC >&", 0x6F, 3, 0 },
  { "ADC <x,X", 0x75, 0xA, 2 },
  { "ADC |?,X", 0x7D, 2, 1 },
  { "ADC >&,X", 0x7F, 3, 0 },
  { "ADC |?,Y", 0x79, 2, 0 },
  { "ADC <x,S", 0x63, 0xA, 0 },

Looks good.

Ok, I'll fix the opcode list when I get back home later this evening...

In other news: I think I've come up with a solid way to get what my original post is about when you inject a label, you put in "filler" bytes, those filler bytes are just a waste, so I pack extra info into them. if it is CD then don't do anything. if it is 08 then check to see if the label can be only 8 bits and throw a warning if it is 16 then check to see if the label can be only 16 bits and throw a warning but threading it through all the passes and code passes, I'm not sure if it works yet.. need to get back to testing it some more.

I hope you can get it to work! How do you differentiate those filler bytes from user given data?

The same way the system knows now. If the linker is patching in a label, then it is "padding" data, right? If there is user data in the bytes that the linker is to replace with an address, then the system is broken? This is just not keeping the padding data as $CD, and when the linker goes to inject a label, it checks what the padding data is first.

Ah, of course... :) I still haven't really woken up, Monday morning...

vhelin avatar Mar 02 '20 08:03 vhelin