sjasmplus

Documentation does not describe rules of MACRO and DEFINE substitution

Open ped7g opened this issue 5 years ago • 5 comments

Version: v1.10.4
Platform: all
Topic: macro

There's almost zero description of how this process is done, which features are available and what results should be expected in various edge-cases.

There's the feature of _ working as a substitution delimiter, documented by macro_test.asm.

There are some examples in documentation about macros and their expansion.

And there's reasonably detailed/formal documentation about labels.

Question/task: provide a better formal description of how the macro and define system should work.

If there already are Z80 assemblers with similar functionality, see if their definition is better, and if sjasmplus can converge to common behaviour of various assemblers.

And then fill the gaps where nobody formalized remaining edge-cases.

ped7g, Mar 09 '19 19:03

My current ideas about possible rules:

  • macro-arguments/defines would try to substitute from the longest terms to shortest
  • macro arguments (inside macro) are substituted first (before defines)
  • defines are substituted before instruction/labels evaluation, including arrays (like in current version)
  • finally instructions/labels are evaluated and assembled

Both macro and define substitutions work similarly to the C preprocessor, i.e. they treat the source line as a string and replace whole words/sub-words when a match is found, substituting yet another string (no evaluation of expressions).
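The ordering described above can be sketched in a few lines of Python (not sjasmplus code). This is only an illustration under stated assumptions: the function name `substitute_line` and the dictionary-based tables are hypothetical, only whole words are matched, and strings/comments in the source line are not protected.

```python
import re

# Identifier-like words; everything else on the line is left untouched.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def substitute_line(line, macro_args, defines):
    """Textual substitution: macro arguments first, then defines.

    Hypothetical helper for illustration; both tables map name -> replacement
    string. Nothing is evaluated, the result is just another string.
    """
    def one_pass(text, table):
        return WORD.sub(lambda m: table.get(m.group(0), m.group(0)), text)

    return one_pass(one_pass(line, macro_args), defines)

# The macro argument `count` wins over the define of the same name,
# and "1+2" stays unevaluated text.
print(substitute_line("ld a,count+SIZE",
                      {"count": "1+2"},
                      {"SIZE": "4", "count": "99"}))
```

Because the macro-argument pass runs first, a define with the same name as a macro argument never gets a chance to replace it inside the macro body.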

Arrays then break this scheme, as they need to evaluate the index expression to resolve their substitution. I'm not sure how to formalize this; probably: macro-args -> defines -> try arrays -> evaluate the array index if needed and possible -> if an array was replaced successfully, start the whole loop again with macro-argument substitution -> if no more array substitution happened, do the final assembling of the resulting string.
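The loop proposed above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the helper names, the `name[index]` notation used by the regular expression, the restriction to plain decimal indices (a real implementation would evaluate an expression), and the `max_rounds` guard against endless substitution.

```python
import re

WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Assumed notation for an array reference: name[index], no nested brackets.
ARRAY_REF = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\[([^\[\]]*)\]")

def replace_words(text, table):
    """Replace whole words found in `table` (name -> replacement string)."""
    return WORD.sub(lambda m: table.get(m.group(0), m.group(0)), text)

def replace_arrays(text, arrays):
    """Replace name[index] when the index is already a plain decimal number."""
    def repl(m):
        name, index = m.group(1), m.group(2).strip()
        if name in arrays and index.isdecimal() and int(index) < len(arrays[name]):
            return arrays[name][int(index)]
        return m.group(0)  # unknown array or unresolved index: leave as-is
    return ARRAY_REF.sub(repl, text)

def expand_line(line, macro_args, defines, arrays, max_rounds=16):
    """macro-args -> defines -> arrays; restart while an array was replaced."""
    for _ in range(max_rounds):
        line = replace_words(line, macro_args)
        line = replace_words(line, defines)
        replaced = replace_arrays(line, arrays)
        if replaced == line:
            return line  # settled: this string would go to the assembler
        line = replaced
    raise RuntimeError("substitution did not settle")

# The array element "PAIR" is itself a define, so a second round is needed.
print(expand_line("ld regs[idx],SIZE",
                  {"idx": "1"},
                  {"SIZE": "4", "PAIR": "hl"},
                  {"regs": ["bc", "PAIR"]}))
```

The restart is what lets an array element that is itself a define or macro-argument name be resolved in the following round.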

Sub-word matching: any string containing an underscore is matched both as a whole string and as every possible combination of substrings on the underscore boundaries, i.e. result_text_string is matched against the current macro-arguments and defines as { result_text_string, result_text, text_string, result, text, string } (plus their variants with a starting/ending underscore where possible).
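The candidate set above can be generated mechanically. A minimal Python sketch follows, assuming a word without leading, trailing or doubled underscores; the function name `sub_words` is hypothetical, "longest" is taken to mean "most underscore-separated parts", and the variants with a starting/ending underscore are left out for brevity.

```python
def sub_words(word):
    """List all sub-words on underscore boundaries, longest runs first."""
    parts = word.split("_")
    n = len(parts)
    # Every contiguous run of parts, as (start, end) index pairs.
    spans = [(i, j) for i in range(n) for j in range(i + 1, n + 1)]
    # More parts first; among equal lengths, leftmost first.
    spans.sort(key=lambda s: (-(s[1] - s[0]), s[0]))
    return ["_".join(parts[i:j]) for i, j in spans]

print(sub_words("result_text_string"))
# ['result_text_string', 'result_text', 'text_string', 'result', 'text', 'string']
```

Trying the candidates in this order is one way to get the "longest terms to shortest" priority from the proposed rules.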

Macro-arguments and defines whose names start with an underscore are restricted to substituting only at the beginning of a word. So in the example above, with the define _VERSION, only the sub-words { result_text_string, result_text, result } would be matched against such a macro-arg/define (and none of those starts with an underscore = no match). This is to prevent unexpected substitutions in source with things like ld hl,MY_VERSION, where the built-in sjasmplus define _VERSION would otherwise be ready to interfere.
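A small Python sketch of this restriction, working on character positions so that the underscore variants are covered. The function name `can_substitute` and the exact boundary test are assumptions made for illustration, not the sjasmplus implementation.

```python
def can_substitute(word, name):
    """Return True if define/macro-arg `name` may replace a sub-word of `word`."""
    if not name:
        return False
    pos = word.find(name)
    while pos >= 0:
        end = pos + len(name)
        # The match has to sit on underscore boundaries (or the word edges).
        left_ok = pos == 0 or word[pos - 1] == "_" or word[pos] == "_"
        right_ok = end == len(word) or word[end] == "_" or word[end - 1] == "_"
        # Proposed rule: names starting with "_" match only at the word start.
        blocked = name.startswith("_") and pos != 0
        if left_ok and right_ok and not blocked:
            return True
        pos = word.find(name, pos + 1)
    return False

print(can_substitute("MY_VERSION", "_VERSION"))  # False: not at the word start
print(can_substitute("_VERSION", "_VERSION"))    # True
print(can_substitute("MY_VERSION", "VERSION"))   # True: ordinary sub-word
```

With this check, ld hl,MY_VERSION is left alone by a define named _VERSION, while a define named VERSION would still match the sub-word.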

As of v1.12.0, the sub-word matching is more or less done this way, but the priorities are not, so the priority of substitution depends on the particular source line and on the order in which the macros/defines were defined. This makes the assembler unpredictable (one can't tell precisely how a particular line will assemble until the whole source + command line options are considered).

(This is an RFC comment, to see whether these proposed rules are reasonable or what problems they would introduce.)

ped7g, Apr 08 '19 09:04

The more nearly it resembles cpp the easier it will be for most of us to use and you to support and document. K&R got it right.

If you prefer an ASM-centric model, HiSoft’s Devpac QL (like metacomco’s ASM) got it right; Z80 Devpac, GENS4, was weaker.

SimonGoodwin, May 08 '19 21:05

Simon: thank you for the input, but I have a difficult time applying it. CPP: but cpp has no text substitution like this? It has only the standard C preprocessor, which is clearly done ahead of compile time, i.e. #define zzz just_string, while sjasmplus has a hybrid ongoing "preprocessor" taking actual values during different passes, i.e. define zzz macroArgument will evaluate differently depending on what the argument value was when the macro was emitted. It's probably closer to templates and their specialized implementations, but I can't see at this moment which part to pick to improve the current sjasmplus or the proposal above; the two seem too different to carry things over.

I unfortunately don't know how those other tools worked. The current version of sjasmplus (and already the v1.07 builds) has a fairly unique ability to also substitute sub-words under certain conditions, which in combination with macros is quite a powerful feature, actively used by some developers. So I'm trying to preserve it in a mostly backward-compatible way; my recent versions already operate a bit differently, but the old sources using this that I had available fit into the modified scheme well.

The proposal in the second comment is still mostly backward compatible, just making things more formalized and more predictable from the programmer's point of view.

ped7g, May 09 '19 05:05

Could you please comment about the current state of the priorities in which substitutions are made? Have there been any changes since 1.12.0? Or would the current best practice be to simply avoid situations where the order of substitution would matter?

specke, Aug 29 '20 18:08

@specke no code change since 1.12.0 ... so it's deterministic only if you know the implementation itself, which is not desired. But that's the current state.

ped7g, Aug 29 '20 19:08