[Feature Request] Literals (inline section fragments)

Open popeyeotaku opened this issue 4 years ago • 17 comments

This is gonna be the weirdest feature request, and I don't mind if you don't think it worth adding. But I've dabbled with some old Mainframe assemblers from back in the 70s, and one of them had an interesting feature they called "Literals".

Basically, you could put code in square brackets, and it would cause that code to be assembled in a block at the end of your code (you could also relocate this block to other places, and I think you could redefine it several times thru your code), and the square brackets would be replaced with the address it ended up getting assembled to.

So you could type things like:

    ld hl, [.db "HELLO WORLD"]
    call Print

and the string would get assembled to some place, and the bracketed expression in the ld hl would be replaced with its address. You could also use it to call a short block of code that returns, things like that.
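
For comparison, here is a sketch of the hand-written code such a literal might expand to. It assumes the assembler simply places the data after the surrounding code. The names PrintHello and .text are made up for illustration, and Print is the routine from the example above.

```asm
PrintHello:           ; hypothetical routine name
    ld hl, .text      ; the bracketed literal would be replaced by this address
    call Print
    ret
.text:                ; hypothetical label; with a literal, this would be anonymous
    db "HELLO WORLD"  ; assembled out of line, after the code
```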

Like I said, while useful, this is not exactly a standard feature, and I don't mind if you don't think it worth adding.

popeyeotaku avatar Apr 05 '20 12:04 popeyeotaku

The real issue is, where to put it? "Some place" isn't too descriptive. Perhaps at the end of the current section?

That being said, it seems like an interesting way to do jump tables:

  ld hl, .table - 2
.loop
  inc hl
  inc hl
  cp [hl]
  inc hl
  jr nc, .loop
  ld a, [hli]
  ld h, [hl]
  ld l, a
  jp hl

.table
  dbw 50, [
    ld hl, [db "Try harder!", 0]
    jp ShowMessage
  ]
  dbw 120, [
    ld hl, [db "Good work.", 0]
    call ShowMessage
    ld hl, SND_CHIME
    jp PlaySound
  ]
  dbw 255, [
    ld hl, [db "Excellent work!", 0]
    call ShowMessage
    ld hl, SND_FANFARE
    jp PlaySound
  ]

aaaaaa123456789 avatar Apr 05 '20 13:04 aaaaaa123456789

It can be an explicitly marked constant pool, similar to ARM assemblers.

LIJI32 avatar Apr 05 '20 13:04 LIJI32

The one I saw defaulted to the end of the assembly after your code, but there was a command to relocate it to wherever else you wanted

popeyeotaku avatar Apr 05 '20 13:04 popeyeotaku

Note that square brackets collide with memory accesses. In fact, this would mean that parsing is no longer context-free:

    ld hl, [niladic_macro_or_a_label_who_knows]

I'd suggest using curly braces instead.

    dbw 120, {
      ld hl, {db "Good work.", 0}
      call ShowMessage
      ld hl, SND_CHIME
      jp PlaySound
    }

Note that this kind of delimiter is used in quite a big family of languages already ;)

meithecatte avatar Apr 05 '20 13:04 meithecatte

Unresolved question: how would label scoping work here?

meithecatte avatar Apr 05 '20 13:04 meithecatte

> Unresolved question: how would label scoping work here?

I assume locals before a global would be local to the braces and globals would work as always.

aaaaaa123456789 avatar Apr 05 '20 13:04 aaaaaa123456789

I saw this feature in the rare MIDAS assembler for the PDP-10. You'd think it would be hard to get working or to find docs for, but there is actually an easy self-installing emulation package, along with extensive documentation for MIDAS and its OS, here: ITS on GitHub

popeyeotaku avatar Apr 05 '20 13:04 popeyeotaku

Quick comment: parens, brackets and braces cannot work for this. Parens are used for expressions, brackets by memory accesses, and braces by symbol expansions (including outside of strings). So something else would be needed.

Other than that, I'm not really sure about how useful this is; my main concern with this is code mixed in the middle of data, which I'm afraid would hurt readability in cases more trivial than in the OP.

Implementation doesn't sound too complicated, but I'm seeing major pain points, such as specifying the location at which the code should be stored, or the fact that it looks a lot like LOAD, which is already very fragile. From my point of view, this doesn't add anything significant (prove me wrong though) so I'm not sure if it's worth the additional complexity.

ISSOtm avatar Apr 05 '20 15:04 ISSOtm

After further discussion, I think this might be worth it though in fairly minor ways (compared to current solutions), so I won't close this, but this is low priority as far as I'm concerned. Anyone else willing to take a stab at it is welcome, however.

ISSOtm avatar Apr 05 '20 19:04 ISSOtm

> It can be an explicitly marked constant pool, similar to ARM assemblers.

The proposition here is quite different from ARM assemblers. The literal pools there are typically used for numeric immediates that cannot be represented in 8 bits (combined with some number of rotates, as per the spec). They are also used for syntax like LDR R0, =label, where label is an address label, not a storage directive as the OP desires. A literal pool would then be set up which contains the address of label.

tl;dr I don't think this is a good idea, not least for the combining code/data argument. Also, I can't imagine a use case where such a syntax would be appropriate.

craben20 avatar Apr 18 '20 17:04 craben20

> Quick comment: parens, brackets and braces cannot work for this. Parens are used for expressions, brackets by memory accesses, and braces by symbol expansions (including outside of strings). So something else would be needed.

In this case, double brackets would work perfectly. This might be complicated to parse, though, since the lexer would have to know to parse [[ as a single token instead of reading it as two [ symbols. But this is a long-term feature anyway...

aaaaaa123456789 avatar Apr 18 '20 20:04 aaaaaa123456789

If #244 were implemented, allowing two sections to be in the same bank even if they're floating, this could make use of it. The parser could allow reloc_16bit or reloc_16bit_no_str to be inline_code, which would parse as '[[' lines ']]'. The [[ would push a new anonymous section in the same bank as the current one (also backing up the current nListCountEmpty and nPCOffset values); the ]] would pop the section, restore nListCountEmpty and nPCOffset, and evaluate the whole inline code block to that section's address (with some new rpn_AddrSection function like rpn_BankSection).

Rangi42 avatar Jan 23 '21 23:01 Rangi42

The problem of implementing #244 is that there's no way to adapt the bin packing algorithm to work with the additional constraints.

ISSOtm avatar Jan 23 '21 23:01 ISSOtm

As you said in Discord, section fragments would actually be fine here. Without this "literal" feature, people would manually put code blocks in the same section anyway, so it's fine if each [[ block ]] just becomes a FRAGMENT of its current SECTION and they all stay contiguous. It will, however, have to wait for #712 to merge, since the "literal" fragments won't have any alignment constraints but the actual section might. (If a [[ literal ]] is created, it would update the current section to have the fragment modifier.)
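
As a rough sketch of that idea (not the actual implementation), the code below shows how the first jump table entry could be written by hand with the existing SECTION FRAGMENT syntax. The section name "Messages" and the labels are made up for illustration, and ShowMessage is the routine from the earlier example. Fragments sharing a name are concatenated into one section, which is what would keep the literal in the same bank as the code that refers to it.

```asm
SECTION FRAGMENT "Messages", ROMX   ; hypothetical section name
TryHarderEntry:                     ; hypothetical label
    ld hl, TryHarderText            ; stands in for ld hl, [[db "Try harder!", 0]]
    jp ShowMessage

SECTION FRAGMENT "Messages", ROMX   ; what the [[ ... ]] block would become
TryHarderText:                      ; hypothetical label; a literal would be anonymous
    db "Try harder!", 0
```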

Rangi42 avatar Jan 24 '21 01:01 Rangi42

> Note that square brackets collide with memory accesses.

I don't think this is the case. Building #716 with T_LBRACK/T_RBRACK instead of T_2LBRACK/T_2RBRACK does not produce any shift/reduce or reduce/reduce conflicts, and editing the test cases to use single brackets works as expected.

This is because that PR only allows inline fragments in n16 values, not just any numeric values. So call [fragment] and jp z, [fragment] and dw [fragment] are all valid, and ld a, [hl] is valid, but there are no instructions like "ld hl, [n16]" to cause ambiguity.

Should single brackets be used instead?

Note that single brackets would make these equivalent:

	ld a, [[db 42]]

	ld a, [.liff]
.liff:	db 42

Rangi42 avatar Feb 12 '21 02:02 Rangi42

The MIDAS assembler's feature like this was called "constants".

From http://www.bitsavers.org/pdf/mit/rle_pdp1/memos/PDP-1_MIDAS.pdf:

The constant word and surrounding parens are treated as a single syllable whose value is the address of a register containing the constant word. Constants may be used in constants. The following two program fragments are equivalent:

add (add (20)-lio-(30
...
constants
add a
...
a, add b-lio-c
b, 20
c, 30

Rangi42 avatar Feb 14 '21 22:02 Rangi42

Note that besides the PDP-10's MIDAS assembler, ASMotor itself has had these since 2019:

Speaking of strings, code and data literals can be used to reduce clutter and improve readability. To load the register a0 with the address of a string, you might do

lea { DC.B "This is a string",0 },a0

or to produce the address of a chunk of code

jsr {
	moveq #0,d0
	rts
}

(It doesn't allow {interpolation} outside of strings, so curly braces were available for that.)

Rangi42 avatar Dec 23 '23 00:12 Rangi42