rgbds

Add constants to sym file as well

Open ISSOtm opened this issue 5 years ago • 3 comments

What makes the SYM file useful is that it's not easy to know where a given label is located; however, the values of constants are not always obvious either, as in the following example.

SECTION "Gfx", ROMX
Gfx:
    INCBIN "gfx.2bpp"

GFX_SIZE equ @ - Gfx ; What value do I have?
    PRINTT "{GFX_SIZE}\n" ; This can help, but it's easy to lose it in a busy `make` invocation

For compatibility with existing implementations, maybe output them in comments, at the end of the file, with a special comment preceding them?

Note that this ties to #483.

Oct 14 '20 14:10 ISSOtm

If something like this gets added, I think constants would be more useful in definition order, unlike labels which get ordered by bank:address value.

Many constants will probably be small values like 1 or 2, and it's not beneficial to keep all of those together. On the other hand, constants may be defined in a sequence (ID lists, struct field offsets, jumptable indexes, etc) and those are easier to read all together. (Ideally they would all have a shared prefix to easily sort them together after the fact, but I've seen shared suffixes, or just nothing.)

This format would keep all names aligned:

00:4000 Gfx
;; 0320 GFX_SIZE
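
As a rough illustration of that layout, here is a minimal Python sketch (not rgblink code). The function name `render_symfile` and the tuple shapes are hypothetical, and it assumes constants fit in 16 bits so that `;; ` plus four hex digits is exactly as wide as a `bank:address` pair.

```python
def render_symfile(labels, constants):
    """Render a sym file as text.

    labels:    iterable of (name, bank, address) tuples (hypothetical shape)
    constants: iterable of (name, value) tuples, already in definition order
    """
    lines = []
    # Labels are sorted by bank:address, as in existing sym files.
    for name, bank, address in sorted(labels, key=lambda item: (item[1], item[2])):
        lines.append(f"{bank:02x}:{address:04x} {name}")
    # Constants keep their definition order and are emitted as comments.
    # ";; " is as wide as the "00:" bank prefix, so names stay aligned
    # as long as the value fits in four hex digits (an assumption here).
    for name, value in constants:
        lines.append(f";; {value:04x} {name}")
    return "\n".join(lines) + "\n"


print(render_symfile([("Gfx", 0, 0x4000)], [("GFX_SIZE", 0x320)]), end="")
```

Values wider than 16 bits would break the alignment in this sketch, so a real format would have to pick a wider fixed column or give up on alignment.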

Oct 14 '20 14:10 Rangi42

Interestingly, we do have a good example of this kind of thing gone wrong. When the Pokémon leaks started, a file called CRYSTAL_BY_NUM.SYM was shared, which was a symfile for some version of the Japanese game sorted in ascending order of values. However, the official toolchain treated exported constants as symbols with a bank number of zero, and thus that file contains many constants in addition to actual labels.

Out of 9,383 symbols contained in that file:

  • 74 (0.79%) have a value of zero.
  • 711 (7.58%) have a value between 1 and 9.
  • 1,985 (21.16%) have a value between 10 and 49.

In other words, nearly 30% of the file (2,770 of 9,383 symbols) is unsorted noise. I don't expect most projects to do much better, since most constants are just small values that will easily become unsorted noise if the symfile doesn't preserve their declaration order. Not to mention that no tool could make any use of these values, since a tool can't possibly know which one (if any) of the many constants valued at 4 was used for a particular ld a, 4 instruction once it has been assembled.
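
The percentages can be re-derived from the counts quoted above. This short Python check uses only those counts; the bucket labels and the `share` helper are just names chosen for the sketch.

```python
TOTAL = 9383  # symbols in the file, as quoted above

# Counts quoted above, keyed by value range.
buckets = {"zero": 74, "1 to 9": 711, "10 to 49": 1985}


def share(count, total=TOTAL):
    """Percentage of the file, rounded to two decimals."""
    return round(100 * count / total, 2)


small = sum(buckets.values())  # every symbol with a value below 50

for label, count in buckets.items():
    print(f"{label}: {count} ({share(count):.2f}%)")
print(f"below 50: {small} ({share(small):.2f}%)")
```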

If you want to export constants, I'd recommend using a separate file for this purpose, perhaps containing some additional metadata that would make it more useful (i.e., easier to search or to sort).

Oct 14 '20 21:10 aaaaaa123456789

Proposal: let rgbasm take a -n constants.txt argument, to which it will output every EQU, SET, or RSSET constant in the order they were last defined. (So FOO = 1, then BAR EQU 2, then FOO = 3 would output BAR before FOO.)

The format could be similar to rgblink's -n symbols.sym file: value in hex, space, name. Values are 32-bit so they could all be zero-padded to 8 digits, but most constants will probably be in the 8- or 16-bit range so maybe don't pad them at all. Or, pad them to an even number of digits. (If this were a file format spec then padding, like order, would be irrelevant, but I'm thinking about what might be most convenient to read.)

Example:

; File generated by rgbasm
8000 VRAM_Begin
a000 VRAM_End
00 SRAM_DISABLE
0a SRAM_ENABLE
00 DECOATTR_TYPE
01 DECOATTR_NAME
...
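
To make the ordering and padding rules concrete, here is a minimal Python sketch of the proposal, not actual rgbasm behavior. The name `format_constants` is hypothetical, it takes the "pad to an even number of digits" option, and it assumes negative values are written as 32-bit two's complement.

```python
def format_constants(definitions):
    """Return output lines for (name, value) pairs given in assembly order.

    A redefinition moves the name to the end, so constants come out in the
    order they were last defined.
    """
    latest = {}
    for name, value in definitions:
        latest.pop(name, None)  # forget the earlier position, if any
        latest[name] = value & 0xFFFFFFFF  # assumption: 32-bit two's complement
    lines = []
    for name, value in latest.items():
        digits = f"{value:x}"
        if len(digits) % 2:  # pad to an even number of hex digits
            digits = "0" + digits
        lines.append(f"{digits} {name}")
    return lines


# FOO = 1, then BAR EQU 2, then FOO = 3: BAR is output before FOO.
print("\n".join(format_constants([("FOO", 1), ("BAR", 2), ("FOO", 3)])))
```

With this padding rule, 0 becomes `00`, 10 becomes `0a`, and $8000 stays `8000`, matching the example above.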

Mar 02 '22 23:03 Rangi42