rgbds Lists/arrays

MOVES EQU [$24, $3a]

    db LENGTH(MOVES)
    db MOVES[0], MOVES[1]

Is there any chance for something like this to happen at all? Is it even a good idea?

Should be made to work for strings too.

POKEMON_NAMES EQU ["BULBASAUR", "IVYSAUR"]

    db "So you want ", POKEMON_NAMES[STARTER1], "?@"

May 05 '15 14:05 Sanqui

An alternative with post-0.4.2 {interpolation} is to define many symbols with the "array index" as a suffix.

list_equ: MACRO
x EQUS "\1"
i = 0
SHIFT
REPT _NARG
; define `list_equs` too with EQUS here
{x}#{d:i} EQU \1 ; could have used '_' but '#' is valid in names
i = i + 1
SHIFT
ENDR
LENGTH_{x} EQU i
PURGE x
ENDM

	list_equ MOVES, $24, $3a

	db LENGTH_MOVES
	db MOVES#0, MOVES#1

	list_equs POKEMON_NAMES, "BULBASAUR", "IVYSAUR"

	db "So you want {POKEMON_NAMES#{d:STARTER1}}?@"

pokered's macros/scripts/maps.asm works somewhat like this, e.g. def_warps followed by many warps, and then def_warps_to iterates over all the defined warps. (Those macros will be simplified with the next rgbds release.)

def_warps: MACRO
REDEF _NUM_WARPS EQUS "_NUM_WARPS_\@"
	db _NUM_WARPS
_NUM_WARPS = 0
ENDM

warp: MACRO
	db \2, \1, \3, \4
REDEF _WARP_TO_NUM_{d:{_NUM_WARPS}} EQUS "warp_to \1, \2, _WARP_TO_WIDTH"
_NUM_WARPS = _NUM_WARPS + 1
ENDM

def_warps_to: MACRO
_WARP_TO_WIDTH = \1_WIDTH
	FOR N, _NUM_WARPS
		_WARP_TO_NUM_{d:N}
	ENDR
ENDM


	def_warps
	warp 14,  0, 5, LAST_MAP
	warp 14,  2, 1, SS_ANNE_1F

	def_warps_to VERMILION_DOCK

Jan 02 '21 07:01 Rangi42

Here are some macros for working with lists (using features of post-0.5.0 rgbasm master): numeric lists https://pastebin.com/Ucngn8Pt and string lists https://pastebin.com/WMrJdSKk

	list MOVES, $24, $3a
	db LENGTH_MOVES
	db MOVES#1, MOVES#2

	slist POKEMON_NAMES, "BULBASAUR", "IVYSAUR"
	db "So you want {POKEMON_NAMES#{d:STARTER1}}?@"

	slist MONS
	slist_item "Squirtle"
	slist_item "Bulbasaur"
	slist_item "Charmander"
	slist_sort MONS
	slist_println MONS ; ["Bulbasaur", "Charmander", "Squirtle"]
	println LENGTH_MONS ; $3

	slist_copy STARTERS, MONS
	slist_append STARTERS, "Pikachu", "Eevee"
	slist_println STARTERS ; ["Bulbasaur", "Charmander", "Squirtle", "Pikachu", "Eevee"]
	slist_replace STARTERS, "Pikachu", "Raichu"
	println "{STARTERS#4}" ; Raichu

	slist GEN2MONS, "Chikorita", "Cyndaquil", "Totodile"
	slist_delete STARTERS, 4
	slist_remove STARTERS, "Eevee"
	slist_extend STARTERS, GEN2MONS
	slist_purge GEN2MONS
	slist_println STARTERS ; ["Bulbasaur", "Charmander", "Squirtle", "Chikorita", "Cyndaquil", "Totodile"]

	slist_purge MONS
	assert !DEF(LENGTH_MONS)
	slist MONS, "Rattata", "Pidgey", "Pidgey", "Spearow", "Pidgey", "Spearow"
	slist_find N, MONS, "Pidgey"
	println N ; $2
	slist_rfind N, MONS, "Pidgey"
	println N ; $5
	slist_count N, MONS, "Pidgey"
	println N ; $3
	slist_remove_all MONS, "Pidgey"
	slist_set MONS, 2, "Fearow"
	slist_insert MONS, 3, "Raticate"
	slist_reverse MONS
	slist_println MONS ; ["Spearow", "Raticate", "Fearow", "Rattata"]

Jan 02 '21 19:01 Rangi42

That actually demonstrates a possible use case for REDEF EQU versus using SET: it would prevent the user from directly changing a constant, while letting the appropriate macros conveniently do it.

__len = LENGTH_\1
PURGE LENGTH_\1
LENGTH_\1 EQU __len + 1
PURGE __len

; vs

REDEF LENGTH_\1 EQU LENGTH_\1 + 1

Jan 02 '21 22:01 Rangi42

Bump @Sanqui, what do you think of Rangi's suggested alternatives?

Apr 20 '21 13:04 ISSOtm

Well, it's sort of hacky, would have been nice to have some syntactical sugar for this whole thing, but it would work for me, I think. I can't help but wonder if dictionaries would be possible with this approach too, to possibly enable loading entire JSON-like structures.

Apr 22 '21 14:04 Sanqui

Struct support has been requested (#98) and is currently shimmed by rgbds-structs, but native support is planned because rgbds-structs is a big pile of spaghetti hacks.

Apr 22 '21 14:04 ISSOtm

We were discussing native arrays in #rgbds. They would be useful as a return value from a hypothetical READBIN function to read the bytes of a file (as opposed to a READFILE that reads the contents as a string, since the functions dealing with strings expect UTF-8 and would terminate on $00 bytes).
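To illustrate why a string return value would be lossy for binary files, here is a small Python sketch. It is only a model: READBIN and READFILE are hypothetical, and the helper names below (`as_c_string`, `as_byte_array`) are made up for this example. It assumes string handling stops at the first $00 byte, as described above.

```python
# Illustrative bytes from an imaginary binary file: "Hi", a $00 byte,
# a byte that is not valid UTF-8 ($FF), and "!".
raw = bytes([0x48, 0x69, 0x00, 0xFF, 0x21])

def as_c_string(data):
    """Model a NUL-terminated string: everything from the first $00 on is lost."""
    return data.split(b"\x00", 1)[0]

def as_byte_array(data):
    """Model a READBIN-style result: every byte survives as a number."""
    return list(data)

def is_valid_utf8(data):
    """Report whether the bytes could be treated as a UTF-8 string at all."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

In this model the string view keeps only `b"Hi"`, while the array view keeps all five bytes, including the $00 and the $FF that is not valid UTF-8.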

Some possible syntaxes for array literals:

Brackets: [1, 2, 3] would be concise and familiar, but might be grammatically ambiguous (I'm not sure, since unlike strings, arrays wouldn't be usable as relocexprs)
Function: ARRAY(1, 2, 3) would be easy to implement and not introduce new syntax or punctuation, but is verbose
Sigil: #[1, 2, 3] would be unambiguous and concise, but looks weird

db, dw, and dl should work with arrays just like with strings, applying to each element of the array.

Most string functions would usefully have array counterparts, though I don't know if they should start with "ARRAY" or just "ARR" (I think "ARRAY" is more readable):

ARRAYLEN(arr): Returns the length of arr.
ARRAYVAL(arr, i): Returns the ith value in arr (1-indexed for consistency with strings; this would support negative indexes too like STRSUB). (Other names: ARRAYITEM, ARRAYNUM, ARRAYELEM, ARRAYAT?)
ARRAYCAT(arrs...): Concatenates arrs.
ARRAYIN(arr, val): Returns the first position of val in arr, or zero if it's not present.
ARRAYRIN(arr, val): Returns the last position of val in arr, or zero if it's not present.
ARRAYSUB(arr, pos, len): Returns a sub-array of arr, like arr[pos:pos+len] in Python.
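
The semantics listed above can be pinned down with a rough Python model. None of these functions exist in rgbasm. The sketch assumes 1-based positions, negative positions counting from the end (so -1 is the last element), and an error for position 0 or any out-of-range position. Those error rules are my assumption and are not part of the proposal.

```python
# Hypothetical Python model of the proposed array functions.
# Positions are 1-based; negative positions count from the end.

def _to_offset(arr, pos):
    """Convert a 1-based (or negative) position to a 0-based offset."""
    if pos == 0 or abs(pos) > len(arr):
        raise IndexError(f"position {pos} is out of range")
    return pos - 1 if pos > 0 else len(arr) + pos

def arraylen(arr):
    return len(arr)

def arrayval(arr, i):
    return arr[_to_offset(arr, i)]

def arraycat(*arrs):
    return [x for arr in arrs for x in arr]

def arrayin(arr, val):
    """First 1-based position of val, or 0 if it is not present."""
    return arr.index(val) + 1 if val in arr else 0

def arrayrin(arr, val):
    """Last 1-based position of val, or 0 if it is not present."""
    for pos in range(len(arr), 0, -1):
        if arr[pos - 1] == val:
            return pos
    return 0

def arraysub(arr, pos, length):
    """Sub-array of up to `length` elements starting at position `pos`."""
    start = _to_offset(arr, pos)
    return arr[start:start + length]
```

Under these assumptions, ARRAYSUB(arr, pos, len) with a positive pos corresponds to the Python slice arr[pos-1:pos-1+len], and returning zero from ARRAYIN/ARRAYRIN is unambiguous only because zero is never a valid position.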

Macros like in list.asm could take care of more advanced array manipulation, like counting a value, removing/replacing the first/last/all of a value, sorting, reversing, etc. If any of them are found to be particularly useful, they can always be added in a later release. (Even the ARRAYIN/ARRAYRIN functions could be omitted, since for loops are sufficient and they might be rarely used.)

A question: how to assign an identifier to an array? DEF arr EQUA [1, 2, 3]?

There's a notable difference between arrays and strings. If you have DEF s EQUS "hello", you can't do STRLEN(s) because of string expansion; you have to do STRLEN("{s}"). This involves more typing but prevents you at the grammar level from saying STRLEN(x) for any identifier x which could be a number, label, undefined, etc. On the other hand, if you've defined arr as an array (somehow), I'm not sure how it should behave:

Should ARRAYLEN(arr) just work? Then would we rely on a runtime error/abort if you do ARRAYLEN(x) for some numeric/string/etc x?
Or should "array equates" act like string equates and expand during lexing? (I'd rather they not, we don't need to add more lexer-time special behavior.)
Maybe arrays should have a separate namespace from other identifiers? They could all start with #, not just literals. So DEF #arr2 EQUA ARRAYCAT(#arr1, #[4, 5, 6]) would be grammatical and DEF arr EQUA #[1, 2, 3] would not. But that could look bad with #s everywhere.

This proposal also doesn't address arrays of strings, which could be at least as useful. Example: have an array of all monster names, and in texts discussing MON_FOO, concatenate the value of the MON_FOOth entry of MON_NAMES. And some table of monster names would just be for i, ARRAYLEN(MON_NAMES) / db ARRAYVAL(MON_NAMES, i+1) / endr.

Nov 11 '21 22:11 Rangi42

A few unsorted comments on your comment:

The only meaningful thing you can do with an array of strings is emit those strings, and that's already doable with EQUS expansion. There's no immediate need to support them.
I wouldn't be opposed to arrays having a namespace of their own. If they don't have one, then yes, they should act as their own data type and be accepted directly as arguments to functions that take array expressions. (What is an array expression, anyway? That would be an interesting question to answer.)
If arrays and strings are separate, it would be nice to be able to convert between them. This only requires three functions: STRCHARS(str) that converts a valid string into an array of UTF-8 codepoints, ARRAYSTR(arr) that does the opposite, and STRENCODE(str) that converts a string through the charmap into the raw data it would output. (This last function has no inverse, as charmap conversions aren't in general reversible.) Note the difference between STRCHARS (which is invertible and has a known encoding, making it ideal for metaprogramming) and STRENCODE (non-invertible and using the target's encoding, making it ideal for data generation).
1-based indexing for arrays is evil and it should never be even considered. Even if strings use it. There's no reason to make that mistake twice, and it's a meme in programming circles for a reason.
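
The difference between the three conversion functions proposed above can be sketched in Python. This is a model only: the `CHARMAP` table stands in for rgbasm's active charmap and its entries are made up for illustration. Two further assumptions are mine: the longest charmap match wins, and unmapped characters fall through as their UTF-8 bytes.

```python
# Hypothetical model of STRCHARS, ARRAYSTR and STRENCODE.
# CHARMAP is an illustrative stand-in for the active charmap.
CHARMAP = {"<PK>": 0xE1, "A": 0x80, "a": 0x80}

def strchars(s):
    """String -> array of code points (invertible, encoding is known)."""
    return [ord(ch) for ch in s]

def arraystr(arr):
    """Array of code points -> string; chr() rejects values above 0x10FFFF."""
    return "".join(chr(v) for v in arr)

def strencode(s, charmap=CHARMAP):
    """String -> the values the charmap would emit (not invertible)."""
    keys = sorted(charmap, key=len, reverse=True)  # try longest entries first
    out = []
    i = 0
    while i < len(s):
        for key in keys:
            if s.startswith(key, i):
                out.append(charmap[key])
                i += len(key)
                break
        else:
            # Assumption: an unmapped character is emitted as its UTF-8 bytes.
            out.extend(s[i].encode("utf-8"))
            i += 1
    return out
```

With this table, `strchars("<PK>")` gives four code points that `arraystr` turns back into the same string, while `strencode("<PK>")` gives the single value $E1, and `strencode("A")` equals `strencode("a")`, which is why the encoding step has no inverse.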

Nov 12 '21 02:11 aaaaaa123456789

An array expression is either an array literal or a built-in function call that returns an array. Just like how in parser.y a string is a T_STRING or a call to T_OP_STRSUB, T_OP_STRCAT, etc.
Those sound like reasonable functions, though I'd call them STRARRAY (so STRARRAY("<PK>") => ARRAY($3C, $50, $4B, $3E)), ARRAYSTR (so ARRAYSTR(ARRAY($41, $42, $43)) => "ABC"), and CHARARRAY (so CHARARRAY("<PK>") => ARRAY($E1) since charmap "<PK>", $e1). (The current STR* functions take strings, and the CHAR* functions deal with charmap values.) Although I'm not certain all or any of those are necessary, at least not in an initial release with basic MVP arrays, assuming arrays are added at all. STRCHARS/STRARRAY can be accomplished with a FOR loop and STRSUB, STRENCODE/CHARARRAY with a FOR loop and CHARSUB, and ARRAYSTR sounds suspicious since array values might not all be valid Unicode code points.
I wish rgbasm had zero-indexing from the beginning, but it doesn't, and I would much rather have STRSUB etc act consistently with ARRAYSUB etc. It's not without precedent: plenty of languages use 1-indexing, including many math-oriented ones (Fortran, Matlab, Mathematica, R, Julia).

Nov 12 '21 03:11 Rangi42

It's true that plenty of languages use 1-based indexing. It's also true that those languages are nearly universally hated by programmers for doing so, and the only reason they do it is because they are math- or science-oriented and scientists and mathematicians tend to count from 1. String indexing is a relatively rare operation, while array indexing is the only reason you'd ever use arrays in the first place, so getting it right for arrays is a lot more important, to the point it probably trumps the need for consistency.

Nov 12 '21 03:11 aaaaaa123456789

ARRAYCAT is potentially redundant, if we allow the ARRAY "constructor" to automatically flatten arrays. So you have DEF a1 EQUA ARRAY(1,2,3), then DEF a2 EQUA ARRAY(a1,4,5,6,ARRAY(7,8,9),10), and then a2 is ARRAY(1,2,3,4,5,6,7,8,9,10).
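
A quick Python model of that flattening behavior, to make the idea concrete. The `array` function is a hypothetical stand-in for the proposed ARRAY constructor, and flattening nested arrays at any depth is my assumption.

```python
def array(*items):
    """Hypothetical flattening constructor: array arguments are spliced in."""
    out = []
    for item in items:
        if isinstance(item, list):
            out.extend(array(*item))  # recurse so nested arrays flatten fully
        else:
            out.append(item)
    return out

a1 = array(1, 2, 3)
a2 = array(a1, 4, 5, 6, array(7, 8, 9), 10)
```

Here `a2` ends up as the flat ten-element array, and concatenating two arrays is just `array(x, y)`. The cost of this design is that arrays of arrays cannot be represented.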

Nov 12 '21 03:11 Rangi42

A less serious but not entirely joking suggestion: once we have user-defined functions, we could add ARRAYMAP(arr, fn) to apply fn to each element of arr, and ARRAYFILTER(arr, fn) to select only the elements of arr for which fn returns nonzero/true. Or even ARRAYREDUCE(arr, fn, init=0) to apply a reducing function (e.g. if DEF plus(x, y) = x + y, then ARRAYREDUCE([1,2,3], plus) == 6, and ARRAYREDUCE([], plus, 42) == 42).
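
For reference, the intended behavior of those three functions can be modeled in a few lines of Python. This is a sketch only: rgbasm has neither user-defined functions nor these built-ins, and the lowercase names are stand-ins for the proposed ones.

```python
def arraymap(arr, fn):
    """Apply fn to each element of arr."""
    return [fn(x) for x in arr]

def arrayfilter(arr, fn):
    """Keep only the elements for which fn returns nonzero/true."""
    return [x for x in arr if fn(x)]

def arrayreduce(arr, fn, init=0):
    """Fold arr from the left, starting from init."""
    acc = init
    for x in arr:
        acc = fn(acc, x)
    return acc

def plus(x, y):
    return x + y
```

This matches the examples given: reducing [1, 2, 3] with plus yields 6, and reducing an empty array returns the initial value unchanged.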

Nov 12 '21 03:11 Rangi42

A less serious but not entirely joking suggestion: once we have user-defined functions, we could add ARRAYMAP(arr, fn) to apply fn to each element of arr, and ARRAYFILTER(arr, fn) to select only the elements of arr for which fn returns nonzero/true. Or even ARRAYREDUCE(arr, fn, start=0) to apply a reducing function (e.g. if DEF plus(x, y) = x + y, then ARRAYREDUCE([1,2,3], plus) == 6).

Definitely useful.

(Ed: I removed the email chains from these comments. --Rangi)

Nov 12 '21 03:11 aaaaaa123456789
