rgbds
Lists/arrays
MOVES EQU [$24, $3a]
db LENGTH(MOVES)
db MOVES[0], MOVES[1]
Is there any chance for something like this to happen at all? Is it even a good idea?
Should be made to work for strings too.
POKEMON_NAMES EQU ["BULBASAUR", "IVYSAUR"]
db "So you want ", POKEMON_NAMES[STARTER1], "?@"
An alternative with post-0.4.2 {interpolation} is to define many symbols with the "array index" as a suffix.
list_equ: MACRO
x EQUS "\1"
i = 0
SHIFT
REPT _NARG
; define `list_equs` too with EQUS here
{x}#{d:i} EQU \1 ; could have used '_' but '#' is valid in names
i = i + 1
SHIFT
ENDR
LENGTH_{x} EQU i
PURGE x
ENDM
list_equ MOVES, $24, $3a
db LENGTH_MOVES
db MOVES#0, MOVES#1
list_equs POKEMON_NAMES, "BULBASAUR", "IVYSAUR"
db "So you want {POKEMON_NAMES#{d:STARTER1}}?@"
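For readers unfamiliar with the suffix trick, here is a rough Python model of what the `list_equ` macro above does to the symbol table. It is only a sketch: the `symbols` dict and the `list_equ` function are illustrative stand-ins, not anything rgbasm provides, and `STARTER1 = 1` is an assumed value.

```python
def list_equ(symbols, name, *values):
    """Model of the list_equ macro: define NAME#0, NAME#1, ... and LENGTH_NAME."""
    for i, value in enumerate(values):
        # Mirrors `{x}#{d:i} EQU \1`: the index becomes part of the symbol name.
        symbols[f"{name}#{i}"] = value
    # Mirrors `LENGTH_{x} EQU i` after the loop.
    symbols[f"LENGTH_{name}"] = len(values)


symbols = {}
list_equ(symbols, "MOVES", 0x24, 0x3A)
list_equ(symbols, "POKEMON_NAMES", "BULBASAUR", "IVYSAUR")

# Indexing is just building a symbol name, like {POKEMON_NAMES#{d:STARTER1}}.
STARTER1 = 1  # assumed value, for illustration only
text = "So you want " + symbols[f"POKEMON_NAMES#{STARTER1}"] + "?@"
```

The point is that there is no array value anywhere: the "array" exists only as a naming convention over ordinary symbols.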
pokered's macros/scripts/maps.asm works somewhat like this, e.g. def_warps followed by many warps, and then def_warps_to iterates over all the defined warps. (Those macros will be simplified with the next rgbds release.)
def_warps: MACRO
REDEF _NUM_WARPS EQUS "_NUM_WARPS_\@"
db _NUM_WARPS
_NUM_WARPS = 0
ENDM
warp: MACRO
db \2, \1, \3, \4
REDEF _WARP_TO_NUM_{d:{_NUM_WARPS}} EQUS "warp_to \1, \2, _WARP_TO_WIDTH"
_NUM_WARPS = _NUM_WARPS + 1
ENDM
def_warps_to: MACRO
_WARP_TO_WIDTH = \1_WIDTH
FOR N, _NUM_WARPS
_WARP_TO_NUM_{d:N}
ENDR
ENDM
def_warps
warp 14, 0, 5, LAST_MAP
warp 14, 2, 1, SS_ANNE_1F
def_warps_to VERMILION_DOCK
Here are some macros for working with lists (using features of post-0.5.0 rgbasm master): numeric lists https://pastebin.com/Ucngn8Pt and string lists https://pastebin.com/WMrJdSKk
list MOVES, $24, $3a
db LENGTH_MOVES
db MOVES#1, MOVES#2
slist POKEMON_NAMES, "BULBASAUR", "IVYSAUR"
db "So you want {POKEMON_NAMES#{d:STARTER1}}?@"
slist MONS
slist_item "Squirtle"
slist_item "Bulbasaur"
slist_item "Charmander"
slist_sort MONS
slist_println MONS ; ["Bulbasaur", "Charmander", "Squirtle"]
println LENGTH_MONS ; $3
slist_copy STARTERS, MONS
slist_append STARTERS, "Pikachu", "Eevee"
slist_println STARTERS ; ["Bulbasaur", "Charmander", "Squirtle", "Pikachu", "Eevee"]
slist_replace STARTERS, "Pikachu", "Raichu"
println "{STARTERS#4}" ; Raichu
slist GEN2MONS, "Chikorita", "Cyndaquil", "Totodile"
slist_delete STARTERS, 4
slist_remove STARTERS, "Eevee"
slist_extend STARTERS, GEN2MONS
slist_purge GEN2MONS
slist_println STARTERS ; ["Bulbasaur", "Charmander", "Squirtle", "Chikorita", "Cyndaquil", "Totodile"]
slist_purge MONS
assert !DEF(LENGTH_MONS)
slist MONS, "Rattata", "Pidgey", "Pidgey", "Spearow", "Pidgey", "Spearow"
slist_find N, MONS, "Pidgey"
println N ; $2
slist_rfind N, MONS, "Pidgey"
println N ; $5
slist_count N, MONS, "Pidgey"
println N ; $3
slist_remove_all MONS, "Pidgey"
slist_set MONS, 2, "Fearow"
slist_insert MONS, 3, "Raticate"
slist_reverse MONS
slist_println MONS ; ["Spearow", "Raticate", "Fearow", "Rattata"]
That actually demonstrates a possible use case for REDEF EQU versus using SET: it would prevent the user from directly changing a constant, while letting the appropriate macros conveniently do it.
__len = LENGTH_\1
PURGE LENGTH_\1
LENGTH_\1 EQU __len + 1
PURGE __len
; vs
REDEF LENGTH_\1 EQU LENGTH_\1 + 1
Bump @Sanqui, what do you think of Rangi's suggested alternatives?
Well, it's sort of hacky, would have been nice to have some syntactical sugar for this whole thing, but it would work for me, I think. I can't help but wonder if dictionaries would be possible with this approach too, to possibly enable loading entire JSON-like structures.
Struct support has been requested (#98), is shimmed, but has native support planned because rgbds-structs is a big pile of spaghetti hacks.
We were discussing native arrays in #rgbds. They would be useful as a return value from a hypothetical READBIN function to read the bytes of a file (as opposed to a READFILE reading the contents as a string, since the functions dealing with strings expect UTF-8 and would terminate on $00 bytes.)
Some possible syntaxes for array literals:
- Brackets: [1, 2, 3] would be concise and familiar, but might be grammatically ambiguous (I'm not sure, since unlike strings, arrays wouldn't be usable as relocexprs)
- Function: ARRAY(1, 2, 3) would be easy to implement and not introduce new syntax or punctuation, but is verbose
- Sigil: #[1, 2, 3] would be unambiguous and concise, but looks weird
db, dw, and dl should work with arrays just like with strings, applying to each element of the array.
Most string functions would usefully have array counterparts, though I don't know if they should start with "ARRAY" or just "ARR" (I think "ARRAY" is more readable):
- ARRAYLEN(arr): Returns the length of arr.
- ARRAYVAL(arr, i): Returns the ith value in arr (1-indexed for consistency with strings; this would support negative indexes too, like STRSUB). (Other names: ARRAYITEM, ARRAYNUM, ARRAYELEM, ARRAYAT?)
- ARRAYCAT(arrs...): Concatenates arrs.
- ARRAYIN(arr, val): Returns the first position of val in arr, or zero if it's not present.
- ARRAYRIN(arr, val): Returns the last position of val in arr, or zero if it's not present.
- ARRAYSUB(arr, pos, len): Returns a sub-array of arr, like arr[pos:pos+len] in Python.
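To make the proposed semantics concrete, here is a small Python model of these functions. None of them exist in rgbasm; the Python names are placeholders, and the handling of negative and out-of-range positions (negative counts from the end, zero is rejected) is an assumption about how "like STRSUB" would carry over.

```python
def _to_zero_based(arr, pos):
    """Map a 1-based position (negative counts from the end) to a Python index."""
    if pos == 0 or abs(pos) > len(arr):
        raise IndexError(f"position {pos} out of range for length {len(arr)}")
    return pos - 1 if pos > 0 else len(arr) + pos


def array_len(arr):
    return len(arr)


def array_val(arr, i):
    return arr[_to_zero_based(arr, i)]


def array_cat(*arrs):
    return [value for arr in arrs for value in arr]


def array_in(arr, val):
    # First 1-based position of val, or 0 if absent.
    return arr.index(val) + 1 if val in arr else 0


def array_rin(arr, val):
    # Last 1-based position of val, or 0 if absent.
    for pos in range(len(arr), 0, -1):
        if arr[pos - 1] == val:
            return pos
    return 0


def array_sub(arr, pos, length):
    start = _to_zero_based(arr, pos)
    return arr[start:start + length]
```

Returning zero for "not found" only works because positions are 1-based; with 0-based indexing, ARRAYIN and ARRAYRIN would need a different sentinel.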
Macros like in list.asm could take care of more advanced array manipulation, like counting a value, removing/replacing the first/last/all of a value, sorting, reversing, etc. If any of them are found to be particularly useful, they can always be added in a later release. (Even the ARRAYIN/ARRAYRIN functions could be omitted, since for loops are sufficient and they might be rarely used.)
A question: how to assign an identifier to an array? DEF arr EQUA [1, 2, 3]?
There's a notable difference between arrays and strings. If you have DEF s EQUS "hello", you can't do STRLEN(s) because of string expansion; you have to do STRLEN("{s}"). This involves more typing but prevents you at the grammar level from saying STRLEN(x) for any identifier x which could be a number, label, undefined, etc. On the other hand, if you've defined arr as an array (somehow), I'm not sure how it should behave:
- Should ARRAYLEN(arr) just work? Then would we rely on a runtime error/abort if you do ARRAYLEN(x) for some numeric/string/etc x?
- Or should "array equates" act like string equates and expand during lexing? (I'd rather they not, we don't need to add more lexer-time special behavior.)
- Maybe arrays should have a separate namespace from other identifiers? They could all start with #, not just literals. So DEF #arr2 EQUA ARRAYCAT(#arr1, #[4, 5, 6]) would be grammatical and DEF arr EQUA #[1, 2, 3] would not. But that could look bad with #s everywhere.
This proposal also doesn't address arrays of strings, which could be at least as useful. Example: have an array of all monster names, and in texts discussing MON_FOO, concatenate the value of the MON_FOOth entry of MON_NAMES. And some table of monster names would just be for i, ARRAYLEN(MON_NAMES) / db ARRAYVAL(MON_NAMES, i+1) / endr.
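A rough Python sketch of that use case, assuming a 1-based ARRAYVAL and plain ASCII output instead of a real charmap. `MON_IVYSAUR`, `array_val`, and `emit_name_table` are hypothetical names used only for illustration.

```python
MON_NAMES = ["BULBASAUR", "IVYSAUR"]
MON_IVYSAUR = 2  # hypothetical 1-based constant, standing in for MON_FOO


def array_val(arr, i):
    """1-based lookup, as in the proposed ARRAYVAL."""
    return arr[i - 1]


def emit_name_table(names):
    """Model of: for i, ARRAYLEN(MON_NAMES) / db ARRAYVAL(MON_NAMES, i+1) / endr"""
    out = bytearray()
    for i in range(len(names)):
        # ASCII encoding is an assumption; rgbasm would go through the charmap.
        out += array_val(names, i + 1).encode("ascii")
    return bytes(out)


table = emit_name_table(MON_NAMES)
text = "So you want " + array_val(MON_NAMES, MON_IVYSAUR) + "?@"
```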
A few unsorted comments on your comment:
- The only meaningful thing you can do with an array of strings is emit those strings, and that's already doable with EQUS expansion. There's no immediate need to support them.
- I wouldn't be opposed to arrays having a namespace of their own. If they don't have one, then yes, they should act as their own data type and be accepted directly as arguments to functions that take array expressions. (What is an array expression, anyway? That would be an interesting question to answer.)
- If arrays and strings are separate, it would be nice to be able to convert between them. This only requires three functions: STRCHARS(str) that converts a valid string into an array of UTF-8 codepoints, ARRAYSTR(arr) that does the opposite, and STRENCODE(str) that converts a string through the charmap into the raw data it would output. (This last function has no inverse, as charmap conversions aren't in general reversible.) Note the difference between STRCHARS (which is invertible and has a known encoding, making it ideal for metaprogramming) and STRENCODE (non-invertible and using the target's encoding, making it ideal for data generation).
- 1-based indexing for arrays is evil and it should never be even considered. Even if strings use it. There's no reason to make that mistake twice, and it's a meme in programming circles for a reason.
- An array expression is either an array literal or a built-in function call that returns an array. Just like how in parser.y a string is a T_STRING or a call to T_OP_STRSUB, T_OP_STRCAT, etc.
- Those sound like reasonable functions, though I'd call them STRARRAY (so STRARRAY("<PK>") => ARRAY($3C, $50, $4B, $3E)), ARRAYSTR (so ARRAYSTR(ARRAY($41, $42, $43)) => "ABC"), and CHARARRAY (so CHARARRAY("<PK>") => ARRAY($E1) since charmap "<PK>", $e1). (Since the current STR* functions take strings and the CHAR* functions deal with charmap values.) Although I'm not certain all or any of those are necessary, at least not in an initial release with basic MVP arrays. Assuming arrays are added at all. STRCHARS/STRARRAY can be accomplished with a FOR loop and STRSUB, STRENCODE/CHARARRAY with a FOR loop and CHARSUB, and ARRAYSTR sounds suspicious since array values might not all be valid Unicode code points.
- I wish rgbasm had zero-indexing from the beginning, but it doesn't, and I would much rather have STRSUB etc act consistently with ARRAYSUB etc. It's not without precedent: plenty of languages use 1-indexing, including many math-oriented ones (Fortran, Matlab, Mathematica, R, Julia).
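The three conversions being discussed can be modeled in Python as follows. This is a sketch under assumptions: the function names follow the STRARRAY/ARRAYSTR/CHARARRAY naming above, the charmap is a plain dict with non-empty keys, matching is greedy longest-first, and falling back to UTF-8 bytes for unmapped characters is a guess rather than confirmed rgbasm behavior.

```python
def str_array(s):
    """STRCHARS/STRARRAY model: one Unicode code point per element."""
    return [ord(ch) for ch in s]


def array_str(arr):
    """ARRAYSTR model. chr() raises ValueError for values above 0x10FFFF,
    which is the 'not a valid code point' concern raised above."""
    return "".join(chr(value) for value in arr)


def char_array(s, charmap):
    """STRENCODE/CHARARRAY model: encode s through a charmap (greedy, longest match)."""
    keys = sorted(charmap, key=len, reverse=True)  # keys assumed non-empty
    out = []
    i = 0
    while i < len(s):
        for key in keys:
            if s.startswith(key, i):
                out.append(charmap[key])
                i += len(key)
                break
        else:
            # Assumption: unmapped characters are emitted as their UTF-8 bytes.
            out.extend(s[i].encode("utf-8"))
            i += 1
    return out


charmap = {"<PK>": 0xE1}  # models: charmap "<PK>", $e1
```

The first two functions round-trip for any ordinary string, while `char_array` loses information: given only `[0xE1]`, nothing says whether the source was "<PK>" or some other string that maps to the same byte.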
It's true that plenty of languages use 1-based indexing. It's also true that those languages are nearly universally hated by programmers for doing so, and the only reason they do it is because they are math- or science-oriented and scientists and mathematicians tend to count from 1. String indexing is a relatively rare operation, while array indexing is the only reason you'd ever use arrays in the first place, so getting it right for arrays is a lot more important, to the point it probably trumps the need for consistency.
ARRAYCAT is potentially redundant, if we allow the ARRAY "constructor" to automatically flatten arrays. So you have DEF a1 EQUA ARRAY(1,2,3), then DEF a2 EQUA ARRAY(a1,4,5,6,ARRAY(7,8,9),10), and then a2 is ARRAY(1,2,3,4,5,6,7,8,9,10).
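A quick Python model of that flattening constructor, using lists to stand in for arrays. The `array` function is illustrative only, and flattening to any nesting depth is an assumption, since the example above only needs one level.

```python
def array(*items):
    """Model of a flattening ARRAY(...) constructor."""
    out = []
    for item in items:
        if isinstance(item, list):
            # An array argument contributes its elements, not itself.
            out.extend(array(*item))
        else:
            out.append(item)
    return out


a1 = array(1, 2, 3)
a2 = array(a1, 4, 5, 6, array(7, 8, 9), 10)
```

With this behavior, ARRAYCAT(x, y) is just ARRAY(x, y). The cost is that arrays of arrays can never be expressed.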
A less serious but not entirely joking suggestion: once we have user-defined functions, we could add ARRAYMAP(arr, fn) to apply fn to each element of arr, and ARRAYFILTER(arr, fn) to select only the elements of arr for which fn returns nonzero/true. Or even ARRAYREDUCE(arr, fn, init=0) to apply a reducing function (e.g. if DEF plus(x, y) = x + y, then ARRAYREDUCE([1,2,3], plus) == 6, and ARRAYREDUCE([], plus, 42) == 42).
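In Python terms, those three would behave roughly like this. It is a sketch of the intended semantics only; rgbasm has neither user-defined functions nor these builtins, and the Python names are placeholders.

```python
from functools import reduce


def array_map(arr, fn):
    return [fn(x) for x in arr]


def array_filter(arr, fn):
    # Keep the elements for which fn returns nonzero/true.
    return [x for x in arr if fn(x)]


def array_reduce(arr, fn, init=0):
    # Fold fn over arr from the left, starting from init.
    return reduce(fn, arr, init)


def plus(x, y):
    return x + y
```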
Definitely useful.
(Ed: I removed the email chains from these comments. --Rangi)