[Feature request] Don't stop assembling/linking immediately when max sizes are exceeded
Best shown with an example of current behavior:
```
[06:27:34] ax6@n2 ~/Desktop/temp $ cat > test.asm
SECTION "test", ROMX[$4000], BANK[1]
rept 20000
db 0
endr
[06:28:04] ax6@n2 ~/Desktop/temp $ rgbasm test.asm -o test.o
ERROR: test.asm(2) -> test.asm::REPT~16385(3):
Section 'test' is too big (max size = 0x4000 bytes, reached 0x4001).
[06:28:10] ax6@n2 ~/Desktop/temp $ cat > test.asm
rept 1000
SECTION "test\@", ROM0
rept 20
db 0
endr
endr
[06:29:40] ax6@n2 ~/Desktop/temp $ rgbasm test.asm -o test.o
[06:29:47] ax6@n2 ~/Desktop/temp $ rgblink test.o -o test.gb
error: Unable to place "test_3612" (ROM0 section) anywhere
```
These errors are severely misleading, because if you fix the exact location of the error, you immediately get a new error on a different line. When it comes to helping newbies edit their code, this problem is a nightmare, because they spend hours upon hours chasing the error.
There's no need for the assembler and the linker to stop right away. These errors can be reported at the end of the tool's run. That would let people spot and fix all their errors at once, instead of doing it slowly, one at a time.
The proposed behavior is:
- [ ] When assembling, keep going until the end of the section is reached. It doesn't matter if the addresses become invalid (even if it goes past $FFFF; just wrap around). After assembling the entire input, if any sections are too large, report all of them and fail.
- [ ] When linking, if a section can't be placed anywhere, add it to a list of rejected sections and keep going. After linking all the object files, if there are any rejected sections, report them all and fail. (A sketch of this follows below.)
In short, don't fail at the first opportunity. Report all errors at the end.
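For the linker half, a minimal sketch of the idea in C (not RGBLINK's actual code; `struct Section`, `place_section`, and `assign_sections` are hypothetical stand-ins):

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct Section {
    char const *name;
    struct Section *next;
};

/* Placeholder for the real placement logic. */
static bool place_section(struct Section *sec)
{
    (void)sec;
    return false;
}

static struct Section *rejected; /* sections that could not be placed */

static void assign_sections(struct Section *sections)
{
    for (struct Section *sec = sections, *next; sec; sec = next) {
        next = sec->next;
        if (!place_section(sec)) {
            /* Don't abort here; queue the section for a combined report. */
            sec->next = rejected;
            rejected = sec;
        }
    }

    if (rejected) {
        /* Report every unplaceable section at once, then fail. */
        for (struct Section *sec = rejected; sec; sec = sec->next)
            fprintf(stderr, "error: Unable to place \"%s\" anywhere\n",
                    sec->name);
        exit(1);
    }
}
```

The only change from the fail-fast behavior is that the error path accumulates instead of exiting, so one run surfaces every unplaceable section.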
I think RGBLINK's behavior can be altered as you suggest; RGBASM's, however, is largely a consequence of how it generates SECTIONs. The comment explains it best.
If that sanity check is needed, couldn't it be set to some silly value like 1 MB?
No, if we're changing the behavior, then it should be done right by preventing all further output to that section.
PR #877 partially addressed this, so assembly continues even after a section has become too large, but it does not update the linker, nor change the error messages to report the actual number of excessive bytes.
(I'd be fine with just not reporting an excess byte count. For example, print "Section 'test' is too big (max size = 0x4000 bytes)" instead of "max size = 0x4000 bytes, reached 0x4001", which wrongly implies the section is just 1 byte too large, or "max size = 0x4000 bytes, reached 0x4e20", which requires keeping count of excess bytes that were never output.)
I don't think there's been much progress here. There's no reason to halt all output at 32 KB or less; we're not in 1994, and we don't have to be careful to avoid overflowing a 3½" floppy. There is a good reason to have a sanity check at all, but it should be set at a value high enough that it only traps actual infinite loops and/or malicious use of the tool.
Knowing the size of the overflow is much, much more valuable than trapping it early to avoid the horror of an object file that is 33 KB too big. This is why every modern compiler keeps going after the first error, even though it knows full well that its (non-error) output is worthless.
Unless there is a syntax error, modern compilers can still process the AST; RGBASM cannot do that for the same reason as usual. This is why it's especially difficult to work on this issue.
That's completely understandable. But the issue here is the early stop when the max size is exceeded. You've essentially said that the only two options are to stop immediately on excess, or to go on forever until the section ends. The latter can of course lead to malicious usage and/or very undesirable bugs, and nobody wants a 20 GB object file, so it's understandable that "go on forever" would never be a valid option. But those aren't the only two options: you can stop well before 20 GB without having to stop immediately. That would give useful information for real use cases (where the excess would be in the range of kilobytes at most) while still preventing huge output files. Yes, that 40 KB section you've assembled and output is useless, because assembly is going to fail with a fatal error before it finishes, but that doesn't matter: that amount of wasted memory/temporary disk space is really not a concern for anyone.
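As an illustration of that middle ground, here is a rough sketch (all names are hypothetical, and the 1 MiB margin is an arbitrary pick, not a value from the codebase): keep accepting output for a bounded distance past the limit, and only abort once even that generous margin is blown.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define OVERFLOW_MARGIN 0x100000u /* 1 MiB of slack past the real limit */

struct Section {
    char const *name;
    uint32_t size, maxSize;
    uint8_t *data; /* assumed allocated with maxSize + OVERFLOW_MARGIN bytes */
};

/* Stand-in for the assembler's fatal-error helper. */
static void fatalerror(char const *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    exit(1);
}

static void out_byte(struct Section *sec, uint8_t byte)
{
    /* Going past maxSize is tolerated for a while, so the final report can
     * state the real overflow; only a runaway input ever hits the margin. */
    if (sec->size >= sec->maxSize + OVERFLOW_MARGIN)
        fatalerror("Section '%s' is more than %u bytes past its max size\n",
                   sec->name, (unsigned)OVERFLOW_MARGIN);
    sec->data[sec->size++] = byte;
}
```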
I'd thought it would be feasible to change the output functions so, instead of writing to an object file, they just increment a counter of bytes past the maximum, which then gets reported as an error at the end if it's nonzero. Shouldn't that prevent giga-size objects without any size limit besides the bank sizes?
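Something like this sketch, perhaps (again with hypothetical names, not RGBASM's real internals): the output path stores bytes only while the section fits, counts everything past that, and the end-of-assembly report turns a nonzero counter into an error.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct Section {
    char const *name;
    uint32_t size, maxSize;
    uint32_t overflowBytes; /* bytes emitted past maxSize, never stored */
    uint8_t *data;          /* assumed to hold exactly maxSize bytes */
};

static void out_byte(struct Section *sec, uint8_t byte)
{
    if (sec->size < sec->maxSize)
        sec->data[sec->size++] = byte; /* normal path: store the byte */
    else
        sec->overflowBytes++;          /* past the limit: just count it */
}

/* Called once per section after the entire input has been assembled. */
static bool report_oversized(struct Section const *sec)
{
    if (!sec->overflowBytes)
        return false;
    fprintf(stderr,
            "ERROR: Section '%s' is too big (max size = 0x%" PRIx32
            " bytes, exceeded by 0x%" PRIx32 ")\n",
            sec->name, sec->maxSize, sec->overflowBytes);
    return true; /* the caller fails the build if any section returns true */
}
```

Since nothing past `maxSize` is ever written, the object file can never exceed the sum of the bank sizes, no matter how long the input keeps emitting.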
Much of the codebase assumes that section sizes are valid, so I don't think it's feasible to push section sizes beyond $FFFF bytes, even assuming that going past $8000 doesn't crash anything in the first place.