bmc64 icon indicating copy to clipboard operation
bmc64 copied to clipboard

Screen corruption with Super Snapshot

Open GaborC64 opened this issue 4 years ago • 36 comments

Characters on the screen occasionally turn into blocks during/after disk operations with the Super Snapshot cartridge image. Restore button or repeated disk operations may bring them back. The same SS .crt works fine in TheC64, Vice on a PC and Vice in RetroPie on a RPi-3B.

To replicate in BMC64 (on a RPi-3B and RPi-1Br1.2):

  1. Load Super Snapshot 5.22 v2 (.crt) (https://rr.pokefinder.org/wiki/Super_Snapshot)
  2. Load any disk image (.d64)
  3. Press $ or @ repeatedly (disk operations)

I've tried different combinations of REU on/off, drive models, True Emulation, IEC Filesystem, HDMI settings, in BMC64-3.6 and 05/05/21 master build, but nothing helps.

Screenshot: https://postimg.cc/VdxbBJcF

GaborC64 avatar May 30 '21 21:05 GaborC64

It's pretty likely this is a bug in the underlying VICE core. Have you tried searching the VICE issues log to see if it was previously reported/fixed? Backporting a fix may not require a lot of effort if it can be identified.

rhester72 avatar May 31 '21 02:05 rhester72

I cannot find any Vice issues describing the same problem. The fact that the bug is only on BMC64 and not on Windows Vice or TheC64 suggests that this is a BMC64 specific issue. The latest 3.7 has not fixed the bug.

GaborC64 avatar Jun 07 '21 17:06 GaborC64

I can reproduce this in Vice 3.4 on my Linux desktop build. Sometimes the characters turn to solid blocks simply using $ for a dir listing. It's random.

(Make sure you are using x64 and not x64sc to compare)

It looks like a VICE problem to me.

On Mon, Jun 7, 2021 at 1:40 PM GaborC64 @.***> wrote:

I cannot find any Vice issues describing the same problem. The fact that the bug is only on BMC64 and not on Windows Vice or TheC64 suggests that this is a BMC64 specific issue. The latest 3.7 has not fixed the bug.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/randyrossi/bmc64/issues/175#issuecomment-856133891, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAI3HKCK4NSLJTKULRSRU6TTRUABTANCNFSM45ZX2TLA .

-- Randy Rossi

  • "There are only two things that are hard about computer science; Naming things, Cache Invalidation, and Off-by-one errors."

randyrossi avatar Jun 07 '21 18:06 randyrossi

I get it, this glitch manifests on BMC64 because it uses an older Vice version with a bug. The x64 is a good lead so I'll keep looking, but as it is deprecated, the chances are low for a bugfix. Any tips on where to look for the Vice bug would be most welcome.

GaborC64 avatar Jun 07 '21 19:06 GaborC64

I've just reinstalled RetroPi 4.7.1 on the RPi-3B to check. After installing Vice from binary, it uses vice-x64 and doesn't have the glitch with Super Snapshot. Does this give us hope?

GaborC64 avatar Jun 07 '21 23:06 GaborC64

I checked the fetch of char pointers and they seem to be fine even when the blocks are shown. Could be in the chargen lookup or maybe drawing. But it might be a lot of work to track down.

randyrossi avatar Jun 08 '21 13:06 randyrossi

I finally got a chance to take a better look at this - even on 3.3 (the version of VICE BMC64 is currently based on), the cycle-exact emulator does not exhibit the problem, suggesting this cartridge requires cycle-exact to function properly (something BMC64 can never achieve). It doesn't appear to be a VICE bug as such, just one of those rare pieces of software that needs cycle-exact behavior.

rhester72 avatar Dec 20 '21 15:12 rhester72

Thanks for looking into this (I experienced the same glitch). Maybe sometime, a future Raspberry PI5 or 6 could have enough power to cycle-exact BMC64.

Eddepad avatar Dec 21 '21 10:12 Eddepad

Thanks for taking a look. The problem is gone in Vice 3.5 x64 (non-cycle-exact) installed from binary today inside RetroPie 4.7.1. The stable BMC64 is based on Vice 3.3, the master build is based on 3.4 (afaik). Are you planning to have a future BMC64 based on 3.5 x64 which should be free from this bug?

GaborC64 avatar Dec 24 '21 01:12 GaborC64

Are you sure RetroPie has x64 and not x64sc, just renamed or softlinked? I ask because x64 was removed entirely from the Makefile pre-3.5:

https://fossies.org/linux/misc/vice-3.5.tar.gz/vice-3.5/Makefile.am

and is no longer supported at all after 3.4.

Master is still VICE 3.3 - I took a swing at bringing it up to 3.4 about a year ago, but got a bit lost in the 64 filter code and managed to badly misinterpret new keyboard input parameters, so it stalled out for want of some of Randy's time because he's very focused on the VIC II replacement. The code still exists and was preserved, but if you're right and this was a post-3.4 change that made something work better, that wouldn't help much anyway.

It's VERY regrettable that pokefinder shut down the nightly WIndows builds, because that was the fastest way to narrow down very specific changes that altered behavior. Without those, it's down to hand-compiling build after build after build, and I no longer even have a working Windows mingw build environment (and haven't for years).

If anyone's able to a) confirm that a manual compile of x64 (it just needs to be added back to the Makefile) on the 3.5 release code base does indeed fix the problem, and b) is willing to then do several dozen builds and tests to isolate the exact commit that fixed it, I'd be delighted to integrate the required changes once identified into BMC64. I was going to go after pokefinder over the holiday break to see if I could figure it out that way, but natch...it is no more. :/

rhester72 avatar Dec 24 '21 03:12 rhester72

VICE still provides development Windows builds, just not via pokefinder anymore, they're on our SVN mirror. Those snapshots are built after each commit, rather than once per day. They do however not contain the old x64 emulator.

As for Super Snapshot 5, I noticed some fixes in r39834, which might be the stuff you're after. I've checked current x64 (on Linux) with SS 5.22 and failed to trigger any screen corruption while repeatedly using '$<enter>' or '@<enter>', so you could look into applying those specific fixes in your code.

There are detailed instructions on setting up msys2 to build VICE for Windows here, should you want to build VICE yourself. Though in my experience on Windows it's actually faster to use a Linux VM to do multiple compiles to narrow down a specific revision that fixed or broke something since Windows is terrible at process spawning, something configure and make do a lot.

Compyx avatar Dec 24 '21 09:12 Compyx

(rhester72): Are you sure RetroPie has x64 and not x64sc, just renamed or softlinked?

RetroPie 4.7.1 seems to have x64. Having installed Vice in RetroPie from binary or source (these are menu options), /opt/retropie/emulators/vice/bin has x64, x64dtv, x64sc, x128, etc. and vice-x64 can be selected from a menu (see https://retropie.org.uk/docs/Commodore-64-VIC-20-PET/). The Help/About in Vice shows Version 3.5 rev 40565 SDL2 and it does not have the glitch with Super Snapshot.

Unfortunately I don't have the skills to work out exactly what makes the glitch go away in this Vice revision and then inject that into BMC64.

Compyx, thanks for the info, that might help someone who can build Vice, but again I'm just an amateur.

Merry Christmas!

GaborC64 avatar Dec 24 '21 21:12 GaborC64

@Compyx Good to see you - thank you! I was after older pokefinder snapshots specifically because I was hoping to identify the 'fixed build' before x64 was dumped, since it happened in the transition between 3.4 and 3.5.

Unfortunately, reverse-applying r39834 did not provide relief. It does seem to take longer to trigger in BMC64, but I think that's entirely coincidental.

Agree that Linux would be a better platform for debugging, but unfortunately all my Linux boxes are headless, so msys2 it is. I'll start with a full 3.5 release build of x64 to confirm the problem truly doesn't reproduce there (trust, then verify! ;), then work my way backwards on half-cuts of commits until I can at least get the timeframe, then go from there.

I was REALLY hoping for a Christmas miracle release of BMC64 tonight that contained the fix...sorry, @GaborC64!

rhester72 avatar Dec 24 '21 23:12 rhester72

x64 in 3.5 really is OK, so something got fixed somewhere. :)

I can't get 3.4 to build at all on msys2, things blow up royally with hvsc. @Compyx , any thoughts?

rhester72 avatar Dec 25 '21 05:12 rhester72

You can run into the occasional revision that doesn't compile, but I'd expect a point release to compile at least. It could be it compiled with the msys2 packages that were current back then but no longer compiles with current msys2, it could also depend on the configure switches used.

Can you post the error message with some lines of context? And your configure line?

Compyx avatar Dec 25 '21 09:12 Compyx

Agree with all your points...a bit confused by this one.

Once it starts compiling hvsc, there's about 10,000 lines like this, all complaining about the same function redef:

C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/11.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../src/arch/gtk3/libarch.a(uimedia.o):C:\msys64\home\rhest\vice-3.4\src\arch\gtk3/../../../src/arch/gtk3/widgets/base/carthelpers.h:40: multiple definition of `carthelpers_can_flush_func'; ../src/arch/gtk3/libarch.a(uicart.o):C:\msys64\home\rhest\vice-3.4\src\arch\gtk3/../../../src/arch/gtk3/widgets/base/carthelpers.h:40: first defined here

Configure is boilerplate:

./configure -C --enable-native-gtk3ui --disable-arch

rhester72 avatar Dec 25 '21 17:12 rhester72

It's not actually the hvsc directory, but the linking stage, the 'hvsc' dir happens to be the last directory the compiler visits before make starts linking. After way too much time spent on trying to fix the build system in old revisions, I found the reason linking fails: GCC starting with 10.0 uses -fno-common, meaning any declaration of an object in a header file missing the extern keyword will result in multiple definitions, making linking fail. (see https://gcc.gnu.org/gcc-10/porting_to.html)

So instead of having to fix the missing extern keywords in various files, we have to pass -fcommon to GCC in the configure line:

./configure CFLAGS="-fcommon" --enable-native-gtk3ui --enable-x64

(you need --enable-x64 to actually build x64)

With that out of the way, I finally managed to build x64 from r37296 (3.4 point release), and indeed SS5 randomly freezes the machine. Even got it to freeze once while showing its boot screen :)

Compyx avatar Dec 25 '21 22:12 Compyx

hahahaha I wish I'd looked at the configure script, I was hacking Makefiles by hand to get x64 ;)

Thanks much - this was the last piece of the puzzle I needed! Plan from here is pretty rote:

  • Pick the midpoint between 3.4 and 3.5, build and test
  • Pass? Go backwards half the distance and test again
  • Fail? Forwards half the distance and test again
  • Rinse and repeat until I've identified the commit by behavior alone

Maddeningly time-consuming, but the most logical approach I've got, and I'm annoyed enough from this by now that I'm going to see it through for the sheer satisfaction of doing so. laughs

Thanks again!

rhester72 avatar Dec 25 '21 22:12 rhester72

You're welcome :)

And that approach is the same I do, "binary search" for a revision that either broke or fixed something.

I noticed some revisions will not accept --enable-x64, in that case you can omit it, that revision will still build x64 by default. So far r38000 appears to work correctly with SS 5.22, couldn't get it to glitch, but you may want to confirm this. r35500 definitely glitched/froze. So you might at least have a starting point saving you a few builds.

If you do find the revision that fixed SS5, keep in mind there might be previous commits involved in fixing the bug, so backporting a single commit might not fix the issue in BMC64. But it'll at least be a start.

Compyx avatar Dec 25 '21 23:12 Compyx

Thank you both, it's wonderful to see this Christmas miracle in the "making"! @rhester72 do let me know if I can help from the sideline with testing etc., I keep an eye on this thread/emails with anticipation.

GaborC64 avatar Dec 26 '21 01:12 GaborC64

I've gone as low as r37561 so far and it's fine, so the fix was shortly after 3.4 was released.

Continuing the hunt tomorrow...

rhester72 avatar Dec 26 '21 03:12 rhester72

Well, the release notes for 3.6 (!) also mentions:

  • Super Snapshot V5 fixes. I went down the last eight weeks of commits but can't find it. Why does Sourceforge have no good search functionality?

Kugelblitz360 avatar Dec 26 '21 13:12 Kugelblitz360

@Kugelblitz360 I think that was https://sourceforge.net/p/vice-emu/code/39834/ that @Compyx referred to earlier. I've applied it internally (and will keep it), but it made no difference here.

rhester72 avatar Dec 26 '21 16:12 rhester72

The glithcy blocks on the screen follow a nice numerically determined pattern as described at https://www.lemon64.com/forum/viewtopic.php?t=70498&start=1756. Once it's found, it would be interesting to see how the offending code causes such a regular pattern.

GaborC64 avatar Dec 26 '21 18:12 GaborC64

Well, the release notes for 3.6 (!) also mentions:

* Super Snapshot V5 fixes.
  I went down the last eight weeks of commits but can't find it. Why does Sourceforge have no good search functionality?

Because SourceForge sucks. And there's a year of commits between 3.5 and 3.6, almost 2000 of them ;)

Compyx avatar Dec 26 '21 20:12 Compyx

Closer. r37428 fails. Within 135 commits, but progress will be slow due to family demands on my time :)

rhester72 avatar Dec 26 '21 23:12 rhester72

@rhester72 family is first.

The glitch is that certain characters are replaced by blocks and others by space, but this is only true on a fresh start (BMC64 3.8). After loading and playing Beamrider (.d64 disk image), the directory listing (F3 is the SS shortcut for $ [enter]) glitches to something else, showing parts of the program: IMG_1394 Pressing Restore brings back the original listing: IMG_1395

GaborC64 avatar Dec 27 '21 02:12 GaborC64

The range continues to narrow:

r37561 OK r37527 FAIL

rhester72 avatar Dec 27 '21 03:12 rhester72

r37552 OK r37548 FAIL

rhester72 avatar Dec 27 '21 15:12 rhester72

And given what we know about it working on the cycle-exact core, my danger money is that it's this commit:

[r37549] by gpz this commit adds the missing dummy writes to the RMW instructions on the non-sc 6502 core. this makes the core even more sc-like, the access patterns are now equal to the sc core (but not the timing within instructions). in case of weird problems, revert this commit and see if that helps.

There's also the matter that all the other commits between that one and the known-working one were entirely cosmetic :)

Unfortunately, I won't have time to look at integrating this until this afternoon.

rhester72 avatar Dec 27 '21 15:12 rhester72