LakeSnes

fix spc upload problems

Open dinkc64 opened this issue 1 year ago • 12 comments

Hi @angelo-wf, this PR fixes all of the SPC transfer issues that I know of. The theory of why this is necessary is simple: the Enix game jams data into the SPC as fast as it possibly can, and running a single opcode at a time causes the following problem. Example: snes_catchupApu() needs to run 2 cycles, but the SPC runs a whole opcode, which is 6 cycles long (0xfe). The next SPC opcode is a mov from the port. So now the next write happens from CPU -> SPC, and catching up the APU will not run the SPC at all, since we are in debt by 4 cycles. Now the SPC "misses" the new data and instead has the old data loaded from the port... uh-oh :)
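The cycle-debt arithmetic described above can be sketched with a toy model. This is only an illustration: ToySpc and toy_catchup are hypothetical names, not LakeSnes code, and the model assumes every opcode in a call has the same cost.

```c
// Toy model (hypothetical, not LakeSnes code): the SPC can only be advanced
// one whole opcode at a time, so it may overshoot the requested cycle count.
typedef struct {
  int debt; // cycles already run beyond what earlier calls asked for
} ToySpc;

// Ask the SPC to advance by `wanted` cycles, where each opcode costs
// `opcodeCycles`. Returns the number of opcodes that actually ran.
static int toy_catchup(ToySpc* spc, int wanted, int opcodeCycles) {
  int opcodes = 0;
  int remaining = wanted - spc->debt; // pay off the earlier overshoot first
  while (remaining > 0) {
    remaining -= opcodeCycles; // a whole opcode runs, even if it overshoots
    opcodes++;
  }
  spc->debt = -remaining; // overshoot carried into the next call
  return opcodes;
}
```

Starting from a zeroed ToySpc, toy_catchup(&spc, 2, 6) runs one opcode and leaves a debt of 4 cycles; a following toy_catchup(&spc, 2, 3) runs no opcode at all and leaves a debt of 2, which is the "SPC does not run" situation from the example.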

There is a new problem with "Tales of Phantasia". The samples are no longer corrupted-sounding when they play individually, but when the game starts playing the music and the voice samples at the same time (during the fight) - everything gets garbled. It sounds like there is a possible overrun in the sample buffer. (this is a guess). I will investigate this soon,

best regards,

  • dink

dinkc64 · Sep 16 '23 05:09

EDIT (last edit): this issue and the talk below are completely unrelated to the PR. The issue & TODO is looking right at me in snes.c: snes_runFrame()... "// TODO: improve handling of dma's that take up entire vblank / frame"

Going to try to figure this one out, .. or at least try!

best regards,

  • dink

EDIT2: I believe the issue is in dma_doDma() - if I disable this when the music goes bad, the cycles stay normal and the music plays fine. hmm.... Maybe it's transferring too much per frame?

A little investigation into the sound issue of "Tales of Phantasia": I noticed that the cycles-per-frame count goes absolutely berserk 78 frames into bootup for a few frames, then stays normal until the music+voice starts playing when you start a game.

from bootup:
apu / cpu cycles ran: 17088 357362 ... normal!
apu / cpu cycles ran: 17088 357368
apu / cpu cycles ran: 17088 357376
... cut ...
apu / cpu cycles ran: 17089 357378
apu / cpu cycles ran: 17086 357340
apu / cpu cycles ran: 17094 357366
apu / cpu cycles ran: 19836 414930 frame 78, starts to get abnormal
apu / cpu cycles ran: 25856 540760 .. even more abnormal
apu / cpu cycles ran: 23601 493572
apu / cpu cycles ran: 55338 1157266 "1157266" cycles for a frame on the main cpu? yikes! (that's almost 70 MHz for a single frame!)
apu / cpu cycles ran: 23635 494340
apu / cpu cycles ran: 25889 541302
apu / cpu cycles ran: 13807 288880
apu / cpu cycles ran: 17087 357342 ... back to normal...
apu / cpu cycles ran: 17092 357362
apu / cpu cycles ran: 17086 357374
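The "almost 70 MHz" figure can be checked with a one-line conversion. This is a sketch under an assumption: the frame rate is taken as a round 60 frames per second (NTSC is only approximately that), and effective_mhz is a hypothetical helper name.

```c
// Effective clock rate in MHz if every frame took `cyclesPerFrame` cycles.
// framesPerSecond is an assumption supplied by the caller (about 60 for NTSC).
static double effective_mhz(long cyclesPerFrame, double framesPerSecond) {
  return (double)cyclesPerFrame * framesPerSecond / 1e6;
}
```

With 60 frames per second, the normal 357366-cycle frame works out to roughly 21.4 MHz, close to the SNES NTSC master clock of about 21.477 MHz, while the 1157266-cycle frame corresponds to roughly 69.4 MHz.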

Here's a log from when the game starts:
apu / cpu cycles ran: 17090 357370 .. just the voice is playing here, no problem
apu / cpu cycles ran: 17086 357360
apu / cpu cycles ran: 17088 357366
apu / cpu cycles ran: 17089 357368
apu / cpu cycles ran: 17086 357362
apu / cpu cycles ran: 17089 357376
apu / cpu cycles ran: 17087 357358
apu / cpu cycles ran: 17088 357368
apu / cpu cycles ran: 17088 357370
apu / cpu cycles ran: 17088 357368
apu / cpu cycles ran: 17088 357358
apu / cpu cycles ran: 17254 360828 .. the game-screen appears, the music starts playing - it gets "warbley"/bad sounding.
apu / cpu cycles ran: 17085 357328
apu / cpu cycles ran: 16924 353934 .. notice the cycles oscillate between over-shooting and under-shooting?
apu / cpu cycles ran: 17253 360820
apu / cpu cycles ran: 16923 353916
apu / cpu cycles ran: 17254 360828
apu / cpu cycles ran: 16923 353926
apu / cpu cycles ran: 17149 358636
apu / cpu cycles ran: 17026 356078
apu / cpu cycles ran: 17255 360744
apu / cpu cycles ran: 16922 354006
apu / cpu cycles ran: 17551 367058
apu / cpu cycles ran: 16624 347656
apu / cpu cycles ran: 17555 367048
apu / cpu cycles ran: 16621 347696
apu / cpu cycles ran: 17552 367074
apu / cpu cycles ran: 16623 347640
apu / cpu cycles ran: 17558 367072
apu / cpu cycles ran: 16618 347658
apu / cpu cycles ran: 17554 367094
apu / cpu cycles ran: 16624 347642
apu / cpu cycles ran: 17549 367052
apu / cpu cycles ran: 16625 347676

I'm not too familiar with snes yet, so there is the possibility that this is normal(?) What are your thoughts?

best regards,

  • dink

dinkc64 · Sep 16 '23 06:09

to sum it up, the new sound-problem in "Tales of Phantasia" is caused by dma going past the end of frame. I have some ideas on how to deal with this, will do some experimentation tonight!

best regards,

  • dink

dinkc64 · Sep 17 '23 01:09

@angelo-wf, been doing tests over the past weeks and this PR is very solid. The issue with "Tales of Phantasia" is due to the dma-frame-boundary issue. (I shouldn't have mentioned it in this PR, but I was unsure at the time...)

On to some really great news - my cx4 processor core is up and running, passes all of the tests, and runs Megaman X2/X3! After a bit of clean-up I'll make a pr for it.

best regards,

  • dink

dinkc64 · Oct 04 '23 00:10

I want to look through this PR soon, I just haven't had much time for it lately. Nice that you got the CX4 working, looking forward to seeing how you implemented it.

angelo-wf · Oct 05 '23 20:10

Hi, Just a little update on my cx4 core, so far things are looking good - I've spent the past week or so polishing it up, but there are a few caveats, here's a list...

simple/non-issue type things ~~
1: the core code (cx4.cpp) uses tab=4 spaces, because my editor will not play nicely with tab=2 spaces (it's something I've been using since the mid 90's, go figure)
2: it's single context only
3: probably should playtest both games to the end :)

slightly difficult thing ~~
running the cx4 at its rated 20 MHz, the attract-mode gameplay demo desyncs (Megaman gets killed, instead of Megaman killing the boss). Running at half speed (10 MHz), though, causes everything to align perfectly. I spent a solid week just trying to get to the bottom of this one, but for now it eludes me. I'll keep trying, though!

I would really like to do a playtest and try to get the timing 100% before making a PR, but if you would like to give it a try before then, I could e-mail over the files, or something like that :)

best regards,

  • dink

dinkc64 · Oct 11 '23 04:10

Hi there, a little update on the CX4-core project: ...just fixing up some bugs found while play-testing, also took a short break to work on other things (hint, see new PR). So far, things are looking good!

best regards,

  • dink

dinkc64 · Oct 25 '23 01:10

Update: While playtesting X2, I found 3 gfx-related bugs:
1: A flicker on the "You Got ...." scene (the screen after defeating a boss), caused by a slight timing inconsistency in my cx4 core.
2: ? (I forgot the bug), due to the unimplemented h/v latch in the ppu; very easy to implement/fix. See snes.cpp 0x4201 & ppu.cpp 0x37 (ppu_read).
3: In Bubble Crab's stage, the fish ship's radar beam would flicker (on/off every other frame). Finding the cause of this one was quite soul-crushing, but it did allow me to get properly acquainted with the snes ppu! Basically, the fix involves cancelling the halfColor shift if the clipmode == 1 or 2 logic hits (in ppu_handlePixel).
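As a rough illustration of the third fix, here is a toy model of one colour channel going through additive colour math. It is a sketch under assumptions: toy_colorMath and its parameters are hypothetical, not taken from ppu_handlePixel, and "the clip logic hits" is assumed to mean that the main-screen pixel was forced to black.

```c
#include <stdbool.h>

// Toy model of a single 5-bit colour channel (0..31), hypothetical names.
// mainClipped stands for "the clip-mode window logic hit for this pixel".
static int toy_colorMath(int mainC, int subC, bool halfColor, bool mainClipped) {
  if (mainClipped) mainC = 0;      // assumed: clip forces the main pixel to black
  int out = mainC + subC;          // additive colour math
  if (halfColor && !mainClipped) { // the fix: no halving when the clip hit
    out >>= 1;
  }
  if (out > 31) out = 31;          // clamp to the 5-bit range
  return out;
}
```

In this model, toy_colorMath(20, 10, true, false) halves the sum and gives 15, while toy_colorMath(20, 10, true, true) keeps the sub-screen value at full brightness and gives 10.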

best regards,

  • dink

dinkc64 · Nov 04 '23 05:11

@angelo-wf going forward, I will only be committing to my fork of your project, I recommend closing my pr's here and syncing with my fork to keep up to date.

https://github.com/dinkc64/LakeSnes

best regards,

  • dink

dinkc64 · Nov 04 '23 13:11

A bit of awesome news: one thing that kinda bothered me is the insane amount of cpu this project uses. I spent perhaps a solid month attempting to optimize things to no avail, but... finally, a couple of good ideas led to super results today! Cut the cpu usage down from 10-11% of a core (13 being fully maxed, equal to 100%) to 7-9%! Fast-forward fps went from 75-86 frames per second to 92-104 frames per second!! (ranges specified, as it fluctuates) That's with the cx4 running (X2) :)

Hopefully I can get back to the playtesting so that I can finally wrap up and commit the CX4 project.

best regards,

  • dink

dinkc64 · Nov 08 '23 05:11

Hi @angelo-wf, Hope you're well (and not getting tired of my messages.. :) Using a profiler helped to find some hotspots in LakeSnes, a few stood out, like: snes_runCycle(), ppu_handlePixel() and ppu_getPixelForBgLayer(), so as an experiment I did a little bit of caching of values and also implemented a latch in snes_runCycle to reduce the amount of logic to a minimum. So far the results are great, cpu usage is way down now. I'm curious what you think of this block of code: I think the timing should be exactly the same with this (moving hPos+=2 to the top)..?

note: nextHoriEvent is an int which gets set to 16 on reset (it's always 16 at the end of a frame)


static void snes_runCycle(Snes* snes) {
  snes->cycles += 2;
  // increment position
  snes->hPos += 2;
  // check for h/v timer irq's
  bool condition = (
    (snes->vIrqEnabled || snes->hIrqEnabled) &&
    (snes->vPos == snes->vTimer || !snes->vIrqEnabled) &&
    (snes->hPos == snes->hTimer * 4 || !snes->hIrqEnabled)
  );
  if(!snes->irqCondition && condition) {
    snes->inIrq = true;
    cpu_setIrq(snes->cpu, true);
  }
  snes->irqCondition = condition;
  // handle positional stuff
  if (snes->hPos == nextHoriEvent) {
    switch (snes->hPos) {
      case 16: {
        nextHoriEvent = 512;
        if(snes->vPos == 0) snes->dma->hdmaInitRequested = true;
      } break;
      case 512: {
        nextHoriEvent = 1104;
        // render the line halfway of the screen for better compatibility
        if(!snes->inVblank && snes->vPos > 0) ppu_runLine(snes->vPos);
      } break;
      case 1104: {
        if(!snes->inVblank) snes->dma->hdmaRunRequested = true;
        if(!snes->palTiming) {
          // line 240 of odd frame with no interlace is 4 cycles shorter
          // if((snes->hPos == 1360 && snes->vPos == 240 && !ppu_evenFrame() && !ppu_frameInterlace()) || snes->hPos == 1364) {
          nextHoriEvent = (snes->vPos == 240 && !ppu_evenFrame() && !ppu_frameInterlace()) ? 1360 : 1364;
        } else {
          // line 311 of odd frame with interlace is 4 cycles longer
          // if((snes->hPos == 1364 && (snes->vPos != 311 || ppu_evenFrame() || !ppu_frameInterlace())) || snes->hPos == 1368)
          nextHoriEvent = (snes->vPos != 311 || ppu_evenFrame() || !ppu_frameInterlace()) ? 1364 : 1368;
        }
      } break;
      case 1360:
      case 1364:
      case 1368: { // this is the end (of the h-line)
        nextHoriEvent = 16;

        snes->hPos = 0;
        snes->vPos++;
        if(!snes->palTiming) {
          // even interlace frame is 263 lines
          if((snes->vPos == 262 && (!ppu_frameInterlace() || !ppu_evenFrame())) || snes->vPos == 263) {
            if (snes->cart->type == 4) cx4_run();
            snes->vPos = 0;
            snes->frames++;
          }
        } else {
          // even interlace frame is 313 lines
          if((snes->vPos == 312 && (!ppu_frameInterlace() || !ppu_evenFrame())) || snes->vPos == 313) {
            snes->vPos = 0;
            snes->frames++;
          }
        }

        // end of hblank, do most vPos-tests
        bool startingVblank = false;
        if(snes->vPos == 0) {
          // end of vblank
          snes->inVblank = false;
          snes->inNmi = false;
          ppu_handleFrameStart();
        } else if(snes->vPos == 225) {
          // ask the ppu if we start vblank now or at vPos 240 (overscan)
          startingVblank = !ppu_checkOverscan();
        } else if(snes->vPos == 240){
          // if we are not yet in vblank, we had an overscan frame, set startingVblank
          if(!snes->inVblank) startingVblank = true;
        }
        if(startingVblank) {
          // if we are starting vblank
          ppu_handleVblank();
          snes->inVblank = true;
          snes->inNmi = true;
          if(snes->autoJoyRead) {
            // TODO: this starts a little after start of vblank
            snes->autoJoyTimer = 4224;
            snes_doAutoJoypad(snes);
          }
          if(snes->nmiEnabled) {
            cpu_nmi(snes->cpu);
          }
        }
      } break;
    }
  }
  // handle autoJoyRead-timer
  if(snes->autoJoyTimer > 0) snes->autoJoyTimer -= 2;
}

dinkc64 · Nov 11 '23 14:11

stats with the latest on my fork - https://github.com/dinkc64/LakeSnes
game: Megaman X2 w/cx4 processor
fps values are with fast-forward (no frame limiting)

before optimizations: 76 - 94fps
after optimizations: 106 - 133fps
after optimizations & not drawing skipped frames**: 208 - 276fps
after optimizations + de-contexting ppu.c*: 118 - 153fps
after optimizations + de-contexting ppu.c & not drawing skipped frames**: 217 - 320fps success!

* de-contexting ppu.c involves moving the entire ppu structure into ppu.c's global space as static variables.
** in ppu_runLine(), return after "if(!forcedBlank) ppu_evaluateSprites(line - 1);"

* and **, being experimental, are not (yet) on git.
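For readers unfamiliar with the term, "de-contexting" as described above can be sketched like this. The names below (ToyPpu, toy_ctx_evenFrame, toy_evenFrame, toy_toggleFrame) are hypothetical and only illustrate the pattern; the real ppu structure has many more fields.

```c
#include <stdint.h>

// Context style: all state lives in a struct and is reached through a pointer.
typedef struct {
  uint8_t brightness;
  int evenFrame;
} ToyPpu;

static int toy_ctx_evenFrame(const ToyPpu* ppu) {
  return ppu->evenFrame; // one pointer dereference per access
}

// De-contexted style: the same state as a file-scope static variable,
// so functions take no context argument.
static int evenFrame;

static int toy_evenFrame(void) {
  return evenFrame;
}

static void toy_toggleFrame(void) {
  evenFrame = !evenFrame;
}
```

The trade-off is that the de-contexted version can only ever represent a single ppu instance. It would also be consistent with the argument-less ppu_evenFrame() and ppu_frameInterlace() calls in the snes_runCycle() snippet earlier in the thread.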

best regards,

  • dink

dinkc64 · Nov 14 '23 05:11

CX4 has been added, yay! https://github.com/dinkc64/LakeSnes/commit/10f4a9dcc36d173057df2b01bf1a2dbb71c08abe

So far I've played through: Wheel Gator, Overdrive Ostrich, Bubble Crab levels, and fixed all bugs in both the emu & my cx4 core found along the way. I can't guarantee the entire game just yet!

best regards,

  • dink

dinkc64 · Nov 14 '23 15:11