LakeSnes
fix spc upload problems
Hi @angelo-wf, this PR fixes all of the SPC transfer issues that I know of. The theory of why this is necessary is simple: the Enix game jams data into the SPC as fast as it possibly can, and running a whole opcode at a time causes the following problem. Example: snes_catchupApu() needs to run 2 cycles, but the SPC runs a whole opcode, which can be 6 cycles long (0xfe). The next SPC opcode is a mov from the port. So now the next write from CPU -> SPC happens, and catching up the APU will not run the SPC at all, since we are in debt by 4 cycles. Now the SPC "misses" the new data and instead has the old data loaded from the port... uh-oh :)
There is a new problem with "Tales of Phantasia". The samples are no longer corrupted-sounding when they play individually, but when the game starts playing the music and the voice samples at the same time (during the fight), everything gets garbled. It sounds like there is a possible overrun in the sample buffer (this is a guess). I will investigate this soon.
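The cycle-debt arithmetic described above can be sketched with a toy model. This is only an illustration under stated assumptions, not LakeSnes code: `ToyApu` and `catchup_whole_opcodes` are hypothetical names, every opcode is given the same fixed length, and the model tracks only how many cycles the SPC has run ahead, not what it reads from the port.

```c
/* Toy model (hypothetical, not LakeSnes code): catch up an "SPC" by a
   requested number of cycles when the smallest unit that can be run is a
   whole opcode of fixed length. */
typedef struct {
  int debt; /* cycles already run beyond what was requested so far */
} ToyApu;

/* Returns how many opcodes were executed for this catch-up request. */
static int catchup_whole_opcodes(ToyApu* apu, int wanted, int opcodeLen) {
  int ran = 0;
  int owed = wanted - apu->debt; /* earlier overshoot is paid back first */
  while (owed > 0) {
    owed -= opcodeLen;           /* an opcode cannot be split */
    ran++;
  }
  apu->debt = -owed;             /* owed <= 0 here: store the overshoot */
  return ran;
}
```

With a zeroed `ToyApu`, calling `catchup_whole_opcodes(&apu, 2, 6)` runs one opcode and leaves a debt of 4 cycles; a second identical call runs no opcode at all and leaves a debt of 2, which is the "will not run the SPC at all" situation from the description.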
best regards,
- dink
EDIT: this issue and the talk below are completely unrelated to the PR. (last edit) The issue and TODO are looking right at me in snes.c, in snes_runFrame(): "// TODO: improve handling of dma's that take up entire vblank / frame"
Going to try to figure this one out, .. or at least try!
best regards,
- dink
EDIT2: I believe the issue is in dma_doDma() - if I disable this when the music goes bad, the cycles stay normal and the music plays fine. hmm.... Maybe it's transferring too much per frame?
A little investigation into the sound issue of "Tales of Phantasia". I noticed that the cycles-per-frame counts go absolutely berserk 78 frames into bootup for a few frames, then stay normal until the music+voice starts playing when you start a game.
From bootup:
apu / cpu cycles ran: 17088 357362 ... normal!
apu / cpu cycles ran: 17088 357368
apu / cpu cycles ran: 17088 357376
... cut ...
apu / cpu cycles ran: 17089 357378
apu / cpu cycles ran: 17086 357340
apu / cpu cycles ran: 17094 357366
apu / cpu cycles ran: 19836 414930 frame 78, starts to get abnormal
apu / cpu cycles ran: 25856 540760 .. even more abnormal
apu / cpu cycles ran: 23601 493572
apu / cpu cycles ran: 55338 1157266 "1157266" cycles for a frame on the main cpu? yikes! (It's almost 70mhz on a single frame!)
apu / cpu cycles ran: 23635 494340
apu / cpu cycles ran: 25889 541302
apu / cpu cycles ran: 13807 288880
apu / cpu cycles ran: 17087 357342 ... back to normal...
apu / cpu cycles ran: 17092 357362
apu / cpu cycles ran: 17086 357374
Here's a log from when the game starts:
apu / cpu cycles ran: 17090 357370 .. just the voice is playing here, no problem
apu / cpu cycles ran: 17086 357360
apu / cpu cycles ran: 17088 357366
apu / cpu cycles ran: 17089 357368
apu / cpu cycles ran: 17086 357362
apu / cpu cycles ran: 17089 357376
apu / cpu cycles ran: 17087 357358
apu / cpu cycles ran: 17088 357368
apu / cpu cycles ran: 17088 357370
apu / cpu cycles ran: 17088 357368
apu / cpu cycles ran: 17088 357358
apu / cpu cycles ran: 17254 360828 .. the game-screen appears, the music starts playing - it gets "warbley"/bad sounding.
apu / cpu cycles ran: 17085 357328
apu / cpu cycles ran: 16924 353934 .. notice the cycles oscillate between over-shooting and under-shooting?
apu / cpu cycles ran: 17253 360820
apu / cpu cycles ran: 16923 353916
apu / cpu cycles ran: 17254 360828
apu / cpu cycles ran: 16923 353926
apu / cpu cycles ran: 17149 358636
apu / cpu cycles ran: 17026 356078
apu / cpu cycles ran: 17255 360744
apu / cpu cycles ran: 16922 354006
apu / cpu cycles ran: 17551 367058
apu / cpu cycles ran: 16624 347656
apu / cpu cycles ran: 17555 367048
apu / cpu cycles ran: 16621 347696
apu / cpu cycles ran: 17552 367074
apu / cpu cycles ran: 16623 347640
apu / cpu cycles ran: 17558 367072
apu / cpu cycles ran: 16618 347658
apu / cpu cycles ran: 17554 367094
apu / cpu cycles ran: 16624 347642
apu / cpu cycles ran: 17549 367052
apu / cpu cycles ran: 16625 347676
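As a sanity check on these numbers, the expected count for a normal NTSC frame can be worked out from the line timing: 262 scanlines of 1364 master cycles each. The helper names below are hypothetical, the occasional 1360-cycle short line is ignored, and the "effective clock" is simply the frame's cycle count multiplied by an assumed 60 frames per second.

```c
#include <stdint.h>

/* Master cycles in one NTSC frame: 262 scanlines of 1364 cycles each.
   The occasional 4-cycle-shorter line is ignored in this sketch. */
static uint32_t ntsc_cycles_per_frame(void) {
  return 1364u * 262u;
}

/* Clock rate in Hz that a per-frame cycle count would correspond to if
   every frame ran that many cycles at the given (assumed) frame rate. */
static uint32_t effective_hz(uint32_t cyclesInFrame, uint32_t fps) {
  return cyclesInFrame * fps;
}
```

`ntsc_cycles_per_frame()` gives 357368, which matches the "normal" lines in the log, and `effective_hz(1157266, 60)` gives 69435960, i.e. the "almost 70mhz" figure, a little over three times the roughly 21.4 MHz that a normal frame works out to.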
I'm not too familiar with snes yet, so there is the possibility that this is normal(?) What are your thoughts?
best regards,
- dink
to sum it up, the new sound-problem in "Tales of Phantasia" is caused by dma going past the end of frame. I have some ideas on how to deal with this, will do some experimentation tonight!
best regards,
- dink
@angelo-wf, been doing tests over the past weeks and this PR is very solid. The issue with "Tales of Phantasia" is due to the dma-frame-boundary issue. (I shouldn't have mentioned it in this PR, but I was unsure at the time...)
On to some really great news - my cx4 processor core is up and running, passes all of the tests, and runs Megaman X2/X3! After a bit of clean-up I'll make a pr for it.
best regards,
- dink
I want to look through this PR soon, I just haven't had much time for it lately. Nice that you got the CX4 working, looking forward to seeing how you implemented it.
Hi, Just a little update on my cx4 core, so far things are looking good - I've spent the past week or so polishing it up, but there are a few caveats, here's a list...
simple/non-issue type things ~~
1: the core code (cx4.cpp) uses tab=4space, because my editor will not play nicely with tab=2spc (it's something I've been using since the mid 90's, go figure)
2: it's single context only
3: probably should playtest both games to the end :)
slightly difficult thing ~~
Running the cx4 at its rated 20mhz, the attract mode gameplay demo desyncs (megaman gets killed, instead of megaman killing the boss). Running at half-speed (10mhz), though, causes everything to align perfectly. I spent a solid week just trying to get to the bottom of this one, but for now it eludes me. I'll keep trying, though!
I would really like to do a playtest and try to get the timing 100% before making a PR, but if you would like to give it a try before then, I could e-mail over the files, or something like that :)
best regards,
- dink
Hi there, a little update on the CX4-core project: ...just fixing up some bugs found while play-testing, also took a short break to work on other things (hint, see new PR). So far, things are looking good!
best regards,
- dink
Update: While playtesting X2, found 3 gfx-related bugs:
1: A flicker on the "You Got ...." scene (the screen after defeating a boss), caused by a slight timing inconsistency in my cx4 core.
2: ? (I forgot the bug), due to the unimplemented h/v latch in the ppu; very easy to implement/fix. snes.cpp 0x4201 & ppu.cpp 0x37 (ppu_read)
3: In Bubble Crab's stage, the fish ship's radar beam would flicker (on/off every other frame). Finding the cause of this one was quite soul-crushing, but it did allow me to get properly acquainted with the snes ppu! Basically, the fix involves cancelling the halfColor shift if the clipmode == 1 or 2 logic hits (in ppu_handlePixel).
best regards,
- dink
@angelo-wf going forward, I will only be committing to my fork of your project, I recommend closing my pr's here and syncing with my fork to keep up to date.
https://github.com/dinkc64/LakeSnes
best regards,
- dink
A bit of awesome news: one thing that kinda bothered me is the insane amount of cpu this project uses. I spent perhaps a solid month attempting to optimize things to no avail, but... finally, a couple of good ideas led to super results today!
cut the cpu usage down from 10-11% of a core (13 being fully maxed, equal to 100%) to 7-9%!
fast-forward fps went from 75-86 frames per second to 92-104 frames per second!! (ranges specified, as it fluctuates)
that's with the cx4 running (X2) :)
Hopefully I can get back to the playtesting so that I can finally wrap up and commit the CX4 project.
best regards,
- dink
Hi @angelo-wf, hope you're well (and not getting tired of my messages.. :) Using a profiler helped to find some hotspots in LakeSnes. A few stood out, like snes_runCycle(), ppu_handlePixel() and ppu_getPixelForBgLayer(), so as an experiment I did a little bit of caching of values and also implemented a latch in snes_runCycle to reduce the amount of logic to a minimum. So far the results are great, cpu usage is way down now. I'm curious what you think of this block of code: I think the timing should be exactly the same with this (moving hPos+=2 to the top)..?
note: nextHoriEvent is an int which gets set to 16 on reset (it's always 16 at the end-of-frame)
static void snes_runCycle(Snes* snes) {
  snes->cycles += 2;
  // increment position
  snes->hPos += 2;
  // check for h/v timer irq's
  bool condition = (
    (snes->vIrqEnabled || snes->hIrqEnabled) &&
    (snes->vPos == snes->vTimer || !snes->vIrqEnabled) &&
    (snes->hPos == snes->hTimer * 4 || !snes->hIrqEnabled)
  );
  if(!snes->irqCondition && condition) {
    snes->inIrq = true;
    cpu_setIrq(snes->cpu, true);
  }
  snes->irqCondition = condition;
  // handle positional stuff
  if (snes->hPos == nextHoriEvent) {
    switch (snes->hPos) {
      case 16: {
        nextHoriEvent = 512;
        if(snes->vPos == 0) snes->dma->hdmaInitRequested = true;
      } break;
      case 512: {
        nextHoriEvent = 1104;
        // render the line halfway of the screen for better compatibility
        if(!snes->inVblank && snes->vPos > 0) ppu_runLine(snes->vPos);
      } break;
      case 1104: {
        if(!snes->inVblank) snes->dma->hdmaRunRequested = true;
        if(!snes->palTiming) {
          // line 240 of odd frame with no interlace is 4 cycles shorter
          // if((snes->hPos == 1360 && snes->vPos == 240 && !ppu_evenFrame() && !ppu_frameInterlace()) || snes->hPos == 1364) {
          nextHoriEvent = (snes->vPos == 240 && !ppu_evenFrame() && !ppu_frameInterlace()) ? 1360 : 1364;
        } else {
          // line 311 of odd frame with interlace is 4 cycles longer
          // if((snes->hPos == 1364 && (snes->vPos != 311 || ppu_evenFrame() || !ppu_frameInterlace())) || snes->hPos == 1368)
          nextHoriEvent = (snes->vPos != 311 || ppu_evenFrame() || !ppu_frameInterlace()) ? 1364 : 1368;
        }
      } break;
      case 1360:
      case 1364:
      case 1368: { // this is the end (of the h-line)
        nextHoriEvent = 16;
        snes->hPos = 0;
        snes->vPos++;
        if(!snes->palTiming) {
          // even interlace frame is 263 lines
          if((snes->vPos == 262 && (!ppu_frameInterlace() || !ppu_evenFrame())) || snes->vPos == 263) {
            if (snes->cart->type == 4) cx4_run();
            snes->vPos = 0;
            snes->frames++;
          }
        } else {
          // even interlace frame is 313 lines
          if((snes->vPos == 312 && (!ppu_frameInterlace() || !ppu_evenFrame())) || snes->vPos == 313) {
            snes->vPos = 0;
            snes->frames++;
          }
        }
        // end of hblank, do most vPos-tests
        bool startingVblank = false;
        if(snes->vPos == 0) {
          // end of vblank
          snes->inVblank = false;
          snes->inNmi = false;
          ppu_handleFrameStart();
        } else if(snes->vPos == 225) {
          // ask the ppu if we start vblank now or at vPos 240 (overscan)
          startingVblank = !ppu_checkOverscan();
        } else if(snes->vPos == 240){
          // if we are not yet in vblank, we had an overscan frame, set startingVblank
          if(!snes->inVblank) startingVblank = true;
        }
        if(startingVblank) {
          // if we are starting vblank
          ppu_handleVblank();
          snes->inVblank = true;
          snes->inNmi = true;
          if(snes->autoJoyRead) {
            // TODO: this starts a little after start of vblank
            snes->autoJoyTimer = 4224;
            snes_doAutoJoypad(snes);
          }
          if(snes->nmiEnabled) {
            cpu_nmi(snes->cpu);
          }
        }
      } break;
    }
  }
  // handle autoJoyRead-timer
  if(snes->autoJoyTimer > 0) snes->autoJoyTimer -= 2;
}
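The latch idea in the function above can be isolated in a much smaller sketch. This is a simplified, hypothetical model of a single 1364-cycle NTSC line with events at hPos 16, 512, 1104 and 1364 (the function names are invented for illustration); it only shows that checking hPos against one "next event" value finds the same events as checking against all of them, and it does not settle the question of whether moving hPos+=2 to the top preserves timing.

```c
enum { LINE_END = 1364 };

/* Naive version: compare hPos against every event position on each step.
   Records the positions where an event fired; returns how many fired. */
static int run_line_naive(int events[4]) {
  int n = 0;
  int hPos = 0;
  for (;;) {
    hPos += 2;
    if (hPos == 16 || hPos == 512 || hPos == 1104 || hPos == LINE_END) {
      events[n++] = hPos;
      if (hPos == LINE_END) return n;
    }
  }
}

/* Latched version: one comparison per step against the next expected
   event; each event handler arms the latch for the event after it. */
static int run_line_latched(int events[4]) {
  int n = 0;
  int hPos = 0;
  int next = 16; /* first event of the line */
  for (;;) {
    hPos += 2;
    if (hPos != next) continue;
    events[n++] = hPos;
    switch (hPos) {
      case 16:   next = 512;      break;
      case 512:  next = 1104;     break;
      case 1104: next = LINE_END; break;
      default:   return n;        /* LINE_END: the line is finished */
    }
  }
}
```

Both functions record the same four positions. The latch relies on hPos always advancing in steps that land exactly on every event position; if a step ever skipped past the armed value, the latched version would stop firing events, so that assumption is worth keeping in mind.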
Stats with the latest on my fork - https://github.com/dinkc64/LakeSnes
game: Megaman X2 w/cx4 processor
fps values are with fast-forward (no frame limiting)
before optimizations: 76 - 94fps
after optimizations: 106 - 133fps
after optimizations & not drawing skipped frames**: 208 - 276fps
after optimizations + de-contexting ppu.c*: 118 - 153fps
after optimizations + de-contexting ppu.c & not drawing skipped frames**: 217 - 320fps
success!
** in ppu_runLine(), return after "if(!forcedBlank) ppu_evaluateSprites(line - 1);"
* de-contexting ppu.c involves moving the entire ppu structure into ppu.c's global space as static variables.
Both * and **, being experimental, are not (yet) on git.
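For readers unfamiliar with the term, "de-contexting" can be illustrated with a tiny sketch. Everything here is hypothetical (`ToyPpu`, its two fields and the pixel formula are made up for the example and have nothing to do with the real ppu structure); it only contrasts state reached through a context pointer with the same state held in file-scope static variables.

```c
#include <stdint.h>

/* Context style: state lives in a struct and is reached via a pointer. */
typedef struct {
  uint8_t brightness;
  uint16_t scroll;
} ToyPpu;

static uint16_t toy_pixel_ctx(const ToyPpu* ppu, uint16_t x) {
  return (uint16_t)((x + ppu->scroll) * ppu->brightness);
}

/* De-contexted style: the same state as file-scope statics, so there is
   no pointer to pass or dereference, but only one instance can exist. */
static uint8_t g_brightness;
static uint16_t g_scroll;

static void toy_load_globals(const ToyPpu* src) {
  g_brightness = src->brightness;
  g_scroll = src->scroll;
}

static uint16_t toy_pixel_global(uint16_t x) {
  return (uint16_t)((x + g_scroll) * g_brightness);
}
```

Both versions compute the same value for the same state. The trade-off is that the static version is single-context only, and whether it is actually faster depends on the compiler and the surrounding code, so measurements like the ones above are the only reliable guide.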
best regards,
- dink
CX4 has been added, yay! https://github.com/dinkc64/LakeSnes/commit/10f4a9dcc36d173057df2b01bf1a2dbb71c08abe
So far I've played through: Wheel Gator, Overdrive Ostrich, Bubble Crab levels, and fixed all bugs in both the emu & my cx4 core found along the way. I can't guarantee the entire game just yet!
best regards,
- dink