SHARC: verify timing bottlenecks
There is something that can be done, though: SHARC itself currently decrements icount by exactly 1 for every single opcode, everywhere. Halving the DSP clock has no visible effect on the render framerate while recovering a lot of emulation performance. There are two things that can be done:
- check the documentation for DSP timings (probably just heavily pipelined or just not provided);
- apply/check waitstates to anything that's external (including Voodoo accesses in this case);
cc. @gm-matthew and @Nitch2024 because this applies to Model 2B as well.
Originally posted by @angelosa in #14225
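To make the waitstate idea concrete, here is a rough sketch of charging extra cycles for external accesses on top of the flat per-opcode cost. Everything in it is hypothetical: `bus_target`, `wait_states` and `cycle_counter` are illustration names, not existing MAME code, and the waitstate counts are placeholders rather than measured or documented values.

```cpp
// Hypothetical bus classes; the real SHARC memory map is more detailed.
enum class bus_target { internal, external_ram, voodoo };

// Placeholder waitstate counts. These are NOT real figures; the actual
// values would have to come from the documentation or board configuration.
constexpr int wait_states(bus_target t)
{
    switch (t)
    {
    case bus_target::internal:     return 0;
    case bus_target::external_ram: return 1;
    case bus_target::voodoo:       return 4;
    }
    return 0;
}

struct cycle_counter
{
    int icount;

    // Flat cost per opcode, as described above.
    void charge_opcode() { icount -= 1; }

    // Additional cost for an access that leaves the chip.
    void charge_access(bus_target t) { icount -= wait_states(t); }
};
```

With these placeholder numbers, an opcode that touches the Voodoo would cost 5 cycles instead of 1, while internal accesses stay at the flat cost.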
https://www.analog.com/en/lp/001/sharc-manuals.html
See “SHARC® Processor Programming Reference (Includes ADSP-2136x, ADSP-2137x, and ADSP-214xx Processors) (Rev. 2.4)”
It’s a five-stage interlocked pipeline. Maximum throughput is one instruction per clock, but it’s slowed down by lots of things, e.g. an instruction cache miss is detected and handled at the Decode stage.
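As a back-of-the-envelope model of that, assuming one instruction retires per clock and each stall adds a fixed number of clocks, the cycle count for a run of code could be estimated like this. `ICACHE_MISS_PENALTY` and `cycles_for` are made-up names, the penalty value is a placeholder rather than a figure from the manual, and pipeline fill is ignored.

```cpp
#include <cstdint>

// Placeholder penalty, in core clocks, for one instruction cache miss.
// Assumed for illustration only; not taken from the manual.
constexpr std::uint64_t ICACHE_MISS_PENALTY = 1;

// Estimated clocks for a run of instructions on an interlocked pipeline
// that retires at most one instruction per clock and stalls on each miss.
constexpr std::uint64_t cycles_for(std::uint64_t instructions,
                                   std::uint64_t icache_misses)
{
    return instructions + icache_misses * ICACHE_MISS_PENALTY;
}
```

With no misses this reduces to the current flat model of one cycle per opcode; other stall sources would add further terms of the same shape.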