Peanut-GB
oam: improve DMA speed
The source range of an OAM DMA transfer always lies within a single memory region, so if the first address is in WRAM, the last address is too. We can take advantage of this by obtaining a pointer to the first source address in WRAM and copying the data incrementally, instead of calling gb_read and gb_write for every byte.
We can assume that the OAM DMA will most likely copy from a shadow OAM area in WRAM to OAM. Since WRAM is managed by Peanut-GB, we can obtain a pointer to it and copy the data without using gb_read. If the data comes from cart ROM or cart RAM (which is possible), we cannot use a pointer without an API change.
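The idea can be sketched roughly as follows. This is only an illustration, not the code on the oam-optimisation branch: `struct toy_gb`, `toy_read` and `toy_oam_dma` are hypothetical stand-ins for Peanut-GB's real state and gb_read, and the memory map is reduced to a 32 KiB ROM and 8 KiB of unbanked WRAM. It assumes the DMA source address is the written register value shifted left by 8.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WRAM_START 0xC000u
#define WRAM_SIZE  0x2000u
#define OAM_SIZE   0xA0u

/* Hypothetical, simplified emulator state; NOT Peanut-GB's struct gb_s. */
struct toy_gb {
    const uint8_t *rom;       /* Owned by the frontend, at least 32 KiB. */
    uint8_t wram[WRAM_SIZE];  /* Owned by the emulator core. */
    uint8_t oam[OAM_SIZE];
};

/* Hypothetical stand-in for a per-byte read with address decoding. */
static uint8_t toy_read(const struct toy_gb *gb, uint16_t addr)
{
    if (addr < 0x8000u)
        return gb->rom[addr];
    if (addr >= WRAM_START && addr < WRAM_START + WRAM_SIZE)
        return gb->wram[addr - WRAM_START];
    return 0xFF; /* Everything else is unmapped in this toy model. */
}

/* Perform an OAM DMA for a value written to the DMA register. */
static void toy_oam_dma(struct toy_gb *gb, uint8_t reg_val)
{
    const uint16_t src = (uint16_t)(reg_val << 8);

    assert(gb != NULL);

    /* Fast path: the whole 160-byte source range is inside WRAM, so the
     * address only has to be decoded once. */
    if (src >= WRAM_START && src + OAM_SIZE <= WRAM_START + WRAM_SIZE) {
        memcpy(gb->oam, &gb->wram[src - WRAM_START], OAM_SIZE);
        return;
    }

    /* Slow path: any other source goes through the per-byte read. */
    for (uint16_t i = 0; i < OAM_SIZE; i++)
        gb->oam[i] = toy_read(gb, (uint16_t)(src + i));
}
```

The slow path is kept so that transfers from cart ROM or cart RAM still work without any API change; only the WRAM case skips the per-byte address decoding.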
Since the OAM size is 0xA0 (160) bytes, we can perform the copy using 32-bit transfers instead of 8-bit transfers if the host platform is little-endian. This results in only 40 transfers instead of 160.
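A minimal sketch of such a word-wise copy is shown below. The function name `copy_oam_words` is hypothetical and this is not necessarily how the branch implements it. Each word is moved through a `uint32_t` temporary with `memcpy`, which avoids alignment and aliasing problems; whether the compiler turns this into single 32-bit loads and stores depends on the target and on what it can prove about alignment. Written this way, the result does not depend on byte order, since each word is stored back exactly as it was loaded.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OAM_SIZE 0xA0u

/* Copy OAM_SIZE bytes from src to dst in 32-bit units.
 * Returns the number of word transfers performed. */
static unsigned copy_oam_words(uint8_t *dst, const uint8_t *src)
{
    unsigned transfers = 0;

    assert(dst != NULL && src != NULL);

    for (unsigned i = 0; i < OAM_SIZE; i += sizeof(uint32_t)) {
        uint32_t word;

        memcpy(&word, src + i, sizeof word);  /* one 32-bit load  */
        memcpy(dst + i, &word, sizeof word);  /* one 32-bit store */
        transfers++;
    }

    return transfers; /* 160 / 4 = 40 */
}
```

Fewer transfers do not guarantee a faster emulator overall, so a change like this still needs to be benchmarked on the target platform.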
Partially completed in 428e616c000483a3acb3edaf87bdcd845b153e4c. Reduced number of instructions when compiled for ARM Cortex M0+.
Work done on the oam-optimisation branch. Need to benchmark whether this improves performance.
There is no noticeable improvement in performance. The benchmark suggests that this branch is slightly slower. https://github.com/deltabeard/Peanut-GB/commits/oam-optimisation
Because this resulted in no noticeable performance improvement, this will be closed.