coreblocks
Implement a proper load/store unit
The current implementation of loads and stores is a serious bottleneck of the CPU. It has only one reservation station (RS) slot, so if the core fetches two load/store instructions close to each other, the second one stalls the decoding pipeline and therefore blocks possible instruction reordering. Equipping the load/store unit with a proper RS requires care because of possible data dependencies through memory. These must be handled so that, for example, a load does not return a value written by a store that is later in program order.
To implement this, one should first learn about implementation techniques from the literature. For example:
- Modern Processor Design - Fundamentals of Superscalar Processors, chapter 5.3 - "Memory Data Flow Techniques"
- Microprocessor Architecture - From Simple Pipelines to Chip Multiprocessors, chapter 5.2 - "Memory Accessing Instructions"
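
To make the data-dependency concern concrete, here is a minimal software sketch of one conservative ordering policy of the kind covered in the chapters above. It is not coreblocks code and not a proposed design: `Entry`, `LoadStoreQueue`, `try_load` and `retire_oldest` are hypothetical names, addresses are assumed to be word-aligned with no partial overlaps, and stores are assumed to write memory only when they reach the head of the queue. Under those assumptions, a load waits while any older store has an unknown address, takes its value from the youngest older store to the same address if there is one, and otherwise reads memory.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """One queue slot (hypothetical model, not a coreblocks structure)."""
    is_store: bool
    addr: Optional[int] = None  # None until the address operand is ready
    data: Optional[int] = None  # store data, or the value a load returned
    done: bool = False          # set once a load has obtained its value


class LoadStoreQueue:
    """Entries are kept in program order; index 0 is the oldest."""

    def __init__(self, memory: dict[int, int], size: int = 4):
        self.memory = memory
        self.size = size
        self.entries: list[Entry] = []

    def insert(self, entry: Entry) -> bool:
        # A full queue is what would stall the frontend in this model.
        if len(self.entries) >= self.size:
            return False
        self.entries.append(entry)
        return True

    def try_load(self, idx: int) -> Optional[int]:
        """Try to execute the load at idx; return None if it must wait."""
        load = self.entries[idx]
        assert not load.is_store and load.addr is not None
        # Scan older entries from youngest to oldest.
        for older in reversed(self.entries[:idx]):
            if not older.is_store:
                continue
            if older.addr is None:
                return None  # may alias with this load, so wait
            if older.addr == load.addr:
                if older.data is None:
                    return None  # store data not ready yet
                load.data, load.done = older.data, True  # forward
                return load.data
        # No older store to this address: memory holds the right value.
        load.data, load.done = self.memory.get(load.addr, 0), True
        return load.data

    def retire_oldest(self) -> bool:
        """Retire the head entry; stores write memory only here."""
        if not self.entries:
            return False
        head = self.entries[0]
        if head.is_store:
            if head.addr is None or head.data is None:
                return False
            self.memory[head.addr] = head.data
        elif not head.done:
            return False
        self.entries.pop(0)
        return True


# Program order: load [0x10]; store [0x10] <- 2; load [0x10]
mem = {0x10: 1}
lsq = LoadStoreQueue(mem)
lsq.insert(Entry(is_store=False, addr=0x10))
lsq.insert(Entry(is_store=True, addr=0x10, data=2))
lsq.insert(Entry(is_store=False, addr=0x10))
first = lsq.try_load(0)   # older than the store, so it reads memory
second = lsq.try_load(2)  # younger than the store, so it is forwarded
```

Because younger stores are never scanned and never reach memory before older entries retire, the first load cannot observe the later store, while the second load still sees it before it is written back. A real unit would also have to deal with access widths, exceptions, and MMIO, which this sketch ignores.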