riscv-perf-model
riscv-perf-model copied to clipboard
Multiple LSU pipeline support
Hi Arup and Knute, after I talked with Arup, I decided to implement store buffer first to enable data forwarding and then implement multi pipelines. Here is the draft pull. For my implementation, 30% tests passed, 69 tests failed out of 99, when I look those results I feel really lost on how to begin debug. Could you give me some guidance? Thank you!
@MayneMei Please reach out to Knute and see if he is available for a meeting to address your questions about debugging.
@MayneMei I fixed the macOS build.
BTW I am going to release my VLSU branch, which may conflict with some of your LSU changes. Let me know if you have any trouble updating your branch.
When I added print message inside the LSU.cpp, I can see the store buffer size was modified. But in the test it's always 0. However, the cycle time met the expectation of data forwarding, so I assume it's correct
@MayneMei my recent PR introduces some merge conflicts for you. If you have any trouble with the merge, please let me know and I can help resolve it.
@kathlenemagnus I resolved conflict. However during the regression test, there is an environment issue related to MacOS, and also test "BranchPred_test_Run" introduced a segfault. Other tests are all passed. Do you mind take a look at it? Thank you!
@MayneMei it looks like you rebased your branch incorrectly. The PR contains all of the fixes that are already in master -- perhaps you rebased all of the updates since you pulled last?
@klingaard I reseted the commit and rebased it.
Thanks for the suggestion! I should have a following PR in next week.
Knute Lingaard @.***>于2025年5月26日 周一12:37写道:
@.**** approved this pull request.
Thanks for making the proper fix. We can move forward with this, but I do have a suggestion for improving the performance/readability of it. Feel free to merge and make a subsequent PR to change the alg if you want.
In core/lsu/LSU.cpp https://github.com/riscv-software-src/riscv-perf-model/pull/216#discussion_r2107781558 :
std::vector<bool> coverage_mask(load_size, false);
uint32_t bytes_covered_count = 0;// Iterate through the store_buffer_ from youngest to oldest.// This ensures that if multiple stores write to the same byte,// the data from the youngest store is effectively used.for (auto it = store_buffer_.rbegin(); it != store_buffer_.rend(); ++it){const auto& store_info_ptr = *it; // LoadStoreInstInfoPtrconst InstPtr& store_inst_ptr = store_info_ptr->getInstPtr();const uint64_t store_addr = store_inst_ptr->getTargetVAddr();const uint32_t store_size = store_inst_ptr->getMemAccessSize();if (store_size == 0) {continue; // Skip stores that don't actually write data.Which ones are those?
In core/lsu/LSU.cpp https://github.com/riscv-software-src/riscv-perf-model/pull/216#discussion_r2107793832 :
store_buffer_.push_back(store_info_ptr);ILOG("Store added to store buffer: " << inst_ptr);- }
- bool LSU::tryStoreToLoadForwarding(const InstPtr& load_inst_ptr) const
- {
const uint64_t load_addr = load_inst_ptr->getTargetVAddr();const uint32_t load_size = load_inst_ptr->getMemAccessSize();// A load must have a non-zero size to access memory.if (load_size == 0) {return false;}std::vector<bool> coverage_mask(load_size, false);This defaults to a specialization of std::vector which is really a dynamic std::bitset. Suggest using a boost::dynamic_bitset for the implementation. See my next points...
Since we don't have a concept of a store buffer class, which could maintain the valid bits per cache line, you can kinda simulate that here.
First, you can make a rule that store to load forwarding cannot cross a cache line. Makes things a little easier.
If the load addr + size is good create a bitmask of the bytes that are needed by the load based on the size of the cache line
Iterate the store buffer like you're doing
Create a bitmask for the store (like you did for the load) and clear the bits in the load bit mask that match:
load_bytes_needed &= ~store_bytes; if (load_bytes_needed.none()) { return true; // we found all the stores that can forward to this load within the cache line }
— Reply to this email directly, view it on GitHub https://github.com/riscv-software-src/riscv-perf-model/pull/216#pullrequestreview-2869063971, or unsubscribe https://github.com/notifications/unsubscribe-auth/AQSAIYBUXLDB7WPTNGHL3M33ANUPLAVCNFSM6AAAAABRAUMZJKVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDQNRZGA3DGOJXGE . You are receiving this because you were mentioned.Message ID: @.*** com>
@MayneMei do you have any updates to push?
We'll go ahead and merge this. There are a few folks comping at the bit to enhance this block and are waiting for this PR.