gem5
gem5 copied to clipboard
CLFLUSH causes TLB assertion error
Describe the bug
Executing clflush
instruction in FS mode caused this assertion error,
src/sim/simulate.cc:199: info: Entering event queue @ 14392355269750. Starting simulation...
src/sim/simulate.cc:199: info: Entering event queue @ 15473386459500. Starting simulation...
src/sim/power_state.cc:105: warn: PowerState: Already in the requested power state, request ignored
src/sim/simulate.cc:199: info: Entering event queue @ 15474400558750. Starting simulation...
src/mem/ruby/system/Sequencer.cc:668: warn: Replacement policy updates recently became the responsibility of SLICC state machines. Make sure to setMRU() near callba
cks in .sm files!
gem5.opt: src/cpu/translation.hh:257: void gem5::DataTranslation<ExecContextPtr>::finish(const Fault&, const RequestPtr&, gem5::ThreadContext*, gem5::BaseMMU::Mode)
[with ExecContextPtr = gem5::TimingSimpleCPU*; gem5::Fault = std::shared_ptr<gem5::FaultBase>; gem5::RequestPtr = std::shared_ptr<gem5::Request>]: Assertion `mode
== state->mode' failed.
Program aborted at tick 15478398983501
The problem seems to be that,
- In the gem5's X86 ISA implementation,
clflush*
is a store instruction. - In TimingSimpleCPU, the translation of a store instruction will set the TLB to
BaseMMU::Write
mode. - While translation is happening, the state of the translation is changed to
BaseMMU::Read
as the mem request caused byclflush
is aisCacheClean
request. https://github.com/gem5/gem5/blob/v23.1.0.0/src/arch/x86/tlb.cc#L556-L564 - This resulted in a state mismatch, and subsequently caused the mention assertion error, which requires the state of the TLB and the state of translation to be the same.
Affects version v23.1
gem5 Modifications N/A
Additional information Related issue: #226 Related change: #592
Reverting #592 fixes the error on my side, but it's not ideal due to issue #226.
I think we can make the clflush*
instructions to be load instructions, but I'm not sure if that change is functionally correct.
A possible fix is to change line 257 to:
// CLFLUSHOPT/WB/FLUSH is treated as read for protection checks but modelled as write
assert(mode == state->mode || req->isCacheClean());
https://github.com/gem5/gem5/blob/c890e6b113a59caeec3eea9bafea4a03a471f89d/src/cpu/translation.hh#L257
Thanks for the fix @Lukas-Zenick. I wonder if this error only occurs for the timing CPU or if it also happens when using the O3 CPU?
If it's only the timing CPU, then maybe we should update the TimingSimpleCPU::writeMem
function. If it happens for all CPU models, then I think your fix in #1080 is correct.