Assertion when prefetcher is turned on
I am getting an issue similar to the one in "https://github.com/s5z/zsim/issues/34". It occurs with a prefetcher between l1d and l2 and memory type MD1; without the prefetcher the error does not appear. I would appreciate any comments on how to fix it. Here is the error:
Running on 24 Cores...
[S 0] Thread 28 starting
[S 0] Thread 29 starting
[S 0] Thread 30 starting
[S 0] Failed assertion on build/opt/coherence_ctrls.cpp:109 '*state == S || *state == E' (with '0 == 1 || 0')
[S 0] [28] Internal exception detected:
[S 0] [28] Code: 1
[S 0] [28] Address: 0x7ffff7195727
[S 0] [28] Description: Exception Code: ACCESS_INVALID_ADDRESS. Exception Address = 0x7ffff7195727. Access Type: UNKNOWN. Access Address = 0x000000000
[S 0] [28] Caused by invalid access to address 0x0
Here is the backtrace:
[S 0] [28] Backtrace (13/40 max frames)
[S 0] [28] /file0/Monolithic_3D_Work/Architecture/latest_zsim/zsim_Centos6/zsim/build/opt/zsim.cpp:1401 / InternalExceptionHandler(unsigned int, LEVEL_BASE::EXCEPTION_INFO*, LEVEL_VM::PHYSICAL_CONTEXT*, void*)
[S 0] [28] sha1.c:0 / LEVEL_PINCLIENT::IEH_CALLBACKS::NotifyInternalException(unsigned int, LEVEL_BASE::EXCEPTION_INFO*, LEVEL_VM::CONTEXT*)
[S 0] [28] /rsghome/sadegh/pin-2.10-45467-gcc.3.4.6-ia32_intel64-linux/intel64/bin/pinbin(_ZN8LEVEL_VM12SIGNALS_IMPL21HandleExceptionInToolEPNS_5PCTXTEPN10LEVEL_BASE14EXCEPTION_INFOE+0x1e8) [0x30753b58]
[S 0] [28] /rsghome/sadegh/pin-2.10-45467-gcc.3.4.6-ia32_intel64-linux/intel64/bin/pinbin(_ZN8LEVEL_VM12SIGNALS_IMPL19InternalHandlerSyncEiPN7BARECRT8SIGXINFOEPN5PINVM11ISIGCONTEXTEPPKNS_14SCT_ATTRIBUTESEPNS_5PCTXTE+0x33f) [0x3076686f]
[S 0] [28] /rsghome/sadegh/pin-2.10-45467-gcc.3.4.6-ia32_intel64-linux/intel64/bin/pinbin(_ZN8LEVEL_VM12SIGNALS_IMPL20HandlePhysicalSignalEPN7BARECRT8SIGXINFOEPN5PINVM11ISIGCONTEXTE+0x136) [0x30767816]
[S 0] [28] /rsghome/sadegh/pin-2.10-45467-gcc.3.4.6-ia32_intel64-linux/intel64/bin/pinbin(_ZN5PINVM28SIGNAL_DETAILS_LINUX_INTEL6415InternalHandlerEiPN7BARECRT8SIGXINFOEPv+0xa4) [0x3080bdd4]
[S 0] [28] /rsghome/sadegh/pin-2.10-45467-gcc.3.4.6-ia32_intel64-linux/intel64/bin/pinbin(BARECRT_SigReturnRt+0) [0x3083b910]
[S 0] [28] /file0/Monolithic_3D_Work/Architecture/latest_zsim/zsim_Centos6/zsim/build/opt/coherence_ctrls.cpp:139 / MESIBottomCC::processAccess(unsigned long, unsigned int, AccessType, unsigned long, unsigned int, unsigned int)
[S 0] [28] /file0/Monolithic_3D_Work/Architecture/latest_zsim/zsim_Centos6/zsim/build/opt/coherence_ctrls.h:471 / MESITerminalCC::processAccess(MemReq const&, int, unsigned long, unsigned long)
[S 0] [28] /file0/Monolithic_3D_Work/Architecture/latest_zsim/zsim_Centos6/zsim/build/opt/cache.cpp:78 / Cache::access(MemReq&)
[S 0] [28] /file0/Monolithic_3D_Work/Architecture/latest_zsim/zsim_Centos6/zsim/build/opt/filter_cache.h:186 / FilterCache::replace(unsigned long, unsigned int, bool, unsigned long)
[S 0] [28] /file0/Monolithic_3D_Work/Architecture/latest_zsim/zsim_Centos6/zsim/build/opt/filter_cache.h:150 / load
[S 0] [28] [0x7fffe4ed63fb]
C:Tool (or Pin) caused signal 11 at PC 0x7ffff7195727
[H] Child 20918 done
[H] Panic on build/opt/zsim_harness.cpp:118: Child 20918 (idx 0) exit was anomalous, killing simulation
Here is my cfg file snippet:
caches = {
  l1i = {
    array = {
      type = "SetAssoc";
      ways = 4;
    };
    caches = 4;
    latency = 1;
    parent = "l2";
    size = 32768; # 32KB
  };
  l1d = {
    array = {
      type = "SetAssoc";
      ways = 8;
    };
    caches = 4;
    latency = 2;
    parent = "l2prefetcher";
    size = 32768; # 32KB
  };
  l2prefetcher = {
    isPrefetcher = true;
    parent = "l2";
    prefetchers = 4;
  };
  l2 = {
    array = {
      type = "SetAssoc";
      ways = 8;
    };
    caches = 1;
    banks = 1;
    latency = 2;
    Wrlatency = 26;
    parent = "mem";
    repl = {
      type = "LRUNoSh";
    };
    size = 268435456; # 256MB (this value is 256 * 1024 * 1024 bytes, not 256KB)
  };
};
frequency = 6000;
lineSize = 64;
mem = {
  controllers = 1024;
  type = "MD1";
  latency = 16;
  wrLatency = 69;
  bandwidth = 32768;
};
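A quick sanity check of the sizes in the snippet above, as a minimal sketch. The helper name `numSets` and the constants are hypothetical and are not part of zsim; the only inputs are the numbers from the config. It shows that the l2 `size` value as written is 256 MiB, and how many sets each array would have under the usual size / (lineSize * ways) arithmetic.

```cpp
#include <cstdint>

// Hypothetical constants and helper for illustration only (not zsim code).
constexpr uint64_t KiB = 1024;
constexpr uint64_t MiB = 1024 * KiB;

// Number of sets in a set-associative array: size / (line size * ways).
constexpr uint64_t numSets(uint64_t sizeBytes, uint64_t lineBytes, uint64_t ways) {
    return sizeBytes / (lineBytes * ways);
}

// l1i / l1d: 32768 bytes is 32 KiB, as the config comments say.
static_assert(32768 == 32 * KiB, "L1 size should be 32 KiB");

// l2: 268435456 bytes is 256 MiB, i.e. 1024 times larger than 256 KiB.
static_assert(268435456 == 256 * MiB, "L2 size as written is 256 MiB");

// With lineSize = 64 and ways = 8:
static_assert(numSets(32768, 64, 8) == 64, "l1d would have 64 sets");
static_assert(numSets(268435456, 64, 8) == 524288, "l2 would have 524288 sets");
```

If a 256 KB L2 was intended, the value would be 262144 instead.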
Hi
A couple of weeks ago, when I wrote post #119, I made some changes to the prefetcher code so that it works with the weave model.
As far as I can tell, the reason you always hit this assertion with "Access Address = 0x000000000" is that the "new petitions" (the extra requests the prefetcher issues) create and modify timing records without a TimingEvent to handle them in the weave phase. I don't remember exactly whether the crash happens because the contention model receives a TimingRecord whose StartEvent is NULL (which seems likely, given the log you posted), or because some other assertion fires when something in the chain calls evRec->popRecord() and leaves nothing for the contention model to operate on.
Again, that is the main reason I asked in #119 for a better explanation of how the TimingEvent and TimingRecord structures are used, and of the weave phase overall. There are already good questions and answers in the forum (e.g. #53), but there is still room for clarification and documentation.
I'll make the patch presentable and send it over.
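To illustrate the failure mode described above, here is a minimal sketch. The types `Event`, `Record`, and `Recorder` are hypothetical stand-ins written for this example; they are not zsim's real TimingEvent, TimingRecord, or event recorder classes. The sketch only shows why a record with a null start event, or an already popped record, leads to a read through a null pointer (which matches an "Access Address" of 0x0 or a small offset such as 0x8), and how a consumer could guard against it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins, NOT zsim's real classes.
struct Event {
    uint64_t cycle;
};

struct Record {
    uint64_t reqCycle;
    uint64_t respCycle;
    Event* startEvent;  // must be non-null before the record is consumed
    Event* endEvent;
};

class Recorder {
  public:
    void pushRecord(const Record& r) { records.push_back(r); }
    bool hasRecord() const { return !records.empty(); }

    Record popRecord() {
        assert(!records.empty());  // popping twice would leave nothing to use
        Record r = records.back();
        records.pop_back();
        return r;
    }

  private:
    std::vector<Record> records;
};

// Consumer in the style of a contention model. Returns false instead of
// dereferencing when the record is missing or has no start event attached.
bool consumeRecord(Recorder& rec, uint64_t* startCycle) {
    if (!rec.hasRecord()) return false;          // someone already popped it
    Record r = rec.popRecord();
    if (r.startEvent == nullptr) return false;   // would be a null dereference
    *startCycle = r.startEvent->cycle;
    return true;
}
```

In this sketch, a request path that pushes a record without creating an event (as the prefetcher's extra requests apparently do) is rejected by `consumeRecord` rather than crashing; whether zsim should reject such records or always attach an event is a design question for the real patch.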
OK. The branch that contains the patch is here: https://github.com/rommelsv/zsim/tree/initial-pf
That branch actually has two new features: a variable that controls the size at which the HDF5 output is written to disk, and the number of entries you want the prefetcher to handle.
There is also a sample file that shows how to use it.
Remember, this is just a patch to make the prefetcher work with the weave models. There are still some things to debug before it works properly. Also keep in mind that comments are welcome: this is the DCU version for the Westmere architecture, so any update that extends the prefetcher itself, for L1-L2 but also for upper levels, will be very welcome.
Thanks rommelsv for posting the patch. I tried it but am getting the following compile errors while building zsim:
build/opt/virt/patchdefs.h: In function ‘void VirtInit()’:
build/opt/virt/patchdefs.h:41:4: error: ‘SYS_getcpu’ was not declared in this scope
 PF(SYS_getcpu, PatchGetcpu);
 ^
build/opt/virt/virt.cpp:68:48: note: in definition of macro ‘PF’
 #define PF(syscall, pfn) prePatchFunctions[syscall] = pfn;
 ^
scons: *** [build/opt/virt/virt.os] Error 1
build/opt/init.cpp: In function ‘CacheGroup* BuildCacheGroup(Config&, const string&, bool)’:
build/opt/init.cpp:406:14: error: redeclaration of ‘uint32_t size’
 uint32_t size = config.get<uint32_t>(prefix + "size", 64*1024);
 ^
build/opt/init.cpp:381:14: error: ‘uint32_t size’ previously declared here
 uint32_t size = config.get<uint32_t>(prefix + "size", 64*1024);
 ^
build/opt/init.cpp:407:14: error: redeclaration of ‘uint32_t banks’
 uint32_t banks = config.get<uint32_t>(prefix + "banks", 1);
 ^
build/opt/init.cpp:382:14: error: ‘uint32_t banks’ previously declared here
 uint32_t banks = config.get<uint32_t>(prefix + "banks", 1);
 ^
build/opt/init.cpp:408:14: error: redeclaration of ‘uint32_t caches’
 uint32_t caches = config.get<uint32_t>(prefix + "caches", 1);
 ^
build/opt/init.cpp:383:14: error: ‘uint32_t caches’ previously declared here
 uint32_t caches = config.get<uint32_t>(prefix + "caches", 1);
 ^
build/opt/init.cpp:410:14: error: redeclaration of ‘uint32_t bankSize’
 uint32_t bankSize = size/banks;
 ^
build/opt/init.cpp:387:14: error: ‘uint32_t bankSize’ previously declared here
 uint32_t bankSize = size/banks;
 ^
scons: *** [build/opt/init.os] Error 1
scons: building terminated because of errors.
Hey githubchik, sorry, one of those errors is my fault: the one in init.cpp. It seems I got confused while syncing with the latest zsim version; I generally work with an older one, just for consistency with other tests. I pushed one more commit, so it might be "ready" now. For the other error, take a look at #1. Let me know how it goes.
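For the ‘SYS_getcpu’ was not declared error, one common workaround on systems with older headers is a guarded fallback definition. This is a sketch under that assumption and not necessarily what #1 recommends; the function `currentCpu` is hypothetical and only demonstrates that the constant is usable. The value 309 is the x86-64 Linux syscall number for getcpu; other architectures use different numbers, so the fallback is restricted to x86-64.

```cpp
#include <sys/syscall.h>
#include <unistd.h>

// Fallback for old headers that lack SYS_getcpu (assumption: x86-64 Linux).
#if !defined(SYS_getcpu) && defined(__x86_64__)
#define SYS_getcpu 309  // x86-64 syscall number for getcpu
#endif

// Hypothetical helper: returns the CPU the calling thread is running on,
// or -1 if the raw syscall fails.
inline long currentCpu() {
    unsigned cpu = 0;
    unsigned node = 0;
    long rc = syscall(SYS_getcpu, &cpu, &node, nullptr);
    return rc == 0 ? static_cast<long>(cpu) : -1;
}
```

With such a guard placed before the point where the syscall table is populated, the `PF(SYS_getcpu, PatchGetcpu);` line should compile even when the system headers do not define the constant.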
Hi @rommelsv, I am now using your patch to do prefetching in zsim. It worked with your simple-pf.cfg, but when I tried other benchmarks (the Galois graph framework) it failed with a similar problem (access to an invalid address). The detailed report is below.
[S 0] pfc-0: pos 58 stride 1 conf 2 lastPrefetchPos 56 prefetchPos 59 fetchDepth 1
[S 0] [0] Internal exception detected:
[S 0] [0] Code: 1
[S 0] [0] Address: 0x7ffff632dfb7
[S 0] [0] Description: Exception Code: ACCESS_INVALID_ADDRESS. Exception Address = 0x7ffff632dfb7. Access Type: UNKNOWN. Access Address = 0x000000008
[S 0] [0] Caused by invalid access to address 0x8
[S 0] [0] Backtrace (12/40 max frames)
[S 0] [0] /home/chao/git_repos/rommelsv-zsim/build/opt/zsim.cpp:1392 / InternalExceptionHandler
[S 0] [0] :? / LEVEL_PINCLIENT::IEH_CALLBACKS::NotifyInternalException(unsigned int, LEVEL_BASE::EXCEPTION_INFO*, LEVEL_VM::CONTEXT*)
[S 0] [0] /home/chao/git_repos/pin/intel64/bin/pinbin(_ZN8LEVEL_VM12SIGNALS_IMPL19InternalHandlerSyncEiPN7BARECRT8SIGXINFOEPN5PINVM11ISIGCONTEXTEPPKNS_14SCT_ATTRIBUTESEPNS_5PCTXTEPj+0x444) [0x3043a9454]
[S 0] [0] /home/chao/git_repos/pin/intel64/bin/pinbin(_ZN8LEVEL_VM12SIGNALS_IMPL20HandlePhysicalSignalEPN7BARECRT8SIGXINFOEPN5PINVM11ISIGCONTEXTE+0x124) [0x3043aa1f4]
[S 0] [0] /home/chao/git_repos/pin/intel64/bin/pinbin(_ZN5PINVM28SIGNAL_DETAILS_LINUX_INTEL6415InternalHandlerEiPN7BARECRT8SIGXINFOEPv+0xe8) [0x304438c88]
[S 0] [0] /home/chao/git_repos/pin/intel64/bin/pinbin(BARECRT_SigReturnRt+0) [0x30446603c]
[S 0] [0] /home/chao/git_repos/rommelsv-zsim/build/opt/slab_alloc.h:108 / slab::SlabAlloc::alloc(unsigned long)
[S 0] [0] /home/chao/git_repos/rommelsv-zsim/build/opt/coherence_ctrls.cpp:109 / MESIBottomCC::processAccess(unsigned long, unsigned int, AccessType, unsigned long, unsigned int, unsigned int)
[S 0] [0] /home/chao/git_repos/rommelsv-zsim/build/opt/coherence_ctrls.h:472 / MESITerminalCC::processAccess(MemReq const&, int, unsigned long, unsigned long*)
[S 0] [0] /home/chao/git_repos/rommelsv-zsim/build/opt/cache.cpp:94 / Cache::access(MemReq&)
[S 0] [0] /home/chao/git_repos/rommelsv-zsim/build/opt/filter_cache.h:138 / FilterCache::replace(unsigned long, unsigned int, bool, unsigned long)
[S 0] [0] [0x7fffe3afc972]
C: Tool (or Pin) caused signal 11 at PC 0x7ffff632dfb7
[H] Child 368419 done
[H] Panic on build/opt/zsim_harness.cpp:123: Child 368419 (idx 0) exit was anomalous, killing simulation
Here is the config file I used.
sys = {
  cores = {
    simpleCore = {
      type = "Simple";
      cores = 8;
      dcache = "l1d";
      icache = "l1i";
    };
  };
  lineSize = 64;
  caches = {
    l1d = {
      caches = 8;
      size = 32768;
    };
    l1i = {
      caches = 8;
      size = 32768;
    };
    pfc = {
      isPrefetcher = True;
      prefetchers = 8;
      children = "l1d";
    };
    l2 = {
      caches = 8;
      size = 262144;
      children = "l1i|pfc"; // interleave
    };
    l3 = {
      caches = 1;
      size = 262144;
      children = "l2";
    };
  };
};
sim = {
  phaseLength = 10000;
  // attachDebugger = True;
  schedQuantum = 100; // switch threads frequently
  procStatsFilter = "l1.*|l2.*";
};
process0 = {
  command = "/home/chao/git_repos/Galois/build/debug/apps/bfs/bfs -algo=async /home/chao/git_repos/Galois/inputs/structured/rome99.gr -t 8";
  // command = "/home/chao/git_repos/zsim2/misc/ven_api/test3_instr";
  // command = "ls -l";
};
Could you give me a hint on how to fix this kind of problem? I really appreciate your help.
Hi @rommelsv, I found that a very similar problem (an assertion failure) arises when I simulate multiple cores in the config file. Could you please provide a working config file that simulates multiple cores with the zsim prefetcher?