Add robust signal captures to QUDA

Open weinbe2 opened this issue 1 year ago • 4 comments

Per a suggestion from @paboyle, we should add support for proper signal capturing in QUDA. (BSD references are welcome :) )

weinbe2, Mar 25 '24 20:03

https://github.com/paboyle/Grid/blob/develop/Grid/util/Init.cc

That file has the sort of gunk you need: trapping SIGFPE, enabling floating-point exceptions, printing a backtrace on SIGSEGV and SIGBUS, and, on x86, dumping the registers.

It compiles everywhere (macOS and Linux at least). From what I saw of PETSc, they've done a bit more with sigaction, and perhaps use a little library.

https://petsc.org/release/src/sys/error/signal.c.html

```cpp
void * Grid_backtrace_buffer[_NBACKTRACE];

void Grid_sa_signal_handler(int sig,siginfo_t *si,void * ptr)
{
  fprintf(stderr,"Caught signal %d\n",si->si_signo);
  fprintf(stderr,"  mem address %llx\n",(unsigned long long)si->si_addr);
  fprintf(stderr,"         code %d\n",si->si_code);
  // Linux/Posix
#ifdef __linux__
  // And x86 64bit
#ifdef __x86_64__
  ucontext_t * uc= (ucontext_t *)ptr;
  struct sigcontext *sc = (struct sigcontext *)&uc->uc_mcontext;
  fprintf(stderr,"  instruction %llx\n",(unsigned long long)sc->rip);
#define REG(A)  printf("  %s %lx\n",#A,sc-> A);
  REG(rdi);
  REG(rsi);
  REG(rbp);
  REG(rbx);
  REG(rdx);
  REG(rax);
  REG(rcx);
  REG(rsp);
  REG(rip);

  REG(r8);
  REG(r9);
  REG(r10);
  REG(r11);
  REG(r12);
  REG(r13);
  REG(r14);
  REG(r15);
#endif
#endif
  fflush(stderr);
  BACKTRACEFP(stderr);
  fprintf(stderr,"Called backtrace\n");
  fflush(stdout);
  fflush(stderr);
  exit(0);
  return;
};

void Grid_exit_handler(void)
{
  BACKTRACEFP(stdout);
  fflush(stdout);
}
void Grid_debug_handler_init(void)
{
  struct sigaction sa;
  sigemptyset (&sa.sa_mask);
  sa.sa_sigaction= Grid_sa_signal_handler;
  sa.sa_flags    = SA_SIGINFO;
  sigaction(SIGSEGV,&sa,NULL);
  sigaction(SIGTRAP,&sa,NULL);
  sigaction(SIGBUS,&sa,NULL);
  sigaction(SIGUSR2,&sa,NULL);

  feenableexcept( FE_INVALID|FE_OVERFLOW|FE_DIVBYZERO);

  sigaction(SIGFPE,&sa,NULL);
  sigaction(SIGKILL,&sa,NULL);
  sigaction(SIGILL,&sa,NULL);

  atexit(Grid_exit_handler);
}
```
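
For comparison, here is a minimal, independently written sketch of the same idea (not taken from Grid or PETSc), assuming a POSIX system that provides `<execinfo.h>` (glibc or macOS). The names `crash_handler`, `install_crash_handlers`, and `enable_fp_traps` are hypothetical and are not QUDA API. It differs from the snippet above in three ways. Inside the handler it uses only `write` and `backtrace_symbols_fd` rather than `fprintf`. It does not register SIGKILL, which cannot be caught. And it re-raises the signal, so the process ends with the signal's own status rather than `exit(0)`.

```cpp
#include <signal.h>
#include <cstring>
#include <execinfo.h>  // backtrace, backtrace_symbols_fd (glibc, macOS)
#include <fenv.h>      // feenableexcept (glibc extension)
#include <unistd.h>    // write, STDERR_FILENO

namespace {

constexpr int kMaxFrames = 64;

// write(2) is async-signal-safe; fprintf is not.
void safe_write(const char *msg) {
  ssize_t ignored = write(STDERR_FILENO, msg, std::strlen(msg));
  (void)ignored;
}

void crash_handler(int sig, siginfo_t * /*info*/, void * /*context*/) {
  safe_write("Caught fatal signal, backtrace follows:\n");
  void *frames[kMaxFrames];
  const int depth = backtrace(frames, kMaxFrames);
  backtrace_symbols_fd(frames, depth, STDERR_FILENO);
  // SA_RESETHAND has already restored the default action, so re-raising
  // terminates the process with the original signal status.
  raise(sig);
}

}  // namespace

// Returns the number of signals for which installation failed (0 on success).
int install_crash_handlers() {
  // The first backtrace() call may load libgcc dynamically and allocate,
  // so trigger that here rather than inside the handler.
  void *warmup[1];
  backtrace(warmup, 1);

  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sigemptyset(&sa.sa_mask);
  sa.sa_sigaction = crash_handler;
  sa.sa_flags = SA_SIGINFO | SA_RESETHAND;

  // SIGKILL and SIGSTOP cannot be caught, so they are not listed.
  const int fatal_signals[] = {SIGSEGV, SIGBUS, SIGFPE, SIGILL};
  int failures = 0;
  for (int s : fatal_signals) {
    if (sigaction(s, &sa, nullptr) != 0) ++failures;
  }
  return failures;
}

// Turn selected floating-point exceptions into SIGFPE.
// Returns false where feenableexcept is unavailable (e.g. macOS).
bool enable_fp_traps() {
#if defined(__GLIBC__)
  return feenableexcept(FE_INVALID | FE_OVERFLOW | FE_DIVBYZERO) != -1;
#else
  return false;
#endif
}
```

A caller would invoke `install_crash_handlers()` (and optionally `enable_fp_traps()`) once, early in start-up. Note that `feenableexcept` is a glibc extension, so the floating-point part is Linux-only in this sketch.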

Scidac5usqcd, Mar 25 '24 21:03

Oops, signed in on the wrong account... @paboyle

Scidac5usqcd, Mar 25 '24 21:03

Thanks @paboyle, I appreciate the references!

weinbe2, Mar 25 '24 22:03

We already support backward-cpp (https://github.com/bombela/backward-cpp); see https://github.com/lattice/quda/blob/9963aec17fc87385fce4717bb74872151b786419/CMakeLists.txt#L252

Also, @paboyle, please don't suggest any Grid samples, as Grid is GPL.

mathiaswagner, Apr 30 '24 06:04