
fail to get stack frame on crash 7.2.0

Open kasks80 opened this issue 7 years ago • 10 comments

When I try to get the stack frames with a command such as "bt -f", I see the warning "cannot determine starting stack frame for task".

Although the message is only a warning, I still cannot get the correct call stack with its registers.

Could anyone help solve this problem?

Thanks,

kasks80 avatar Oct 24 '17 10:10 kasks80

> when trying to get stack frame with the command such as "bt -f" then, I could see a warning message like below "cannot determine starting stack frame for task".

What kind of dumpfile is it? What architecture?

> and still could not get the correct callstack with its registers even though message was just a warning.

What do you mean by "with its registers"?

crash-utility avatar Oct 24 '17 14:10 crash-utility

> What kind of dumpfile is it? What architecture?

The architecture is arm64 (I am currently developing an Android project). How can I check what kind of dumpfile it is?

I was unclear. I mean that when using "bt -t" or "bt -a", I see the following:

PID: 16645  TASK: ffffffc027fa9180  CPU: 0  COMMAND: "Binder:16621_1"
bt: WARNING: cannot determine starting stack frame for task ffffffc027fa9180

What I actually expected is output like the following (example copied from "help bt"):

crash> bt -a
PID: 286    TASK: c0b3a000  CPU: 0   COMMAND: "in.rlogind"
#0 [c0b3be90] crash_save_current_state at c011aed0
#1 [c0b3bea4] panic at c011367c
#2 [c0b3bee8] tulip_interrupt at c01bc820
#3 [c0b3bf08] handle_IRQ_event at c010a551
#4 [c0b3bf2c] do_8259A_IRQ at c010a319
#5 [c0b3bf3c] do_IRQ at c010a653
#6 [c0b3bfbc] ret_from_intr at c0109634
   EAX: 00000000  EBX: c0e68280  ECX: 00000000  EDX: 00000004  EBP: c0b3bfbc
   DS:  0018      ESI: 00000004  ES:  0018      EDI: c0e68284
   CS:  0010      EIP: c012f803  ERR: ffffff09  EFLAGS: 00000246
#7 [c0b3bfbc] sys_select at c012f803
#8 [c0b3bfc0] system_call at c0109598
   EAX: 0000008e  EBX: 00000004  ECX: bfffc9a0  EDX: 00000000
   DS:  002b      ESI: bfffc8a0  ES:  002b      EDI: 00000000
   SS:  002b      ESP: bfffc82c  EBP: bfffd224
   CS:  0023      EIP: 400d032e  ERR: 0000008e  EFLAGS: 00000246

kasks80 avatar Oct 25 '17 01:10 kasks80

> What kind of dumpfile is it? What architecture? -> arch is arm64.(current I'm developing the android project) -> how can i check kind of dumpfile?

What is the output of "help -D"?

> I'm confused. when using "bt -t" or "bt -a", i could see below things.
>
> PID: 16645  TASK: ffffffc027fa9180  CPU: 0  COMMAND: "Binder:16621_1"
> bt: WARNING: cannot determine starting stack frame for task ffffffc027fa9180


The WARNING message comes from here in arm64.c, where it is trying to determine the starting pc and sp register values:

static void
arm64_get_stack_frame(struct bt_info *bt, ulong *pcp, ulong *spp)
{
    int ret;
    struct arm64_stackframe stackframe = { 0 };

    if (DUMPFILE() && is_task_active(bt->task))
        ret = arm64_get_dumpfile_stackframe(bt, &stackframe);
    else
        ret = arm64_get_stackframe(bt, &stackframe);

    if (!ret)
        error(WARNING,
            "cannot determine starting stack frame for task %lx\n",
                bt->task);

    bt->frameptr = stackframe.fp;
    if (pcp)
        *pcp = stackframe.pc;
    if (spp)
        *spp = stackframe.sp;
}

Depending upon whether the "Binder:16621_1" task was active at the time of the crash, it failed either in arm64_get_dumpfile_stackframe() or arm64_get_stackframe(). If it was active, I'm presuming that it failed at the top of arm64_get_dumpfile_stackframe():

static int
arm64_get_dumpfile_stackframe(struct bt_info *bt, struct arm64_stackframe *frame)
{
    struct machine_specific *ms = machdep->machspec;
    struct arm64_pt_regs *ptregs;

    if (!ms->panic_task_regs ||
        (!ms->panic_task_regs[bt->tc->processor].sp &&
         !ms->panic_task_regs[bt->tc->processor].pc)) {
        bt->flags |= BT_REGS_NOT_FOUND;
        return FALSE;
    }

    ...

You can determine whether the ms->panic_task_regs were set appropriately like this:

crash> help -m
... [ cut ] ...
panic_task_regs: 39f3410
...

If it is non-zero (a user-space address), then perhaps the .sp or .pc values for cpu 0 are NULL. Those values get set during initialization by the arm64_get_crash_notes() function. You could debug that function to see whether they are getting set. It looks like there would have been some type of warning message during initialization if it failed, but you didn't send the output of the command.

Dave

crash-utility avatar Oct 25 '17 13:10 crash-utility

I have the same problem. On an ARM64 platform I also cannot recover the task's backtrace:

crash> help -m
... [ cut ] ...
panic_task_regs: 0
...

However, I can recover the pc and sp values from the kmsg output or by other means. Is there any way to set x0~x30, PC, and SP by hand so that bt works?

8yanghao avatar Apr 22 '19 14:04 8yanghao



You can try using the bt command's "-I ip -S sp" options and see whether you can make it work. The two options are designed for the x86_64 architecture, but it's worth a try. For example, take the pc and sp values from the log:

[ 744.483512] pc : machine_kexec+0x58/0x3e8
[ 744.531499] lr : machine_kexec+0x58/0x3e8
[ 744.579486] sp : ffff000011b4f6a0

and then try "bt -I machine_kexec+0x58 -S ffff000011b4f6a0". If that doesn't work, try adding or subtracting 8 from the stack address.

Dave

crash-utility avatar Apr 23 '19 13:04 crash-utility

On arm64, the crash tool checks pstate.

slowpy avatar Jul 09 '20 02:07 slowpy

> I also meet such problem. on ARM64 platform I also can't restore the task bt; crash> help -m ... [ cut ] ... panic_task_regs: 0 ... but actually I get restore the pc, sp ptr from kmsg print or other way, is there any way to set the x0~x31 and PC, SP ptr by hand? and then the bt can work fine?

Hi,

I have already debugged the function, and it seems there is no crash_notes symbol. Did you fix the issue on your side?

Thanks.

taigerhu avatar Oct 28 '20 08:10 taigerhu

Hi guys,

I was able to get the bt (still with one error message) by adding/subtracting 8 from sp on arm64 (yes, both worked). The printed register values are not correct, though.

image

P.S: help -D and help -m output is attached. help_D.txt help_m.txt

praton1729 avatar Aug 13 '21 09:08 praton1729

Hi guys, I have the same problem on an ARM64 platform. Adding/subtracting 8 from sp does not work either.

Corefile type: vmcore_data
panic_task_regs: 0

crash> bt 1
PID: 1  TASK: ffffff95c514bd40  CPU: 7  COMMAND: "init"
bt: WARNING: cannot determine starting stack frame for task ffffff95c514bd40
crash> bt -I 0x00000558f52a1a4 -S 0x000007fca225500 1
PID: 1  TASK: ffffff95c514bd40  CPU: 7  COMMAND: "init"
bt: non-process stack address for this task: 7fca225500 (valid range: ffffffc010060000 - ffffffc010064000)
crash> bt -I 0x00000558f52a1a4 -S 0x000007fca225508 1
PID: 1  TASK: ffffff95c514bd40  CPU: 7  COMMAND: "init"
bt: non-process stack address for this task: 7fca225508 (valid range: ffffffc010060000 - ffffffc010064000)

Does anyone have any ideas to solve this problem?

Thanks Very Much

YuanyeMa avatar Sep 01 '22 02:09 YuanyeMa

When crash starts, the following messages are printed. Do they matter?

WARNING: cpu 0: cannot find NT_PRSTATUS note
WARNING: cpu 1: cannot find NT_PRSTATUS note
WARNING: cpu 2: cannot find NT_PRSTATUS note
WARNING: cpu 3: cannot find NT_PRSTATUS note
WARNING: cpu 4: cannot find NT_PRSTATUS note
WARNING: cpu 5: cannot find NT_PRSTATUS note
WARNING: cpu 6: cannot find NT_PRSTATUS note
WARNING: cpu 7: cannot find NT_PRSTATUS note

Also, could the issue be related to zram being enabled in the kernel?

Thanks.

YuanyeMa avatar Sep 02 '22 08:09 YuanyeMa