
[ARM] Crash failed to parse the core header when makedumpfile is compiled with -D_TIME_BITS=64

Open zyxiaooo opened this issue 11 months ago • 5 comments

Hi,

I have ARM platforms running kernel 5.15. We recently switched to 64-bit time and found that the core then fails to open with the following error:

crash: diskdump / compressed kdump: cannot malloc block_size buffer

All data after the timestamp is shifted by 12 bytes in the core header:

struct disk_dump_header {
        char                    signature[SIG_LEN];     /* = "DISKDUMP" */
        int                     header_version; /* Dump header version */
        struct new_utsname      utsname;        /* copy of system_utsname */
        struct timeval          timestamp;      /* Time stamp */
        uint8_t dummy[12];   <<<<<<<<<<<<<<<<<<<<<<<<<< adding this temporarily works around the issue

I also tried to compile crash with the following command to match the makedumpfile build:

 make target=ARM CFLAGS="-D_TIME_BITS=64"

But got another error:

WARNING: compressed kdump: invalid nr_cpus value: 0
Segmentation fault

Any idea how to correctly handle the core dump with ARM + -D_TIME_BITS=64 ?

zyxiaooo avatar Apr 01 '24 04:04 zyxiaooo

Hi,

I have ARM platforms running kernel 5.15. We recently switched to 64-bit time and found that the core then fails to open with the following error:

crash: diskdump / compressed kdump: cannot malloc block_size buffer

The error message just reports that realloc failed in diskdump.c:read_dump_header(). Could you check why realloc fails? Is it due to a memory shortage or an incorrect block_size value? Adding a strerror() call may also help.
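To tell the two cases apart, the allocation could be wrapped so that a bad size and a genuine out-of-memory condition produce different messages. The following is only a minimal sketch: `alloc_block_buffer` is a hypothetical helper and is not part of crash, and the message wording is made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of crash): resize *bufp to block_size bytes.
 * Returns 0 on success. Returns -1 on failure and writes a reason to msg,
 * distinguishing an invalid size from an allocation failure.
 */
int alloc_block_buffer(void **bufp, long block_size, char *msg, size_t msglen)
{
    void *p;

    assert(bufp != NULL && msg != NULL && msglen > 0);

    /* A zero or negative size points at a bad header, not at memory pressure. */
    if (block_size <= 0) {
        snprintf(msg, msglen, "invalid block_size: %ld", block_size);
        return -1;
    }

    errno = 0;
    p = realloc(*bufp, (size_t)block_size);
    if (p == NULL) {
        /* strerror() turns errno (typically ENOMEM here) into readable text. */
        snprintf(msg, msglen, "cannot allocate %ld bytes: %s",
                 block_size, strerror(errno));
        return -1;
    }

    *bufp = p;
    msg[0] = '\0';
    return 0;
}
```

With a helper like this, a header that yields block_size == 0 is reported as an invalid size instead of as an allocation failure.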

All data after the timestamp is shifted by 12 bytes in the core header:

struct disk_dump_header {
        char                    signature[SIG_LEN];     /* = "DISKDUMP" */
        int                     header_version; /* Dump header version */
        struct new_utsname      utsname;        /* copy of system_utsname */
        struct timeval          timestamp;      /* Time stamp */
        uint8_t dummy[12];   <<<<<<<<<<<<<<<<<<<<<<<<<< adding this temporarily works around the issue

Yes, that makes sense, because the 64-bit timestamp takes up more space.
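The 12-byte figure is consistent with how the header would be laid out on 32-bit ARM. The sketch below models both layouts with fixed-width types so that it gives the same result on any host. It rests on several assumptions that are not confirmed in this thread: SIG_LEN is 8, struct new_utsname is six 65-byte strings, and `next_field` is a hypothetical stand-in for whatever follows the timestamp. `_Alignas(8)` mimics the ARM EABI rule that 64-bit integers are 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIG_LEN   8    /* assumed: length of "DISKDUMP" */
#define UTS_FIELD 65   /* assumed: __NEW_UTS_LEN + 1 */

/* Assumed layout of struct new_utsname: six fixed-size strings (390 bytes). */
struct utsname_model {
    char sysname[UTS_FIELD];
    char nodename[UTS_FIELD];
    char release[UTS_FIELD];
    char version[UTS_FIELD];
    char machine[UTS_FIELD];
    char domainname[UTS_FIELD];
};

/* struct timeval on 32-bit ARM with 32-bit time: 8 bytes, 4-byte aligned. */
struct timeval32_model {
    int32_t tv_sec;
    int32_t tv_usec;
};

/* struct timeval with 64-bit time: 16 bytes, 8-byte aligned as on ARM EABI. */
struct timeval64_model {
    _Alignas(8) int64_t tv_sec;
    int64_t tv_usec;
};

static_assert(sizeof(struct timeval32_model) == 8, "unexpected 32-bit size");
static_assert(sizeof(struct timeval64_model) == 16, "unexpected 64-bit size");

struct header32_model {
    char signature[SIG_LEN];
    int32_t header_version;
    struct utsname_model utsname;
    struct timeval32_model timestamp;
    uint32_t next_field;   /* hypothetical: first field after the timestamp */
};

struct header64_model {
    char signature[SIG_LEN];
    int32_t header_version;
    struct utsname_model utsname;
    struct timeval64_model timestamp;
    uint32_t next_field;   /* hypothetical: first field after the timestamp */
};

/* Offset of the first field after the timestamp in each layout. */
size_t next_field_offset_32(void)
{
    return offsetof(struct header32_model, next_field);
}

size_t next_field_offset_64(void)
{
    return offsetof(struct header64_model, next_field);
}

/* How far everything after the timestamp moves when time becomes 64-bit. */
size_t timestamp_shift(void)
{
    return next_field_offset_64() - next_field_offset_32();
}
```

Under these assumptions the timestamp grows by 8 bytes and needs 4 extra bytes of padding in front of it to reach 8-byte alignment, so `timestamp_shift()` returns 12, matching the size of the dummy array in the workaround.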

I also tried to compile crash with the following command to match the makedumpfile build:

 make target=ARM CFLAGS="-D_TIME_BITS=64"

On my machine (Fedora 38):

$ cat /usr/include/bits/types/struct_timeval.h
struct timeval
{
#ifdef __USE_TIME_BITS64
  __time64_t tv_sec;            /* Seconds.  */
  __suseconds64_t tv_usec;      /* Microseconds.  */
#else
  __time_t tv_sec;              /* Seconds.  */
  __suseconds_t tv_usec;        /* Microseconds.  */
#endif
};

I guess (not tried) it should be CFLAGS="-D__USE_TIME_BITS64" in order to enable the 64-bit timestamp.

But got another error:

WARNING: compressed kdump: invalid nr_cpus value: 0
Segmentation fault

A segfault can mean many things. A gdb backtrace (bt) would help with further debugging.

Any idea how to correctly handle the core dump with ARM + -D_TIME_BITS=64 ?

liutgnu avatar Apr 02 '24 02:04 liutgnu

Thanks for the reply.

crash: diskdump / compressed kdump: cannot malloc block_size buffer

This happens because the block_size it reads is 0, which is a result of the header mismatch.

I guess (not tried) it should be CFLAGS="-D__USE_TIME_BITS64" in order to enable the 64-bit timestamp.

I also tried this but got the same error.

WARNING: compressed kdump: invalid nr_cpus value: 0
Segmentation fault

I think this is still the header mismatch, because nr_cpus is not 0 in the test core. I haven't had a chance to dig further, though.

zyxiaooo avatar Apr 02 '24 18:04 zyxiaooo

Yes, block_size == 0 is abnormal. The value comes from the disk_dump_header, which is written by makedumpfile. It would be better to take the vmcore, dump the disk_dump_header as hex, and verify whether the problem comes from makedumpfile or from the kernel itself.
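For reference, the first bytes of a dump file can be printed as hex with a few lines of C. This is only a rough sketch: `format_hex_line` and `dump_file_prefix` are hypothetical helpers written for this thread, not part of crash or makedumpfile, and the path and byte count are whatever the caller supplies.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define HEX_LINE_MAX 96   /* large enough for the longest line emitted below */

/*
 * Format up to 16 bytes as "OFFSET  xx xx ...  |ascii|" into out.
 * Returns the length of the formatted line.
 */
size_t format_hex_line(const unsigned char *data, size_t len,
                       unsigned long offset, char out[HEX_LINE_MAX])
{
    size_t i;
    int pos;

    assert(data != NULL && out != NULL);
    if (len > 16)
        len = 16;

    pos = snprintf(out, HEX_LINE_MAX, "%08lx ", offset);
    for (i = 0; i < len; i++)
        pos += snprintf(out + pos, HEX_LINE_MAX - pos, " %02x", data[i]);
    pos += snprintf(out + pos, HEX_LINE_MAX - pos, "  |");
    for (i = 0; i < len; i++)
        pos += snprintf(out + pos, HEX_LINE_MAX - pos, "%c",
                        isprint(data[i]) ? data[i] : '.');
    snprintf(out + pos, HEX_LINE_MAX - pos, "|");
    return strlen(out);
}

/* Print the first nbytes of the file at path, 16 bytes per line. */
int dump_file_prefix(const char *path, size_t nbytes)
{
    unsigned char chunk[16];
    char line[HEX_LINE_MAX];
    unsigned long offset = 0;
    size_t want, got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while (nbytes > 0) {
        want = nbytes < sizeof chunk ? nbytes : sizeof chunk;
        got = fread(chunk, 1, want, fp);
        if (got == 0)
            break;   /* end of file or read error */
        format_hex_line(chunk, got, offset, line);
        puts(line);
        offset += (unsigned long)got;
        nbytes -= got;
    }
    fclose(fp);
    return 0;
}
```

Dumping the first few hundred bytes of a working core and a failing one side by side should make any extra bytes around the timestamp easy to spot.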

liutgnu avatar Apr 03 '24 02:04 liutgnu

With kernel 5.15 and a makedumpfile compiled with -D_TIME_BITS=64, I hexdumped the generated core header, and I can see that there are 12 more bytes around the timestamp field.

With exactly the same kernel and a makedumpfile compiled WITHOUT -D_TIME_BITS=64, everything works fine.

So I guess there is some issue with makedumpfile when it is built with that flag.

I'm not sure whether it is related, but note that we generate the core in flat mode first (makedumpfile -F -c), then convert it back to non-flat mode (makedumpfile -R). Just letting you know in case the issue only occurs in this scenario.

zyxiaooo avatar Apr 03 '24 03:04 zyxiaooo

I'm not sure either. Sorry, I cannot provide any further useful info.

liutgnu avatar Apr 03 '24 08:04 liutgnu