lnav icon indicating copy to clipboard operation
lnav copied to clipboard

Crash whenever opening a (large) logfile and pressing `:`

Open rggjan opened this issue 3 years ago • 13 comments

I’m using lnav 0.11.0 on MacOS (M1) Opening lnav on a large file works fine with: lnav ~/Downloads/file.txt However, as soon as I try to enter a commant, when I press : lnav crashes:

❯ lnav ~/Downloads/file.txt


==== GURU MEDITATION ====
Unfortunately, lnav has crashed, sorry for the inconvenience.

You can help improve lnav by sending the following file to [email protected] :
  /Users/jrueegg/.config/lnav/crash-2022-09-07-14-16-33.5793.log
=========================

Would you like to attach a debugger? (y/N) n

Inside the log file, I can see these lines just before it crashes:

│2022-09-07T14:16:33.717 D t0 hotkeys.cc:172 executing key sequence x3a: :prompt command                                                                                                                                                          │
│2022-09-07T14:16:33.717 I t0 command_executor.cc:152 Executing: prompt command                                                                                                                                                                   │
│2022-09-07T14:16:33.765 E t0 lnav_log.cc:420 Received signal: 11

running it in lldb, I get:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x16f603fd0)
    frame #0: 0x0000000100022aa0 lnav`std::__1::__list_imp<data_parser::element, std::__1::allocator<data_parser::element> >::clear()
lnav`std::__1::__list_imp<data_parser::element, std::__1::allocator<data_parser::element> >::clear:
->  0x100022aa0 <+0>:  stp    x22, x21, [sp, #-0x30]!
    0x100022aa4 <+4>:  stp    x20, x19, [sp, #0x10]
    0x100022aa8 <+8>:  stp    x29, x30, [sp, #0x20]
    0x100022aac <+12>: add    x29, sp, #0x20
Target 0: (lnav) stopped.

This happens for most of our logfiles, although with small sample logfiles everything works as expected.

rggjan avatar Sep 13 '22 07:09 rggjan

Not sure if this is related to: https://github.com/tstack/lnav/issues/490

rggjan avatar Sep 13 '22 07:09 rggjan

Since you're already running it in lldb, can you try getting a full backtrace (enter bt).

tstack avatar Sep 13 '22 13:09 tstack

Sure, here you go: bt.txt

Seems like a pretty deep stacktrace to me... 😵‍💫

rggjan avatar Sep 13 '22 13:09 rggjan

Yeah, the deepness would be the problem, it's probably stuck in a loop and overflowing the stack.

tstack avatar Sep 13 '22 14:09 tstack

Is there anything else I can do to help debug this?

It seems to be related to the length of the log file. If I deleted the beginning or the end of the log file, it works fine, suggesting that the length (and content?) of the file is the issue and not one single log line.

The log is about 2000 lines of mostly this kind of content (lots of console color highlighting commands):

[32m2022-08-31 15:47:16.637[0m [34mfoo.bar.baz:3434[5252][0m [1;30mINFO[0m Some log message

Could all the open brackets [ of the ANSI color highlighting be the issue? (There's sometimes GB sized logs with these kinds of messages)

rggjan avatar Sep 14 '22 07:09 rggjan

@rggjan Thanks for the extra info! Is this log actually recognized as a log by lnav? It's good to know there's ANSI sequences in the log.

tstack avatar Sep 14 '22 14:09 tstack

What exactly do you mean as recognized? But yes, I think so. At least it opens, nicely shows all the colors defined by the ANSI sequences, lets me navigate etc if I don't press :.

rggjan avatar Sep 14 '22 14:09 rggjan

What exactly do you mean as recognized?

In the top left corner, does it show "LOG" like this:

Screen Region 2022-09-14 at 07 31 35

Note that the third "breadcrumb" there is the log format name, syslog_log in this case.

Or, does it show "TEXT" ?

tstack avatar Sep 14 '22 14:09 tstack

Yes, it is shown as LOG and then block_log

rggjan avatar Sep 14 '22 15:09 rggjan

Okay, so lnav is not parsing log messages with ANSI sequences around the timestamp correctly. There's machinery for converting ANSI sequences, that's why you see the colors displayed correctly. However, that machinery is not used during indexing. I'll need a couple days to make changes to process the ANSI during indexing and do it in a performant manner.

I am also able to replicate the crash with my own test data.

tstack avatar Sep 14 '22 16:09 tstack

I've pushed a change that should fix the crash but there are still other parts of lnav that need to be updated to handle the escape sequences correctly. I'm not sure if you're able to compile or not, but the top-of-tree now depends on pcre2. So, you'll need to make sure that is installed.

tstack avatar Sep 15 '22 04:09 tstack

Yes, I managed to compile and test. It seems to work fine with the fixes, thanks a lot!

There still seems to be a crash whenever I exit lnav with q, but it's not such a big issue (should I create a new issue in the tracker for that?):

2558   │ 2022-09-15T09:31:44.405 I t0 lnav.indexing.cc:263 closed log file: /Users/jrueegg/lib/lnav/test/logfile_syslog.0
2559   │ 2022-09-15T09:31:44.405 D t0 logfile_sub_source.cc:632 forced to full rebuild
2560   │ 2022-09-15T09:31:44.405 D t0 logfile_sub_source.cc:668 0: file closed, doing full rebuild
2561   │ 2022-09-15T09:31:44.405 D t0 logfile_sub_source.cc:1030 redoing search
2562   │ 2022-09-15T09:31:44.406 W t0 sql_util.cc:591 (28) close deferred due to unfinalized statement: "
2563   │ SELECT tbl_name FROM sqlite_master WHERE sql LIKE 'CREATE VIRTUAL TABLE%'
2564   │ "
2565   │ 2022-09-15T09:31:44.406 E t0 sql_util.cc:591 (21) API call with invalid database connection pointer
2566   │ 2022-09-15T09:31:44.406 E t0 sql_util.cc:593 failed postcondition `code != 21'
2567   │ 2022-09-15T09:31:44.406 E t0 lnav_log.cc:414 Received signal: 6

and

(lldb) run
Process 75794 launched: '/Users/jrueegg/lib/lnav/src/lnav' (arm64)
Process 75794 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x000000019756f224 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
->  0x19756f224 <+8>:  b.lo   0x19756f244               ; <+40>
    0x19756f228 <+12>: pacibsp
    0x19756f22c <+16>: stp    x29, x30, [sp, #-0x10]!
    0x19756f230 <+20>: mov    x29, sp
Target 0: (lnav) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x000000019756f224 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000001975a5cec libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001974a7a50 libsystem_c.dylib`raise + 32
    frame #3: 0x00000001003a4c70 lnav`log_abort() at lnav_log.cc:611:5 [opt]
    frame #4: 0x00000001002ac360 lnav`sqlite_logger(dummy=<unavailable>, code=21, msg=<unavailable>) at sql_util.cc:593:5 [opt]
    frame #5: 0x000000019d610834 libsqlite3.dylib`sqlite3_log + 196
    frame #6: 0x000000019d68f9c0 libsqlite3.dylib`sqlite3_db_release_memory + 240
    frame #7: 0x000000019d609638 libsqlite3.dylib`sqlite3VdbeReset + 172
    frame #8: 0x000000019d6094e8 libsqlite3.dylib`sqlite3_finalize + 84
    frame #9: 0x000000010002027c lnav`final_action<main::$_11>::~final_action() [inlined] auto_mem<sqlite3_stmt, &(free)>::reset(this=0x000000016fdfd370, ptr=0x0000000000000000) at auto_mem.hh:136:17 [opt]
    frame #10: 0x000000010002026c lnav`final_action<main::$_11>::~final_action() [inlined] auto_mem<sqlite3_stmt, &(free)>::~auto_mem(this=0x000000016fdfd370) at auto_mem.hh:87:25 [opt]
    frame #11: 0x000000010002026c lnav`final_action<main::$_11>::~final_action() [inlined] auto_mem<sqlite3_stmt, &(free)>::~auto_mem(this=0x000000016fdfd370) at auto_mem.hh:87:17 [opt]
    frame #12: 0x000000010002026c lnav`final_action<main::$_11>::~final_action() [inlined] main::$_11::operator(this=<unavailable>)() const at lnav.cc:2163:5 [opt]
    frame #13: 0x000000010001ff30 lnav`final_action<main::$_11>::~final_action() [inlined] final_action<main::$_11>::~final_action(this=<unavailable>) at lnav_util.hh:205:9 [opt]
    frame #14: 0x000000010001ff30 lnav`final_action<main::$_11>::~final_action(this=0x000000016fdfdc68) at lnav_util.hh:204:5 [opt]
    frame #15: 0x0000000100009c10 lnav`main(argc=0, argv=0x0000000000000000) at lnav.cc:3248:1 [opt]
    frame #16: 0x000000019727fe50 dyld`start + 2544

(it's also possible there's still some compilation / linking issue in my setup causing this)

rggjan avatar Sep 15 '22 07:09 rggjan

Thanks for the extra bug report! I think it was a dumb mistake in some cleanup code, I've pushed fadcea440322847cf96ff2db72087df5af9a8062 to try and address it.

tstack avatar Sep 15 '22 13:09 tstack

Just tested the newest version (on master: 2ad379c4546cb2857d42c2901a0717e6f537ff39).

Opening my large log file (which took less than a second in the old version to open and display colored log lines) is now hanging for 2 minutes already (I think) in:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00000001a58d3f28 libsystem_platform.dylib`_platform_memmove + 88
    frame #1: 0x0000000100d00704 lnav`ptime_b_slow(dst=0x000000016f323a40, str="[LONG_LOG_STRING_REDACTED]"..., time_len=3675047, time_fmt=0x00000001010ae048, tm_out=0x000000016f323a40, tv_out=0x000000016f323a30, convert_local=true) at date_time_scanner.cc:142:17 [opt]
    frame #5: 0x0000000100e71728 lnav`data_scanner::tokenize2(this=0x000000016f323cc0) at data_scanner_re.re:65:27 [opt]
    frame #6: 0x0000000100b966ac lnav`lnav::document::structure_walker::walk(this=0x000000016f323ca8) at document.sections.cc:258:50 [opt]
    frame #7: 0x0000000100b96604 lnav`lnav::document::discover_structure(al=<unavailable>, lr=<unavailable>) at document.sections.cc:518:37 [opt]
    frame #8: 0x0000000100cd9a4c lnav`logfile_sub_source::text_crumbs_for_line(this=0x0000000101107378, line=289145, crumbs=size=4) at logfile_sub_source.cc:2257:36 [opt]
    frame #9: 0x0000000100dd9eb0 lnav`lnav_crumb_source() at view_helpers.cc:1229:14 [opt]
    frame #10: 0x0000000100b78bc0 lnav`breadcrumb_curses::do_update() [inlined] std::__1::__function::__value_func<std::__1::vector<breadcrumb::crumb, std::__1::allocator<breadcrumb::crumb> > ()>::operator(this=0x00000001010ef2d0)() const at function.h:505:16 [opt]
    frame #11: 0x0000000100b78ba8 lnav`breadcrumb_curses::do_update() [inlined] std::__1::function<std::__1::vector<breadcrumb::crumb, std::__1::allocator<breadcrumb::crumb> > ()>::operator(this= Function = lnav_crumb_source() )() const at function.h:1182:12 [opt]
    frame #12: 0x0000000100b78ba8 lnav`breadcrumb_curses::do_update(this=0x00000001010ef290) at breadcrumb_curses.cc:62:53 [opt]
    frame #13: 0x0000000100af3efc lnav`looper() at lnav.cc:1574:29 [opt]
    frame #14: 0x0000000100ae373c lnav`main(argc=0, argv=0x0000000000000000) at lnav.cc:3179:17 [opt]
    frame #15: 0x00000001a557fe50 dyld`start + 2544

For a small log file it opens after a couple of seconds (still very slow). Also, every time I press : to enter a command it takes a couple of seconds.

And exiting still crashes in:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00000001a586f224 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000001a58a5cec libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001a57a7a50 libsystem_c.dylib`raise + 32
    frame #3: 0x00000001003d215c lnav`log_abort() at lnav_log.cc:611:5 [opt]
    frame #4: 0x00000001002d5bcc lnav`sqlite_logger(dummy=<unavailable>, code=21, msg=<unavailable>) at sql_util.cc:593:5 [opt]
    frame #5: 0x00000001ab910834 libsqlite3.dylib`sqlite3_log + 196
    frame #6: 0x00000001ab98f9c0 libsqlite3.dylib`sqlite3_db_release_memory + 240
    frame #7: 0x00000001ab909638 libsqlite3.dylib`sqlite3VdbeReset + 172
    frame #8: 0x00000001ab9094e8 libsqlite3.dylib`sqlite3_finalize + 84
    frame #9: 0x00000001002ee66c lnav`user_msg_stmt::~user_msg_stmt() [inlined] auto_mem<sqlite3_stmt, &(free)>::reset(this=0x0000000100628ea8, ptr=0x0000000000000000) at auto_mem.hh:136:17 [opt]
    frame #10: 0x00000001002ee65c lnav`user_msg_stmt::~user_msg_stmt() [inlined] auto_mem<sqlite3_stmt, &(free)>::~auto_mem(this=0x0000000100628ea8) at auto_mem.hh:87:25 [opt]
    frame #11: 0x00000001002ee65c lnav`user_msg_stmt::~user_msg_stmt() [inlined] auto_mem<sqlite3_stmt, &(free)>::~auto_mem(this=0x0000000100628ea8) at auto_mem.hh:87:17 [opt]
    frame #12: 0x00000001002ee65c lnav`user_msg_stmt::~user_msg_stmt() [inlined] prepared_stmt::~prepared_stmt(this=0x0000000100628ea8) at sqlitepp.client.hh:105:8 [opt]
    frame #13: 0x00000001002ee65c lnav`user_msg_stmt::~user_msg_stmt() [inlined] prepared_stmt::~prepared_stmt(this=0x0000000100628ea8) at sqlitepp.client.hh:105:8 [opt]
    frame #14: 0x00000001002ee65c lnav`user_msg_stmt::~user_msg_stmt() [inlined] user_msg_stmt::~user_msg_stmt(this=0x0000000100628ea8) at top_status_source.cc:89:8 [opt]
    frame #15: 0x00000001002ee65c lnav`user_msg_stmt::~user_msg_stmt(this=0x0000000100628ea8) at top_status_source.cc:89:8 [opt]
    frame #16: 0x00000001a579105c libsystem_c.dylib`__cxa_finalize_ranges + 464
    frame #17: 0x00000001a5790df0 libsystem_c.dylib`exit + 44
    frame #18: 0x00000001a58bef7c libdyld.dylib`dyld4::LibSystemHelpers::exit(int) const + 20
    frame #19: 0x00000001a557fe9c dyld`start + 2620

rggjan avatar Sep 19 '22 05:09 rggjan

Sorry for the perf regression, I've pushed a change that should address it.

tstack avatar Sep 19 '22 08:09 tstack

Thanks, yes it's fixed :)

The two remaining issues I spotted:

  • Crash on lnav stop (see above)
  • When lnav finishes, something is wrong in the console. Eg pressing "Ctrl+C" doesn't cancel a zsh command anymore, but instead just prints: ^C in the console window. Also all command output (like ls) is very weird. I have to restart the terminal to fix this.

rggjan avatar Sep 19 '22 09:09 rggjan

What version of sqlite are you using?

tstack avatar Sep 19 '22 14:09 tstack

Hmm... I'm not sure. I ran configure without any special flags for SQL.

However, rerunning with configure with --with-sqlite3='brew --prefix sqlite3' actually seems to have fixed both issues mentioned above :)

rggjan avatar Sep 19 '22 16:09 rggjan

I suspect the sqlite version that was being used lacked the sqlite3_drop_modules() function, which helped to avoid the crash problem. The crash also left your terminal in a bad state, which is why using the brew version fixed both issues.

I have pushed a final change related to dealing with log messages with ansi escapes. If you are able to update and exercise it a bit, I would be grateful. If you don't find anymore issues, then I'd like to close this out.

tstack avatar Sep 20 '22 03:09 tstack

Thanks!

One more question: How can I define custom log formats for log files with these control characters? Do they need to be part of the regex? (I tried to define something, but lnav always picks block_log over my custom format, and I don't know why...)

rggjan avatar Sep 20 '22 07:09 rggjan

How can I define custom log formats for log files with these control characters?

The control characters will now be filtered out before the regexes in the formats are applied, so you shouldn't try to match them.

(I tried to define something, but lnav always picks block_log over my custom format, and I don't know why...)

The block_log format is pretty simple, it just looks for a timestamp followed by some text. Here's an example:

https://github.com/tstack/lnav/blob/20a06a2782d4b720c3e4faad21c4b15860fbfc00/test/logfile_block.1#L1-L4

Is it possible the very start of your log file looks like that? lnav will use the first format that matches the first line in a file.

Would you be able to share the format and a chunk of the first part of the log file? If you don't want to attach it to the bug, you can send it to [email protected]. You can also try the new "anonymization" feature in the top-of-tree that will automatically replace a lot of the values with random data. The following command line will mark the first 100 lines in the given log file and write them to stdout:

$ lnav -n -c ';UPDATE all_logs SET log_mark = 1 WHERE log_line < 100' -c ':write-to --anonymize -' /path/to/log

tstack avatar Sep 20 '22 08:09 tstack

Thanks, I'll have a look at the log stuff!

Meanwhile, I just compiled the latest version (20a06a2782d4b720c3e4faad21c4b15860fbfc00) again. Unfortunately, some issues remaining:

  • Opening the large log file takes again about 5 seconds
  • Then, every time pressing : takes about 10 seconds to get to the command prompt
  • Then, every letter press (like : then q) takes 5 - 10 seconds
  • Also, still crashing when exiting:
...
    frame #8006: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8007: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8008: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8009: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8010: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8011: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8012: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8013: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8014: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8015: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8016: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8017: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8018: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8019: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8020: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8021: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8022: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8023: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8024: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8025: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8026: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8027: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8028: 0x000000010021b7ec lnav`data_parser::pairup(this=<unavailable>, schema=<unavailable>, pairs_out=<unavailable>, in_list=<unavailable>, group_depth=<unavailable>) at data_parser.cc:72:19 [opt]
    frame #8029: 0x000000010019caa8 lnav`log_data_helper::parse_line(this=0x000000016fdfab58, line=<unavailable>, allow_middle=<unavailable>) at log_data_helper.cc:95:27 [opt]
    frame #8030: 0x0000000100132224 lnav`command_prompt(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >&) [inlined] log_data_helper::parse_line(this=0x000000016fdfab58, line=<unavailable>, allow_middle=true) at log_data_helper.hh:57:22 [opt]
    frame #8031: 0x00000001001321f4 lnav`command_prompt(args=size=2) at lnav_commands.cc:4566:13 [opt]
    frame #8032: 0x0000000100114f3c lnav`com_prompt(exec_context&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >&) [inlined] std::__1::__function::__value_func<void (std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >&)>::operator(this=<unavailable>, __args=size=2)(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >&) const at function.h:505:16 [opt]
    frame #8033: 0x0000000100114f24 lnav`com_prompt(exec_context&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >&) [inlined] std::__1::function<void (std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >&)>::operator(this=<unavailable>, __arg=size=2)(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >&) const at function.h:1182:12 [opt]
    frame #8034: 0x0000000100114f24 lnav`com_prompt(ec=0x000000016fdfb7d8, cmdline="prompt command", args=size=2) at lnav_commands.cc:4852:9 [opt]
    frame #8035: 0x00000001000a9a18 lnav`execute_command(ec=0x000000016fdfb7d8, cmdline="prompt command") at command_executor.cc:164:23 [opt]
    frame #8036: 0x00000001000b0808 lnav`execute_any(ec=0x000000016fdfb7d8, cmdline_with_mode=":prompt command") at command_executor.cc:778:22 [opt]
    frame #8037: 0x00000001000feda8 lnav`handle_keyseq(keyseq=<unavailable>) at hotkeys.cc:174:19 [opt]
    frame #8038: 0x00000001000ff9bc lnav`handle_paging_key(ch=58) at hotkeys.cc:214:9 [opt]
    frame #8039: 0x0000000100105f40 lnav`input_dispatcher::new_input(timeval const&, int) [inlined] std::__1::__function::__value_func<bool (int)>::operator(this=0x0000000100636088, __args=0x000000016fdfc3c8)(int&&) const at function.h:505:16 [opt]
    frame #8040: 0x0000000100105f28 lnav`input_dispatcher::new_input(timeval const&, int) [inlined] std::__1::function<bool (int)>::operator(this= Function = handle_key(int) , __arg=58)(int) const at function.h:1182:12 [opt]
    frame #8041: 0x0000000100105f28 lnav`input_dispatcher::new_input(this=0x0000000100636068, current_time=<unavailable>, ch=58) at input_dispatcher.cc:130:31 [opt]
    frame #8042: 0x000000010001e00c lnav`looper() at lnav.cc:1670:55 [opt]
    frame #8043: 0x000000010000d3d4 lnav`main(argc=0,

rggjan avatar Sep 20 '22 08:09 rggjan

Are you saying you didn't see this slowness when you were testing in this comment https://github.com/tstack/lnav/issues/1057#issuecomment-1251223646 ?

tstack avatar Sep 20 '22 08:09 tstack

After this fix, everything was working fine in terms of performance: https://github.com/tstack/lnav/commit/b2abf94c2e76972d7089c7d299496d284fe18aa2

rggjan avatar Sep 20 '22 11:09 rggjan

Did a quick test, this commit introduced (again) the performance regression: e0ecbdff10129203cd203379550f189739398176

rggjan avatar Sep 20 '22 11:09 rggjan

I think we need to get your log format working correctly. I'm concerned that lnav is treating the entire file as one long log message instead of many separate ones. If you can send me some file data, that would help a lot.

tstack avatar Sep 20 '22 13:09 tstack

OK, I did some debugging and testing. The following log file will be parsed as a block_log and have the performance issues I mentioned:

Wed Sep  7 08:58:33 UTC 2022
2022-09-07 09:04:15,609 [37m967[0m [32;1mINFO[0m     foo/bar/baz:85      some_function_name              [32mSome not very long log output text. And some more text. And even more.[0m
2022-09-07 09:04:15,609 [37m967[0m [32;1mINFO[0m     foo/bar/baz:85      some_function_name              [32mSome not very long log output text. And some more text. And even more.[0m
2022-09-07 09:04:15,609 [37m967[0m [32;1mINFO[0m     foo/bar/baz:85      some_function_name              [32mSome not very long log output text. And some more text. And even more.[0m
...

(with the second line repeated 200'000 times, to get a 36M log file)

rggjan avatar Sep 20 '22 14:09 rggjan

Yes, that one timestamp at the top is what lnav is locking onto. Is that written out when the file is created? Having that timestamp there makes things a bit challenging. Something to try is to disable the block_log by patching it with something like the following:

$ cat > block_log-patch.json <<EOF
{
    "block_log": {
        "file-pattern": "non-existent"
    }
}
EOF
$ lnav -i block_log-patch.json

That will restrict it to only match certain file names ("non-existent" in this case). In your format file, you'll need to add a separate regex for that initial timestamp. Make sure it only matches a line with just the timestamp, otherwise it will be used for other log files.

tstack avatar Sep 20 '22 15:09 tstack

So it's not just a timestamp (that was my simplified example). But there's a bunch of "setup" log lines at the beginning without any structured format, and then the actual log lines (with the format with the ASCII color) starts.

Is there a way to force using a certain log format on the command line? And just let it ignore the lines it doesn't recognize? (It's possible there's some "wrong" lines also later in the logs, even if 95% of the logs are of the specific format).

rggjan avatar Sep 21 '22 07:09 rggjan

Is there a way to force using a certain log format on the command line?

See https://github.com/tstack/lnav/issues/454

piotr-dobrogost avatar Sep 21 '22 08:09 piotr-dobrogost