Optimize the Chalk startup to cache the base config at compile time

Open ee7 opened this issue 2 years ago • 2 comments

There is currently a small delay when running e.g. chalk help, because chalk always performs various setup tasks regardless of the user-specified command:

https://github.com/crashappsec/chalk/blob/175f32f001d90de02c3efa10d19467f08e98a1a7/src/chalk.nim#L13-L24

Does chalk want to always perform every setup task, or could it skip some of them when the command is e.g. help?
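To make the question concrete, here is a minimal sketch of command-dependent setup. It is written in Python rather than Chalk's Nim, and every name in it (the three task functions, `TASKS_BY_COMMAND`, `run_setup`) is hypothetical; it does not mirror the actual procs called in src/chalk.nim.

```python
from typing import Callable, Dict, List

# Hypothetical setup tasks. Each one records its name so the example can
# show which tasks ran; real tasks would do the actual work instead.
def load_base_config(log: List[str]) -> None:
    log.append("load_base_config")

def find_docker(log: List[str]) -> None:
    log.append("find_docker")

def load_user_config(log: List[str]) -> None:
    log.append("load_user_config")

# Default: every command runs every task (the behaviour described above).
ALL_TASKS: List[Callable[[List[str]], None]] = [
    load_base_config,
    find_docker,
    load_user_config,
]

# Assumption: some commands could declare a smaller set of tasks.
TASKS_BY_COMMAND: Dict[str, List[Callable[[List[str]], None]]] = {
    "help": [load_base_config],
}

def run_setup(command: str) -> List[str]:
    """Run only the setup tasks registered for `command`."""
    log: List[str] = []
    for task in TASKS_BY_COMMAND.get(command, ALL_TASKS):
        task(log)
    return log
```

With this shape, `run_setup("help")` runs one task, while any command not listed in the table still runs all three.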

It's clear that chalk help is currently doing a lot of stuff:

$ strace --summary-only --summary-columns=time-percent,calls,errors,name chalk help > /dev/null
% time     calls    errors syscall
------ --------- --------- ----------------
 97.75     34980           read
  1.00       350           lseek
  0.49       106           mmap
  0.45        66           munmap
  0.05        15         1 rt_sigaction
  0.04         6           readlink
  0.04        57           getcwd
  0.03        22        15 stat
  0.02         5           rt_sigprocmask
  0.02        11         3 ioctl
  0.02         3           brk
  0.02         9           lstat
  0.02        13           readv
  0.01         6         1 open
  0.01         5           close
  0.01         9           fstat
  0.01        10           fcntl
  0.00         6           getpid
  0.00         2           writev
  0.00         1           uname
  0.00         1           getgid
  0.00         1           getegid
  0.00         1           execve
  0.00         1           getuid
  0.00         2           geteuid
  0.00         1           getgroups
  0.00         1           arch_prctl
  0.00         1           set_tid_address
------ --------- --------- ----------------
100.00     35691        20 total

For example, opening:

  • /usr/bin/docker
  • /home/foo/.local/share/bash_completion/completions/chalk.bash
  • /usr/local/ssl/openssl.cnf
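As a rough way to quantify the summary above, the following Python sketch parses `strace --summary-only` output in the column layout used here and computes what share of all calls are `read`. The `parse_summary` helper is illustrative, and `SAMPLE` is an excerpt of the table (first three rows plus the total line), not the full output.

```python
from typing import Dict

# Excerpt of the summary above: first three rows plus the total line.
SAMPLE = """\
% time     calls    errors syscall
------ --------- --------- ----------------
 97.75     34980           read
  1.00       350           lseek
  0.49       106           mmap
------ --------- --------- ----------------
100.00     35691        20 total
"""

def parse_summary(text: str) -> Dict[str, int]:
    """Map each syscall name (and 'total') to its call count."""
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like: <time-percent> <calls> [<errors>] <name>.
        # The header and separator rows have no integer in column two.
        if len(parts) < 3 or not parts[1].isdigit():
            continue
        counts[parts[-1]] = int(parts[1])
    return counts

counts = parse_summary(SAMPLE)
total_calls = counts.pop("total")
read_share = counts["read"] / total_calls
print(f"read: {counts['read']} of {total_calls} calls ({read_share:.1%})")
```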

By the way, is this an expected number of syscalls? (I'm still new to chalk).


Chalk version: 175f32f001d90de02c3efa10d19467f08e98a1a7

ee7 • Oct 02 '23 21:10

Yes, a lot of the setup actually is in the config file loading, where there is, for instance, a tremendous amount of documentation... certainly all of the documentation that's available through the help command. That's responsible for most of those reads.

Most of it is stuff that should never change per compile of Chalk. The intent has been to have all of that run, then CACHE the con4m state, and put THAT into the binary instead. Then, the only startup code that would need to run is the code that's specific to an individual chalk run, which would include a tiny bit of the builtin stuff, and then the user's config... which will never be anywhere near the 12K lines or so of stuff.

I'd expect this to cut 95%+ of the startup time.
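The idea can be sketched as two phases: an expensive step that runs once per build, and a cheap step that runs on every invocation. This is a Python illustration only, not con4m's or Chalk's API; `evaluate_base_config`, `build_step`, and `startup` are hypothetical names, and JSON stands in for whatever serialization the real state would need.

```python
import json
from typing import Any, Dict

def evaluate_base_config() -> Dict[str, Any]:
    """Stand-in for the expensive part: evaluating the built-in config,
    including all of the embedded documentation."""
    return {
        "docs": {"help": "usage text"},
        "defaults": {"color": True},
    }

def build_step() -> bytes:
    """Run once per build. The returned bytes would be embedded in the
    binary in place of the config source."""
    return json.dumps(evaluate_base_config()).encode("utf-8")

def startup(embedded: bytes, user_config: Dict[str, Any]) -> Dict[str, Any]:
    """Run on every invocation: restore the cached state, then apply only
    the (small) per-run user config on top of it."""
    state = json.loads(embedded.decode("utf-8"))
    state["defaults"].update(user_config)
    return state

blob = build_step()
state = startup(blob, {"color": False})
```

In this sketch `startup` never calls `evaluate_base_config`; the per-run cost is one deserialization plus the user's overrides.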

This project isn't overly hard; it just hasn't made the list yet (though when running in debug mode the delay is particularly obnoxious).

At some future point beyond that, we could also cache user config that's not run-dependent too, but again, I wouldn't expect that to be worth doing anytime soon.

viega • Oct 02 '23 21:10

Just to elaborate a little bit on what needs to be done for this all to work for Chalk (might not be the best general-purpose thing for Con4m, but is a good enough initial solution):

  1. There needs to be a call that marshals the current state of the config so we can start with it pre-executed.
  2. Function implementations need to be cached in case they're used by callbacks, config code that runs later, etc.
  3. A bunch of the configuration state data structure, including the con4m validation runtime, should probably be marshalled, so that we can validate any user config without having to reload everything. But we could skip this if there's no external config file applied, and no new config loaded in an operation.
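The three steps above can be sketched roughly as follows, again in Python and with entirely hypothetical names (`BUILTINS`, `marshal_state`, `unmarshal_state`, `apply_user_config`). It assumes function implementations are restored by name from a registry rather than serialized directly, and that the validation data is a simple key-to-type-name schema; con4m's real runtime is richer than this.

```python
import json
from typing import Any, Callable, Dict

# Step 2 (assumption): implementations live in a registry keyed by name,
# so the snapshot only needs to store the names.
BUILTINS: Dict[str, Callable[..., Any]] = {"upper": str.upper, "length": len}

def marshal_state(attrs: Dict[str, Any],
                  callbacks: Dict[str, str],
                  schema: Dict[str, str]) -> str:
    """Step 1: serialize the already-executed config state.
    Step 3: the validation schema is included in the snapshot."""
    return json.dumps({"attrs": attrs, "callbacks": callbacks, "schema": schema})

def unmarshal_state(blob: str) -> Dict[str, Any]:
    """Restore the snapshot and re-bind callback names to implementations."""
    state = json.loads(blob)
    state["callbacks"] = {
        hook: BUILTINS[name] for hook, name in state["callbacks"].items()
    }
    return state

def apply_user_config(state: Dict[str, Any],
                      user_config: Dict[str, Any]) -> Dict[str, Any]:
    """Validate and apply user settings; the schema is only consulted
    when there is a user config to check."""
    if not user_config:
        return state
    for key, value in user_config.items():
        expected = state["schema"].get(key)
        if expected is None or type(value).__name__ != expected:
            raise ValueError(f"invalid setting: {key}")
        state["attrs"][key] = value
    return state

blob = marshal_state({"name": "chalk"}, {"on_report": "upper"}, {"name": "str"})
state = unmarshal_state(blob)
```

Storing names instead of code keeps the snapshot plain data, at the cost of requiring every callback to already exist in the registry when the snapshot is loaded.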

viega • Oct 02 '23 21:10