techStackGeneric: should be faster

Open ee7 opened this issue 1 year ago • 0 comments

Description

The techStackGeneric plugin is currently a bit too slow, and it'd be good to speed it up. Possible steps:

  • [x] Profile it
  • [x] Fix bugs in getProcNames (#250)
  • [x] Stop reading files in .git (#253)
  • [ ] Fix scanDirectory reading each file multiple times
  • [ ] Stop scanning binaries
  • [ ] Fix directory walking (symlink handling, preventing a possible stack overflow, etc.)
  • [ ] Refactor away the unnecessary heap allocations
  • [ ] Refactor away the globals
  • [ ] Optimize the file reads. For example: consider using mmap
  • [ ] Consider whether it's worth moving from std/re to e.g. nitely/nim-regex. That's a pure Nim regex library which can be faster than PCRE for some types of regular expressions. It claims:

The match time is linear in the length of the text and the regular expression. So, it can handle input from untrusted users. The syntax is similar to PCRE but lacks a few features that can not be implemented while keeping the space/time complexity guarantees, ex: backreferences.
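
Several of the unchecked items above (reading each file only once, skipping binaries, safer directory walking) can be illustrated together. The following is a minimal Python sketch, not the plugin's Nim code. The names `iter_files`, `looks_binary`, `scan_directory` and `SKIP_DIRS`, the 8 KiB probe size, and the NUL-byte heuristic for detecting binaries are all assumptions made for illustration.

```python
import os
import re

# Assumption: directories that are never worth scanning.
SKIP_DIRS = {".git"}


def iter_files(root: str):
    """Yield regular files under root using an explicit stack, so deep
    trees cannot overflow the call stack. Symlinks are never followed,
    which also rules out symlink cycles."""
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            entries = os.scandir(current)
        except OSError:
            continue  # unreadable directory: skip it
        with entries:
            for entry in entries:
                if entry.is_symlink():
                    continue
                if entry.is_dir(follow_symlinks=False):
                    if entry.name not in SKIP_DIRS:
                        stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path


def looks_binary(head: bytes) -> bool:
    """Heuristic (assumption): a NUL byte near the start means binary."""
    return b"\0" in head


def scan_directory(root: str, rules: dict[str, re.Pattern[bytes]]) -> dict[str, list[str]]:
    """Read each file once and run every rule against that single read."""
    hits: dict[str, list[str]] = {name: [] for name in rules}
    for path in iter_files(root):
        try:
            with open(path, "rb") as f:
                head = f.read(8192)
                if looks_binary(head):
                    continue  # skip without reading the rest
                data = head + f.read()
        except OSError:
            continue
        for name, pattern in rules.items():
            if pattern.search(data):
                hits[name].append(path)
    return hits
```

The key point is the loop order: files form the outer loop and rules the inner one, so adding rules does not add file reads.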

For now, I don't think we'd need to parallelize it.

For a release build of chalk, profiling chalk insert --use-tech-stack-detection foo in a tiny repo with callgrind:

Percentage of total execution time   Module             Procedure
90%                                  techStackGeneric   scanFile

which is due to:

Percentage of total execution time   Module             Procedure
38%                                  std/streams        readLine
22%                                  fd_cache           acquireFileStream
18%                                  std/re             find
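
Since line-by-line reading shows up as the largest single cost in this profile, one option from the list above is to map the file into memory and search it in a single regex pass. The following Python sketch only illustrates the idea and is not the plugin's Nim code. `file_matches` is a hypothetical name, and whether this is actually faster for techStackGeneric would need to be measured.

```python
import mmap
import re


def file_matches(path: str, pattern: re.Pattern[bytes]) -> bool:
    """Return True if pattern occurs anywhere in the file.

    The file is memory-mapped and searched in one pass, with no
    per-line reads and no per-line string allocations.
    """
    with open(path, "rb") as f:
        try:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except ValueError:
            return False  # an empty file cannot be mapped
        with mm:
            found = pattern.search(mm) is not None
    return found
```

One caveat with whole-buffer matching: rules written for per-line matching that use `^` or `$` would need a multiline flag (`re.MULTILINE` in Python) to keep the same meaning.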

Dependencies

None.

Subtickets

None.

ee7 • Mar 20 '24 10:03