scalpel icon indicating copy to clipboard operation
scalpel copied to clipboard

Switching to html-parse for faster tokenisation (1/10th of time and peak memory allocated)

Open benjaminweb opened this issue 7 months ago • 3 comments

  • fixes all (temporarily) broken tests
  • fixes (temporarily) missing type annotation for scrapeURL
  • includes time and memory benchmark allowing comparing based on real world example with previous version 0.6.2.2
  • @fimad might require renaming of any function with Tag to Token -- would that be a breaking change?

benjaminweb avatar May 26 '25 15:05 benjaminweb