
How to speed up the process of adding patterns

Open Hunter-Leo opened this issue 3 years ago • 1 comments

  • spikex version: 0.5.0
  • Python version:
  • Operating System: linux

Description

Hey, guys. I find your tool very powerful, thanks for sharing. I ran into a problem: initializing LabelX with 30 thousand patterns takes a huge amount of time. This process is much slower than in spaCy, so I wonder if you can suggest any solution?

Hunter-Leo, Nov 15 '21 08:11

Hi @Hunter-Leo!

The time it takes to index patterns depends on many factors. A couple of things could help identify where the issue is:

  • You can profile an indexing run, for example with py-spy, so that we can see where most of the time is spent.
  • You can share some of the patterns you're using, just to give an idea of how complex they are.

If you have any other suggestions, they're of course welcome!
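As a rough sketch of the profiling step, the snippet below uses the standard-library cProfile instead of py-spy, so it runs without extra installs. `add_patterns` is a hypothetical stand-in for the slow indexing call (it is not spikex code) and `profile_call` is a helper made up for this example; swap in the real LabelX initialization to get meaningful numbers. With py-spy itself, the equivalent would be something like `py-spy record -o profile.svg -- python your_script.py`, where `your_script.py` is a placeholder name.

```python
import cProfile
import io
import pstats
import re


def add_patterns(patterns):
    # Hypothetical stand-in for the slow indexing step (for example,
    # initializing LabelX with ~30k patterns). Replace the body with the
    # real call you want to measure.
    return [re.compile(re.escape(p)) for p in patterns]


def profile_call(func, *args, top=10):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    patterns = [f"pattern {i}" for i in range(30_000)]
    _, report = profile_call(add_patterns, patterns)
    print(report)
```

The top few rows of the report, sorted by cumulative time, are usually enough to share in the thread.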

paoloq, Jan 28 '22 13:01