
Parser Registry Problem

Open · magaton opened this issue 1 year ago • 1 comment

Hello, I am looking at the code, and it seems that a new Parser implementation instance is created for each file in the datasource. For repos with 100K files, this looks like significant overhead. Disclaimer: I am not a Python dev, just a random observer :)

https://github.com/truefoundry/cognita/blob/01b22433b6a25973d8bf561a23c619c5f41770e7/backend/indexer/indexer.py#L234

and

https://github.com/truefoundry/cognita/blob/main/backend/modules/parsers/parser.py#L91

Shouldn't the parser instances be created per datasource and not cached globally? WDYT?
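
To make the idea concrete, here is a rough sketch of what I mean by per-datasource instances. Everything in it is hypothetical: `BaseParser`, `ParserCache`, and `index_data_source` are stand-in names I made up, not the actual cognita classes or functions, and it assumes parsers keep no per-file state, so one instance can safely be reused across files.

```python
import os
from typing import Callable, Dict, List


class BaseParser:
    """Hypothetical stand-in for a parser; NOT the real cognita class."""

    instances_created = 0  # counter used only to show how often we construct

    def __init__(self) -> None:
        BaseParser.instances_created += 1

    def parse(self, path: str) -> str:
        # Placeholder for real parsing work.
        return f"parsed {path}"


class ParserCache:
    """Holds at most one parser instance per file extension.

    The cache is meant to live for a single datasource indexing run,
    so nothing is kept globally between runs.
    """

    def __init__(self, factories: Dict[str, Callable[[], BaseParser]]) -> None:
        self._factories = factories
        self._instances: Dict[str, BaseParser] = {}

    def get(self, extension: str) -> BaseParser:
        # Construct lazily on first use, then reuse for later files.
        if extension not in self._instances:
            self._instances[extension] = self._factories[extension]()
        return self._instances[extension]


def index_data_source(paths: List[str]) -> List[str]:
    """Parse every file, reusing parsers within this one run (assumed flow)."""
    cache = ParserCache({".md": BaseParser, ".txt": BaseParser})
    results = []
    for path in paths:
        extension = os.path.splitext(path)[1]
        results.append(cache.get(extension).parse(path))
    return results
```

With this shape, the number of parser instances grows with the number of distinct file extensions in the datasource rather than with the number of files. If the real parsers do hold per-file state, this would not be safe as written.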

magaton · Oct 09 '24 07:10