Performance optimizations and bug fixes

Open trislee opened this issue 3 years ago • 0 comments

  • Modified langdetect detect method to decrease run-time
  • Fixed indentation error in transform_info
  • Prototyped removal of offset in transform_all_untransformed
    • This change still needs work: it fails on the first batch. Because the batch has already been computed, only one ScraperResult.date is greater than or equal to max(batch, key=lambda v: v.date).date, so only one post is transformed.
  • Originally modified TelegramTelethonTransformer to have a self.client attribute, but this caused the transformer tests to fail: initializing a TelegramTelethonTransformer while a TelegramTelethonScraper is already initialized locks the Telethon session database. This could be addressed by deleting the controller object in tests/transformer/telegram_telethon.py.
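
The first-batch failure noted for transform_all_untransformed can be reproduced in isolation. The following is a minimal sketch, not the project's actual code: the `ScraperResult` dataclass here is a hypothetical stand-in with only an `id` and a `date`, and `select_after` is an illustrative helper that does not exist in the repository. It shows why comparing against the newest date of an already-computed batch keeps a single post, and one possible way to avoid that by tracking the cursor from the previous batch instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ScraperResult:
    """Hypothetical stand-in for the real ScraperResult; only `date` matters here."""
    id: int
    date: datetime


# Five results with strictly increasing dates, standing in for the first batch.
start = datetime(2022, 7, 1)
batch = [ScraperResult(i, start + timedelta(minutes=i)) for i in range(5)]

# The comparison described in the issue: the threshold is taken from the
# batch itself, so only the newest result satisfies `date >= threshold`.
newest = max(batch, key=lambda v: v.date).date
selected = [r for r in batch if r.date >= newest]
print(len(selected))  # 1


def select_after(results, cursor):
    """Illustrative alternative (an assumption, not the author's fix):
    keep results strictly newer than the cursor left by the previous batch.
    A cursor of None means no batch has been processed yet, so keep everything."""
    return [r for r in results if cursor is None or r.date > cursor]


# First batch: no cursor yet, so every result is kept.
print(len(select_after(batch, None)))  # 5
```

With this shape, the cursor would be updated to `max(batch, key=lambda v: v.date).date` only after a batch has been transformed, so the next query starts where the previous one ended. Whether strict `>` is safe depends on whether two posts can share a timestamp, which this sketch does not address.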

trislee, Jul 01 '22 08:07