archive-query-log icon indicating copy to clipboard operation
archive-query-log copied to clipboard

Major restructuring

Open janheinrichmerker opened this issue 2 months ago • 0 comments

  • [x] Migrate from elasticsearch-dsl to (type-safe) elasticsearch-pydantic
  • [x] Restructured parsing to use hard-coded (i.e., unit-testable) parsers
  • [x] Migrated monitoring (Flask → FastAPI, now more flexible API)
  • [x] Implement web search result block landing page download (fix #6)
  • [x] Remove unused legacy code
  • [x] Updated readme
  • [x] Updated dependencies
  • [x] Updated CI
  • [x] Add dry-run option to test most CLI actions without actual writes
  • [ ] Add JSONL export (and local "fake" WARC store for testing)
  • [ ] Migrate "legacy" unit tests to work with the current parsing architecture

janheinrichmerker avatar Sep 22 '25 12:09 janheinrichmerker