archive-query-log
📜 The Archive Query Log.
Updates the requirements on [pytest](https://github.com/pytest-dev/pytest) to permit the latest version. Release notes Sourced from pytest's releases. 9.0.0 pytest 9.0.0 (2025-11-05) New features #1367: Support for subtests has been added. subtests...
Updates the requirements on [fastapi](https://github.com/fastapi/fastapi) to permit the latest version. Release notes Sourced from fastapi's releases. 0.121.0 Features ✨ Add support for dependencies with scopes, support scope="request" for dependencies with...
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 6. Release notes Sourced from actions/download-artifact's releases. v6.0.0 What's Changed BREAKING CHANGE: this update supports Node v24.x. This is not a breaking change per-se but...
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 5. Release notes Sourced from actions/upload-artifact's releases. v5.0.0 What's Changed BREAKING CHANGE: this update supports Node v24.x. This is not a breaking change per-se but...
- [ ] #106
- [ ] Parquet export of SERPs (+ export to HF datasets)
- [ ] Parquet export of results (+ export to HF datasets)
- [...
- [x] Migrated from elasticsearch-dsl to (type-safe) elasticsearch-pydantic
- [x] Restructured parsing to use hard-coded (i.e., unit-testable) parsers
- [x] Migrated monitoring (Flask → FastAPI, now with a more flexible API)
-...
To make the exports safer, we should block known phishing sites and similar hosts. Michael and Sebastian have pointed me to these blocklists, as used by the OWS crawling:
- https://github.com/Ultimate-Hosts-Blacklist/Ultimate.Hosts.Blacklist...
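The blocklist filtering proposed above could look roughly like the following. This is only a sketch under assumptions, not the project's implementation: the function names `parse_hosts_blocklist` and `is_blocked` are hypothetical, the sample entries are made up, and it assumes the blocklist is available as hosts-file style text (`0.0.0.0 host`) or as a plain list of domains.

```python
from urllib.parse import urlsplit


def parse_hosts_blocklist(text: str) -> set[str]:
    """Parse hosts-file style text into a set of blocked hostnames (hypothetical helper)."""
    blocked = set()
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Hosts-file lines look like "0.0.0.0 example.com"; plain domain
        # lists contain only the hostname. Either way it is the last field.
        blocked.add(line.split()[-1].lower())
    return blocked


def is_blocked(url: str, blocked: set[str]) -> bool:
    """Return True if the URL's host or any of its parent domains is blocked."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check "a.b.example.com", then "b.example.com", then "example.com", ...
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))


# Made-up sample entries, for illustration only.
SAMPLE = """
# hypothetical blocklist
0.0.0.0 phishing.example
malware.example
"""

blocked = parse_hosts_blocklist(SAMPLE)
urls = ["https://login.phishing.example/a", "https://example.org/"]
safe = [u for u in urls if not is_blocked(u, blocked)]
```

Matching on parent domains means a blocked `phishing.example` also excludes `login.phishing.example`; whether the exports should be that strict is a design decision left open here.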
- [ ] What format to use? -> Turtle?
- [ ] First we just want to export from elastic
- [ ] Later export more metadata from the SERPs,...