ThreatExchange
[py-tx] New extension interface for storage
We want the ability to add new storage mechanisms as an alternative to the one that comes installed by default in py-tx. We think that dbm might be a much better default storage.
Pre-read material:
- Readme: https://github.com/facebook/ThreatExchange/tree/main/python-threatexchange
- SignalExchange interface (especially storage): https://github.com/facebook/ThreatExchange/blob/main/python-threatexchange/threatexchange/exchanges/signal_exchange_api.py#L20
- Backwards compatibility guarantee: https://github.com/facebook/ThreatExchange/tree/main/python-threatexchange#general-expectation-for-compatibility-and-versioning
- dbm module: https://docs.python.org/3/library/dbm.html
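For readers who have not used dbm before, here is a minimal sketch of what the standard-library module offers: a dict-like, on-disk key/value store whose keys and values are bytes. The file name and the key/value contents are made up for illustration and are not taken from py-tx.

```python
import dbm
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical database file name, used only for this example.
    path = str(Path(tmp) / "example_store")

    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(path, "c") as db:
        db[b"signal:1"] = b"example-value"  # keys and values are stored as bytes

    # Reopening read-only shows that the data was persisted to disk.
    with dbm.open(path, "r") as db:
        stored = db[b"signal:1"]
        missing = db.get(b"signal:2")  # absent keys return None via get()
```

Because the data lives on disk rather than in memory, a store like this would not need to load the full dataset to look up a single key, which is presumably part of its appeal as a default.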
There will be a series of milestones:
- We'll be defining a new python interface for what methods need to be implemented for storage, likely patterned on https://github.com/facebook/ThreatExchange/blob/main/python-threatexchange/threatexchange/exchanges/helpers.py#L69
- Apply the interface to the existing storage at https://github.com/facebook/ThreatExchange/blob/main/python-threatexchange/threatexchange/cli/cli_state.py
- Create a dbm implementation of the interface
- Swap out the dbm version of the interface and show that it still produces the full dataset with the dataset command
- Add storage to the extensions interface at https://github.com/facebook/ThreatExchange/blob/main/python-threatexchange/threatexchange/extensions/manifest.py
- Add a configuration field to https://github.com/facebook/ThreatExchange/blob/main/python-threatexchange/threatexchange/cli/cli_config.py#L47 that names the selected storage mechanism. If unset, it should default to the old in-memory merge file storage
- Add the ability to select the storage backend with a cli command - think about edge case behavior here
- End-to-end test swapping storages / large download
- [Stretch] work with Scott at the hackathon to spec out an AWS-based storage extension. It can live in https://github.com/facebook/ThreatExchange/tree/main/python-threatexchange/threatexchange/extensions as an "official" extension
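To make the first few milestones more concrete, here is a rough sketch of what a pluggable storage interface with an in-memory and a dbm backend might look like. Every name in it (SignalStore, InMemoryStore, DbmStore, make_store, the "memory" and "dbm" strings, and the signals.db file name) is hypothetical and not part of the existing py-tx API. The real interface would likely follow the helpers.py pattern linked above.

```python
import dbm
import pickle
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, Iterator, Optional, Tuple


class SignalStore(ABC):
    """Hypothetical storage interface; NOT the real py-tx API."""

    @abstractmethod
    def put(self, key: str, value: Any) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[Any]: ...

    @abstractmethod
    def items(self) -> Iterator[Tuple[str, Any]]: ...


class InMemoryStore(SignalStore):
    """Stand-in for the existing default: everything held in a dict."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def items(self) -> Iterator[Tuple[str, Any]]:
        return iter(list(self._data.items()))


class DbmStore(SignalStore):
    """dbm-backed store; values are pickled, so only use with trusted local files."""

    def __init__(self, path: Path) -> None:
        self._path = str(path)

    def put(self, key: str, value: Any) -> None:
        with dbm.open(self._path, "c") as db:
            db[key.encode("utf-8")] = pickle.dumps(value)

    def get(self, key: str) -> Optional[Any]:
        with dbm.open(self._path, "c") as db:
            raw = db.get(key.encode("utf-8"))
        return None if raw is None else pickle.loads(raw)

    def items(self) -> Iterator[Tuple[str, Any]]:
        # Read everything while the database is open, then hand back an iterator.
        with dbm.open(self._path, "c") as db:
            pairs = [(k.decode("utf-8"), pickle.loads(db[k])) for k in db.keys()]
        return iter(pairs)


def make_store(name: Optional[str], state_dir: Path) -> SignalStore:
    """Pick a backend from a config value; unset falls back to the in-memory store."""
    if name is None or name == "memory":
        return InMemoryStore()
    if name == "dbm":
        return DbmStore(state_dir / "signals.db")
    raise ValueError(f"unknown storage backend: {name!r}")
```

A check along the lines of the "still produces the full dataset" milestone could then write the same records through both backends and compare `dict(store.items())`. Opening the database on every call keeps the sketch simple, but a real implementation would probably hold the handle open for the duration of a fetch.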