probe-scraper
Refactor this Code Base
This code base is largely about:
- Scraping information about revisions
- Reading probe information from those revisions
- Combining probe information
- Writing it back out
Given that we recently moved to Python 3, this code base could use a serious uplift that takes advantage of the newer language features. For example, the probe type could be a dataclass that knows how to compare itself to other probes and how to serialize itself into the final JSON output.
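A minimal sketch of what such a probe dataclass might look like. The field names here are hypothetical and do not reflect the actual probe-scraper schema; the point is only that `@dataclass` generates field-by-field equality, and `asdict` gives a direct route to JSON.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Probe:
    # Hypothetical fields for illustration, not the real probe schema.
    name: str
    type: str
    description: str = ""
    expires_in_version: str = "never"

    def to_json(self) -> str:
        # Serialize every field; sort keys so the output is stable.
        return json.dumps(asdict(self), sort_keys=True)


a = Probe("gc_ms", "histogram", "GC time")
b = Probe("gc_ms", "histogram", "GC time")
c = Probe("gc_ms", "histogram", "Time spent in GC")

same = a == b      # generated __eq__ compares all fields
changed = a != c   # the description differs, so the probes differ
```

With equality defined on the class, detecting that a probe definition changed between two revisions becomes a plain `!=` check.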
A revision class could compare itself to other revisions, based on push-date or version. This can be used to decide first and last revisions/versions/dates for probes.
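As a rough illustration, assuming a revision is identified by a push date, a version number, and a changeset hash (all hypothetical field names), `@dataclass(order=True)` can generate the comparison methods. In this sketch only `push_date` takes part in comparisons; ordering by version instead would mean moving `compare=False` to the other field.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(order=True, frozen=True)
class Revision:
    # Only push_date is used for ordering and equality in this sketch.
    push_date: datetime
    version: int = field(compare=False)
    node: str = field(compare=False)  # hypothetical changeset hash


revisions = [
    Revision(datetime(2019, 3, 1), 67, "bbb"),
    Revision(datetime(2019, 1, 15), 66, "aaa"),
    Revision(datetime(2019, 5, 20), 68, "ccc"),
]

# First and last revisions fall out of the generated ordering.
first = min(revisions)
last = max(revisions)
```

The same `min`/`max` calls could then pick the first and last revision, version, or date in which a probe appears.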
Together, these would simplify transform_probes.py.
All logic should be moved out of runner.py and integrated into the appropriate modules; runner.py should exist only to provide the scraper CLI.
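One possible shape for a CLI-only runner.py, sketched with `argparse`. The flags and the `run_scraper` function are hypothetical stand-ins, not the existing probe-scraper interface; the real logic would live in other modules and be imported here.

```python
import argparse


def run_scraper(out_dir: str, dry_run: bool) -> str:
    # Hypothetical stand-in for logic that would live outside runner.py.
    return f"scraping into {out_dir} (dry_run={dry_run})"


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flags, chosen only for illustration.
    parser = argparse.ArgumentParser(description="Scrape probe information")
    parser.add_argument("--out-dir", default="output")
    parser.add_argument("--dry-run", action="store_true")
    return parser


def main(argv=None) -> str:
    # Parse arguments and delegate; no scraping logic lives here.
    args = build_parser().parse_args(argv)
    return run_scraper(args.out_dir, args.dry_run)


result = main(["--out-dir", "tmp", "--dry-run"])
```

In an actual runner.py, `main()` would be called under an `if __name__ == "__main__":` guard; passing `argv` explicitly, as above, keeps the entry point easy to test.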
Also consider: Separating out the Glean and Telemetry parsing into their own submodules. They don't really share much code except for a few naming constants.
These all seem like good ideas. I'll schedule some time to read through the code base and see if I have anything to add.
There's still some question about how this can be redesigned. Once we've finalized some of the changes we'd like to see, let's take the question label off.