Frank Bertsch

Results: 35 comments by Frank Bertsch

This will also require reading from new streaming data sources.

@Dexterp37 I do! https://github.com/mozilla/probe-scraper/pull/214

Here is the list of invalid URLs being fetched by probe-scraper: https://gist.github.com/fbertsch/f0d27f697dec888e1e7ed88a048b2ad3

cc @mdboom you mentioned your team was interested in working on bugs here, is this something you all would have the bandwidth to take on?

Indeed it does. The cache is here: https://github.com/mozilla/probe-scraper/blob/master/probe_scraper/runner.py#L300

Agreed. Running P-S in a fresh cache location would be a good start (IIRC this should take 5-6 hours), and will hit M-C a _bunch_. Are you up for that,...

Ah, gotcha. You'd need AWS creds to run this with a separate bucket. Sounds like this investigation needs to happen more on the ops side.

> Do I need to provide a repo if there isn't a metrics.yaml? What if it's a mercurial repo instead of Git? Does probe-scraper know about pings.yaml files? We can...

Currently we ignore historical `metrics.yaml` files that are not compatible with the current version of `glean_parser`.
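
A rough sketch of what "ignoring incompatible historical files" could look like, not probe-scraper's actual code. `parse_compatible_history`, `toy_parse`, and the use of `ValueError` to signal incompatibility are all hypothetical; the real `glean_parser` reports problems in its own way.

```python
from typing import Callable, Dict, Iterable, Tuple


def toy_parse(contents: str) -> dict:
    """Stand-in parser (hypothetical): accepts only contents tagged 'v2:'."""
    if not contents.startswith("v2:"):
        raise ValueError("incompatible metrics file")
    return {"raw": contents}


def parse_compatible_history(
    history: Iterable[Tuple[str, str]],
    parse: Callable[[str], dict],
) -> Dict[str, dict]:
    """Parse each (commit, file contents) pair, skipping rejected revisions."""
    parsed: Dict[str, dict] = {}
    for commit, contents in history:
        try:
            parsed[commit] = parse(contents)
        except ValueError:
            # Assumption: an incompatible historical file raises ValueError.
            # It is skipped rather than failing the whole run.
            continue
    return parsed


history = [("a1", "v1:old"), ("b2", "v2:new")]
result = parse_compatible_history(history, toy_parse)
```

Here only commit `b2` ends up in `result`; the older revision is dropped silently, which is the trade-off the comment above describes.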

@mdboom that does make sense. What I would prefer is for the probe-scraper not to have to deal with any of that, and instead have the `metrics.yaml` contain an optional...