etl

M-Lab ingestion pipeline

Results 105 etl issues

After deploying the new alternative ETL pipeline SLIs, we found that the scamper1 datatype would report parse errors after restarting. We suspected this may be due to a temporary format...

review/triage

According to Prometheus metric naming best practices (https://prometheus.io/docs/practices/naming/), accumulating counts should end with the suffix `_total`. A lot of the accumulating count metrics in etl (e.g., PanicCount, WorkerCount) have the...

review/triage
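
As a rough illustration of the naming rule discussed in the issue above, the sketch below checks whether a counter name ends in `_total`. The function `followsCounterNaming` and both metric names are hypothetical, chosen only for illustration; they are not taken from the etl codebase.

```go
package main

import (
	"fmt"
	"strings"
)

// followsCounterNaming reports whether a counter metric name ends in
// "_total", the suffix the Prometheus naming guide recommends for
// accumulating counts. Hypothetical helper, not part of etl.
func followsCounterNaming(name string) bool {
	return strings.HasSuffix(name, "_total")
}

func main() {
	// Hypothetical metric names used only for illustration.
	for _, name := range []string{"etl_panic_count", "etl_panic_total"} {
		fmt.Printf("%s follows naming: %v\n", name, followsCounterNaming(name))
	}
}
```

A check like this could run in a unit test over the registered metric names, assuming the project keeps such a list.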

Currently, `etl_worker` crashes in local development mode when a paris-traceroute archive is supplied as a URL. Steps to reproduce: 1. Navigate to cmd/etl_worker within the ETL project. 2. Run `go...

review/triage

Since https://github.com/m-lab/etl/pull/972, the ETL `Version` and `GitCommit` are compiled in at build time. The `Version` is always a human-readable symbolic name: either the branch (e.g. sandbox-soltesz, master) or...

enhancement
pipeline

New parsers should NOT annotate records, as they are annotated by joins in BQ. The K8S annotation-service should be shut down, and null-annotator should be used for 2.0 parsing tasks....

pipeline

We are currently seeing a low rate of GCS storage errors: ``` 2021/04/13 04:54:19 rowwriter.go:119: googleapi: got HTTP response code 503 with body: Service Unavailable etl-mlab-staging ndt/ndt7/2020/08/27/20200827T170704.505210Z-ndt7-mlab3-lhr05-ndt.tgz.json textPayload: "2021/04/13 04:54:19...

bug
pipeline

https://github.com/m-lab/etl/blob/5caa9cbbd394ec4f0f7cd1e82eeec6b26a21525b/task/task.go#L66-L66 Currently, these errors are dropped - not reported to Gardener. Among these errors are GCS write errors, reported on https://github.com/m-lab/etl/blob/5caa9cbbd394ec4f0f7cd1e82eeec6b26a21525b/storage/rowwriter.go#L168-L169

bug
pipeline

I've recently updated the descriptions for fields in https://github.com/m-lab/etl/tree/master/schema/descriptions There are a few marked `TBD` that need definitions written in the list of files below, and existing definitions should be...

pipeline
documentation
community-tools

The first web100 download row is 2009-07-02 and the first upload is 2009-02-18, as reported by SELECT * FROM `mlab-sandbox.inspector.union_ndt_prod_all` Also reproduced: SELECT MIN(date) FROM `measurement-lab.ndt.unified_downloads` WHERE date < '2009-09-01'...

bug
pipeline

The TCP RTT calculation changed from ms to µs. The "is bloated" test is supposed to be RTT > 1 second.

bug
pipeline
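
A minimal sketch of how the unit mismatch in the issue above can be avoided, assuming the RTT arrives as an integer count of microseconds. The names `rttFromMicros`, `isBloated`, and `bloatThreshold` are hypothetical and do not come from the etl source; the idea is to convert to `time.Duration` once so the 1 second threshold no longer depends on the raw unit.

```go
package main

import (
	"fmt"
	"time"
)

// bloatThreshold is the 1 second RTT limit named in the issue.
const bloatThreshold = time.Second

// rttFromMicros converts a raw RTT in microseconds to a time.Duration.
// Hypothetical helper: assumes the input really is in microseconds.
func rttFromMicros(us int64) time.Duration {
	return time.Duration(us) * time.Microsecond
}

// isBloated reports whether the RTT exceeds the threshold, comparing
// durations rather than unitless integers.
func isBloated(rtt time.Duration) bool {
	return rtt > bloatThreshold
}

func main() {
	fmt.Println(isBloated(rttFromMicros(1500)))    // 1.5 ms: false
	fmt.Println(isBloated(rttFromMicros(1500000))) // 1.5 s: true
}
```

Comparing typed durations means a later change of the input unit only touches the conversion helper, not the threshold check.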