etl
Oldest web100 upload and download dates differ by almost 5 months
The first web100 download row is dated 2009-07-02 and the first upload row is dated 2009-02-18, as reported by:
SELECT * FROM mlab-sandbox.inspector.union_ndt_prod_all
Also reproduced with:
SELECT MIN(date) FROM measurement-lab.ndt.unified_downloads
WHERE date < '2009-09-01'
I believe the February date is closer to correct.
This most likely reflects an NDT file format change around 2009-07-02, combined with a parser that is incompatible with the older data.
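
For reference, the size of the gap can be checked with a few lines of Python. This is only a sketch of the arithmetic using the two dates quoted above; the variable names and the mean-month-length constant are my own choices for illustration, not anything taken from the etl code.

```python
from datetime import date

# Earliest dates quoted in this issue.
first_upload = date(2009, 2, 18)
first_download = date(2009, 7, 2)

# Subtracting two dates gives a timedelta; .days is the whole-day gap.
gap_days = (first_download - first_upload).days

# Rough conversion to months using the mean Gregorian month length
# (30.44 days). This constant is an assumption for illustration only.
gap_months = gap_days / 30.44

print(f"{gap_days} days, about {gap_months:.1f} months")
# prints "134 days, about 4.4 months"
```

So the missing download data covers roughly four and a half months, which the title rounds up.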