etl
TCPInfo parser should include test filename, and dedup based on filename instead of UUID
The current parser and schema do not include the filename. A UUID identifies a connection, but a connection that stays open for more than 10 minutes may produce more than one file.
The dedup step in gardener currently keeps only one of those files: the one from the later tarfile, or the one with the most snapshots.
This is related to https://github.com/m-lab/etl-gardener/issues/158
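
The difference between the two dedup keys can be sketched in Go as follows. This is only an illustration, not the gardener implementation: the `Row` type, its fields, the `dedupBy` helper, and the sample filenames are all hypothetical, and the tie-break is simplified to "most snapshots" (the tarfile ordering is left out).

```go
package main

import "fmt"

// Row is a hypothetical, simplified stand-in for a parsed TCPInfo row.
// Filename is the field this issue proposes adding to the schema.
type Row struct {
	UUID      string
	Filename  string
	Snapshots int
}

// dedupBy keeps one row per key, preferring the row with the most snapshots.
// Output order follows the first appearance of each key.
func dedupBy(rows []Row, key func(Row) string) []Row {
	index := make(map[string]int) // key -> position in out
	var out []Row
	for _, r := range rows {
		k := key(r)
		if i, seen := index[k]; seen {
			if r.Snapshots > out[i].Snapshots {
				out[i] = r
			}
			continue
		}
		index[k] = len(out)
		out = append(out, r)
	}
	return out
}

func byUUID(r Row) string     { return r.UUID }
func byFilename(r Row) string { return r.Filename }

// sampleRows models one long-lived connection split across two files
// (one of which was parsed twice) and one short connection.
func sampleRows() []Row {
	return []Row{
		{UUID: "conn-A", Filename: "conn-A_part0.jsonl", Snapshots: 100},
		{UUID: "conn-A", Filename: "conn-A_part1.jsonl", Snapshots: 40},
		{UUID: "conn-A", Filename: "conn-A_part1.jsonl", Snapshots: 40}, // true duplicate
		{UUID: "conn-B", Filename: "conn-B_part0.jsonl", Snapshots: 10},
	}
}

func main() {
	rows := sampleRows()
	fmt.Println("rows kept when keyed by UUID:    ", len(dedupBy(rows, byUUID)))
	fmt.Println("rows kept when keyed by filename:", len(dedupBy(rows, byFilename)))
}
```

With this sample data, keying by UUID keeps 2 rows and silently drops the second file of `conn-A`, while keying by filename keeps 3 rows and removes only the true duplicate.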