elementary
`upload-source-freshness` can overflow CLI arguments max size, in some scenarios
Describe the bug
Occasionally the source freshness upload command,
edr run-operation upload-source-freshness --project-dir . --profile-target dev
exceeds the maximum argument size and fails with:
OSError: [Errno 7] Argument list too long: 'dbt'
In our deployment of Elementary, whether this happens depends on the particular subset of sources we run, so I suspect the model name lengths and other metadata are occasionally pushing the arguments over the limit.
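For anyone trying to confirm this is what they are hitting, here is a rough sketch of how to compare an argument list against the OS limit. The helper names are made up for illustration and are not part of Elementary. `os.sysconf("SC_ARG_MAX")` reports the combined limit for arguments plus environment on POSIX systems, so treat the result as an upper bound only; some systems also cap the size of a single argument separately.

```python
import os
from typing import List, Optional


def arg_max_bytes() -> Optional[int]:
    """Return the OS limit on combined argv + environment size, if known."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on Windows, and the name may be unsupported.
        return None


def estimated_argv_bytes(argv: List[str]) -> int:
    """Estimate the bytes an argv takes: each argument plus a NUL terminator."""
    return sum(len(arg.encode("utf-8")) + 1 for arg in argv)


if __name__ == "__main__":
    # Hypothetical argv, only to show the comparison.
    argv = ["dbt", "run-operation", "some_macro", "--args", "{...}"]
    print("estimated argv bytes:", estimated_argv_bytes(argv))
    print("SC_ARG_MAX:", arg_max_bytes())
```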
To Reproduce
By its nature, this error will only be reproduced when the source arguments contain a lot of data. I can't share an example, but this could probably be simulated.
Expected behavior
It should split the commands into batches small enough to stay under the argument size limit.
Environment (please complete the following information):
- edr Version: 0.13.2
- dbt package Version: 0.13.0
Additional context
I believe the underlying cause is that Elementary currently chunks up the batches based on the number of records, into chunks of 100 records, regardless of record length.
https://github.com/elementary-data/elementary/blob/3220d4b1f87ff52855b1f5d1ae1171cf652f603e/elementary/operations/upload_source_freshness.py#L61
I've got a patch which fixes this by calculating the character length of the records and chunking based on that. I will upload a PR shortly.
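To illustrate the idea (this is a minimal sketch, not the code from the patch), chunking by serialized length could look roughly like the following. The function name, the `max_chars` budget and the use of `json.dumps` as the size measure are all assumptions made for the example. It counts characters rather than bytes, so non-ASCII data would need a more conservative budget.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def chunk_by_serialized_size(
    records: Iterable[Dict[str, Any]],
    max_chars: int,
    max_records: int = 100,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield chunks whose JSON form stays within max_chars where possible.

    A single record larger than max_chars is still yielded, alone.
    """
    chunk: List[Dict[str, Any]] = []
    chunk_chars = 2  # the enclosing "[" and "]"
    for record in records:
        record_chars = len(json.dumps(record))
        # json.dumps separates list items with ", " (2 characters).
        added = record_chars + (2 if chunk else 0)
        too_long = chunk_chars + added > max_chars
        too_many = len(chunk) >= max_records
        if chunk and (too_long or too_many):
            yield chunk
            chunk = []
            chunk_chars = 2
            added = record_chars
        chunk.append(record)
        chunk_chars += added
    if chunk:
        yield chunk


if __name__ == "__main__":
    # Hypothetical records, only to show how the chunks come out.
    rows = [{"id": i, "name": "x" * 50} for i in range(10)]
    for batch in chunk_by_serialized_size(rows, max_chars=200):
        print(len(batch), len(json.dumps(batch)))
```

Keeping the record-count cap alongside the size budget means short records still behave as they do today, while long records produce smaller batches.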
We are running into the same thing - except we have it every single time and it blocks us from using this functionality which is our main reason for choosing elementary!
Hi all, closing this since the default chunk size is now lower and can be controlled with the rows_per_insert parameter.
The approach in the PR above is probably better but needs some adjustments (commented in more detail in the PR) - so feel free to re-open if you still feel it's needed.