Markus Klein
Hello @vikramhn , feel free to evaluate and choose any tool you like for your use case. This `odbc2parquet` is only concerned with ODBC and Parquet. As such I might be interested...
Closing this issue. I hope this does not discourage anyone from sharing benchmarks or interesting findings.
Hello, I can look at this later, but your report has been very helpful so far. May I draw your attention to the `--batch_size` parameter? Its default value is 100000...
I think you are far from stupid, and I am happy you raised the issue. I do not know how much I can do to make this work out of...
How would you feel about specifying the `--batch-size` in memory rather than in a number of rows?
So far my strategy for handling these large columns and memory allocations in general is:

* Calculate and log the amount of memory required per row
* Make batch_size configurable...
The newest version allows specifying the desired memory usage. It defaults to 2 GiB on 64-bit platforms. There is still more that could be done, both in terms of either streaming...
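To give a feel for what the memory-based sizing does, here is a simplified sketch (the function name and the per-row figure are made up for illustration, not the actual implementation): the memory budget is divided by the buffer space a single row requires, and that quotient becomes the number of rows fetched per batch.

```rust
/// Simplified sketch: turn a memory budget into a row count per batch.
/// `bytes_per_row` would be derived from the ODBC column descriptions
/// (declared lengths of text/binary columns plus the fixed-size fields).
fn rows_per_batch(memory_budget: usize, bytes_per_row: usize) -> usize {
    // Always fetch at least one row, even for extremely wide rows.
    (memory_budget / bytes_per_row).max(1)
}

fn main() {
    let budget = 2 * 1024 * 1024 * 1024; // the 2 GiB default on 64-bit platforms
    let bytes_per_row = 4 * 1024 * 1024; // 4 MiB per row, purely illustrative
    println!("{} rows per batch", rows_per_batch(budget, bytes_per_row)); // prints 512
}
```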
Hi, thanks for the great user story! I already try to map timestamps as best I can. I fully empathise with your use case. You wouldn't happen to know to...
I see, this is why the timestamp logic of `odbc2parquet` does not trigger. It is just considered an "other" type and fetched as a string. One way of tackling this could...
If you could find out what the struct for PostgreSQL would look like, we may be able to support that, too.
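For reference, the timestamp layout ODBC itself defines (`SQL_TIMESTAMP_STRUCT` in the C headers) looks like this when written out as a Rust struct; the field list is the standard ODBC one, while the struct name here is only illustrative. A PostgreSQL-specific type carrying a timezone would presumably need a comparable description before it could be bound directly instead of being fetched as text.

```rust
/// The timestamp structure the ODBC standard defines (SQL_TIMESTAMP_STRUCT),
/// written out as a #[repr(C)] Rust struct for illustration.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct OdbcTimestamp {
    pub year: i16,     // SQLSMALLINT
    pub month: u16,    // SQLUSMALLINT
    pub day: u16,      // SQLUSMALLINT
    pub hour: u16,     // SQLUSMALLINT
    pub minute: u16,   // SQLUSMALLINT
    pub second: u16,   // SQLUSMALLINT
    pub fraction: u32, // SQLUINTEGER, billionths of a second (nanoseconds)
}
```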