tpcds-kit
Re-enable printing to stdout
Getting this to work correctly is more complicated than this change suggests. The challenge is that the sales/returns tables are generated in pairs, and there is no way to generate only one table of a pair. I've recently been using https://github.com/teradata/tpcds because it is much faster than the TPC version written in C (surprising, I know).
Yes, for now I resorted to the rather ugly approach of relying on the fact that the child tables all have a different number of fields from their parents, which is just nasty. I'm going straight from dsdgen to Parquet using Spark, which takes away the requirement for passwordless SSH for distribution etc., but I will definitely check out the link, thanks.
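
For anyone following along, here is a rough sketch of the field-count trick in plain Python. It is not the code from this PR, just an illustration: `split_by_field_count` and the `count_to_table` mapping are hypothetical names, and it assumes the rows are pipe-delimited and that each row may end with one trailing delimiter, which is dropped before counting. The real column counts have to come from the schema you are generating.

```python
from collections import defaultdict


def split_by_field_count(lines, count_to_table, delimiter="|"):
    """Route each delimited row to a table name chosen by its field count.

    count_to_table is a hypothetical mapping such as {23: "parent", 20: "child"};
    the actual counts must be taken from the schema in use.
    """
    tables = defaultdict(list)
    for line in lines:
        row = line.rstrip("\n")
        if not row:
            continue  # skip blank lines
        # Assumption: a row may end with one trailing delimiter; drop it
        # so it is not counted as an extra empty field.
        if row.endswith(delimiter):
            row = row[: -len(delimiter)]
        fields = row.split(delimiter)
        table = count_to_table.get(len(fields))
        if table is None:
            raise ValueError(f"unexpected field count {len(fields)}: {line!r}")
        tables[table].append(fields)
    return dict(tables)


if __name__ == "__main__":
    # Toy interleaved stream: 3-field "sales" rows and 2-field "returns" rows.
    mixed = ["1|10|9.99|\n", "1|10|\n", "2|11||\n"]
    print(split_by_field_count(mixed, {3: "sales", 2: "returns"}))
```

The obvious weakness is the one mentioned above: it breaks as soon as a parent and child table have the same number of fields, so the sketch raises an error on any count it does not recognise rather than guessing.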