Liran

Results 46 comments of Liran

We don't have such a feature, but it's a cool idea: creating fake micro-batches in metorikku to stream from non-streaming sources.
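To make the idea concrete, here is a minimal sketch in plain Python of what "fake micro-batches" over a static source could look like. It is only an illustration: metorikku has no such feature, and the `micro_batches` helper and its parameters are hypothetical names, not part of any library.

```python
from itertools import islice


def micro_batches(rows, batch_size):
    """Yield consecutive lists of at most batch_size items from rows.

    Hypothetical helper: it only illustrates slicing a static,
    non-streaming source into small batches processed one at a time.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # source exhausted
        yield batch


if __name__ == "__main__":
    # A static "table" of five rows, consumed as if it were a stream.
    for batch in micro_batches(range(5), 2):
        print(batch)
```

Each yielded batch could then be handed to the same processing step a streaming job would use.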

Can you share your environment details (YARN/k8s/spark-shell)? Which Hive are you using? What version of Spark? Thanks

So Hive is configured correctly in spark-defaults etc.? If so, all you need to do to save to Hive is set tableName in your output: https://github.com/YotpoLtd/metorikku/blob/7fdd0838480e89285e5c7b77cc3e45d54d6bf5a0/config/metric_config_sample.yaml#L50

So your input is: ``` input1: s3://mybucket/a/b/c/d/e/col1=x/col2=y/some1.csv,s3://mybucket/a/b/c/d/e/col1=a/col2=b/some2.csv,... ``` ? And then in your select you're getting "can't find col1", right? Does it work without the schema file? Can you share some...
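For context on why `col1` is expected as a column at all: in a Hive-style partitioned layout, the `key=value` directory names carry the column values, not the files themselves. The sketch below shows that encoding in plain Python. It is not Spark's or metorikku's implementation, and `partition_values` is a hypothetical helper used only for illustration.

```python
from urllib.parse import urlparse


def partition_values(path: str) -> dict:
    """Collect key=value directory segments from a Hive-style path.

    Hypothetical helper: illustrates how partition columns are encoded
    in directory names. It does not reproduce Spark's discovery logic.
    """
    segments = urlparse(path).path.strip("/").split("/")
    values = {}
    for segment in segments[:-1]:  # skip the file name itself
        key, sep, value = segment.partition("=")
        if sep and key:
            values[key] = value
    return values


if __name__ == "__main__":
    paths = [
        "s3://mybucket/a/b/c/d/e/col1=x/col2=y/some1.csv",
        "s3://mybucket/a/b/c/d/e/col1=a/col2=b/some2.csv",
    ]
    for p in paths:
        print(p, partition_values(p))
```

If the reader does not derive these values from the directory names, a select on `col1` has nothing to resolve against, which would match the error described.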

I guess I haven't tested with Spark 2.3 in a while. Let me see if I can get a fix working for Spark 2.3.

@Irenez753 what's the status of this PR?

Hi @tooptoop4, sorry for the late response. I actually have no idea how you would read a Hudi table in Spark without Hive... I think that's the only way for now,...

What's the error you're getting? You may need to add something like `**/*.parquet` to the path.

I never tried this, but if you're on EMR and Glue is connected, it will simply work...