Liran comments

Results: 46 comments by Liran

how to execute metorikku in a loop from a single spark-submit?

We don't have such a feature, but it's a cool idea: creating fake micro-batches in Metorikku would let you stream from non-streaming sources.
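To make the "fake micro-batches" idea concrete, here is a minimal sketch of a driver loop that calls a job at a fixed interval. It is not a Metorikku feature: `run_micro_batches`, `job`, `interval_seconds` and `max_batches` are hypothetical names, and `job` stands in for whatever would run one batch.

```python
import time
from typing import Callable


def run_micro_batches(
    job: Callable[[int], None],
    interval_seconds: float,
    max_batches: int,
) -> int:
    """Call `job` once per batch, starting batches roughly every `interval_seconds`.

    `job` is a hypothetical stand-in for one batch run; it receives the batch id.
    Returns the number of batches that completed.
    """
    completed = 0
    for batch_id in range(max_batches):
        started = time.monotonic()
        job(batch_id)
        completed += 1
        # Sleep only for whatever is left of the interval, and not after the last batch.
        remaining = interval_seconds - (time.monotonic() - started)
        if batch_id < max_batches - 1 and remaining > 0:
            time.sleep(remaining)
    return completed
```

A job that takes longer than the interval simply runs back to back with the next one; the sketch does not try to catch up or skip batches.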

How we can read and write the data to Hive?

Can you share your environment details (YARN, k8s, or spark-shell)? Which Hive are you using, and which version of Spark? Thanks

How we can read and write the data to Hive?

So Hive is configured correctly in spark-defaults etc.? If so, all you need to do to save to Hive is set tableName in your output: https://github.com/YotpoLtd/metorikku/blob/7fdd0838480e89285e5c7b77cc3e45d54d6bf5a0/config/metric_config_sample.yaml#L50

how to use schemaJson and get partition columns from s3?

So your input is:

```
input1: s3://mybucket/a/b/c/d/e/col1=x/col2=y/some1.csv,s3://mybucket/a/b/c/d/e/col1=a/col2=b/some2.csv,...
```

And then your select fails because it can't find col1, right? Does it work without the schema file? Can you share some...
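For reference, the partition columns in paths like these are encoded as `key=value` directory names. The sketch below shows which columns and values such a path carries. It is a plain-Python illustration, not Spark's or Metorikku's own discovery logic, and `partition_values` is a hypothetical helper.

```python
from typing import Dict
from urllib.parse import urlparse


def partition_values(path: str) -> Dict[str, str]:
    """Extract Hive-style `key=value` directory segments from a file path.

    Hypothetical helper for illustration only; the last segment is treated
    as the file name and ignored.
    """
    segments = urlparse(path).path.strip("/").split("/")
    values: Dict[str, str] = {}
    for segment in segments[:-1]:
        key, separator, value = segment.partition("=")
        if separator and key:
            values[key] = value
    return values


print(partition_values("s3://mybucket/a/b/c/d/e/col1=x/col2=y/some1.csv"))
```

If a column such as col1 exists only in the directory names and not inside the CSV files, it has to come from partition discovery rather than from the file schema.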

how to use schemaJson and get partition columns from s3?

Can you attach your schema file?

Do all the features of metorikku (v0.0.98) work on spark 2.3.2?

I guess I haven't tested with Spark 2.3 in a while. Let me see if I can get a fix working for Spark 2.3.

feat(InstrumentationOutputWriter): handle multiple fields

@Irenez753 what's the status of this PR?

how to read from a hudi input?

Hi @tooptoop4, sorry for the late response. I actually have no idea how to read a Hudi table in Spark without Hive... I think that's the only way for now,...

how to read from a hudi input?

What's the error you're getting? You may need to add something like `**/*.parquet` to the path

support for reading tables from Glue metastore

I've never tried this, but if you're on EMR and Glue is connected, it will simply work...