sbarnoud comments

Results: 25 comments by sbarnoud

Object Storage - High Performance

Any news on that topic? We have been waiting for a solution for a while now, and nothing seems to have been announced ...

[Need help] Join HBase dataframe with a Structured Stream

Hi, in my opinion you should replace .select('id', col('hbase_version').cast('integer')) with .selectExpr('id', 'cast(hbase_version as integer) as hbase_version'), or use .alias(). Did you try with the latest HBaseSync from https://github.com/hortonworks-spark/shc/pull/238? I'm wondering .option('hbasecat',...

Dataframe integer columns are not loading properly

Everything seems correct. Just specify a formatter in your HBase scan so that integer values are printed as integers and not as bytes.
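
To illustrate the point about bytes versus integers: an integer cell is stored as raw bytes, so an unformatted scan prints the byte string rather than the number. The sketch below uses only Python's standard library and assumes the value was written as a 4-byte big-endian signed integer (the layout produced by HBase's `Bytes.toBytes(int)`). The function names are hypothetical.

```python
import struct

def encode_int_cell(value: int) -> bytes:
    """Encode an int as 4 big-endian bytes (assumed cell layout)."""
    return struct.pack(">i", value)

def decode_int_cell(raw: bytes) -> int:
    """Decode a 4-byte big-endian cell back into an int."""
    (value,) = struct.unpack(">i", raw)
    return value

raw = encode_int_cell(42)
print(raw)                   # the raw bytes, as an unformatted scan shows them
print(decode_int_cell(raw))  # the integer, as a formatted scan shows it
```

The HBase shell's per-column formatter (for example a `:toInt` suffix on the column name in `COLUMNS`) should perform the same decoding at display time, so the stored data itself does not need to change.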

Add a unit test using a dataset with a field order different than in …

This test deliberately fails, to show that Avro serialization depends on the field order in the dataset, which in my opinion is not the expected behavior. I propose in https://github.com/hortonworks-spark/shc/pull/248 a...
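
A toy illustration of why field order can matter: Avro's binary encoding writes record values in schema order without field names, so a serializer that pairs dataset values with schema fields by position is order-sensitive. The sketch below is not SHC's code. It assumes the failure comes from positional pairing, and the schema, row, and function names are all hypothetical.

```python
def encode_positional(row_values, schema_fields):
    """Pair values with schema fields purely by position (order-sensitive)."""
    return dict(zip(schema_fields, row_values))

def encode_by_name(row, schema_fields):
    """Look each schema field up by name in the row (order-insensitive)."""
    return {name: row[name] for name in schema_fields}

schema_fields = ["id", "name"]    # hypothetical Avro schema order
row = {"name": "alice", "id": 1}  # dataset with a different field order

positional = encode_positional(list(row.values()), schema_fields)
by_name = encode_by_name(row, schema_fields)

print(positional)  # values end up under the wrong field names
print(by_name)     # values stay attached to their own field names
```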

Add a unit test using a dataset with a field order different than in …

May I have some feedback? Travis failed, but that is expected: the test is intended to fail, and a separate PR proposes a fix.

Custom sink provider for structured streaming

You forget that SHC supports Avro schemas. The user should be able to pass any key in options to define them.

Custom sink provider for structured streaming

I propose for options to use:

```
class HBaseStreamSink(sqlContext: SQLContext, options: Map[String, String]) extends Sink {
  val defaultFormat = "org.apache.spark.sql.execution.datasources.hbase"
  val specifiedHBaseParams = options
    .keySet
    .filter(_.toLowerCase(Locale.ROOT).startsWith("hbase."))
    .map { k => k.drop(6).toString...
```
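
The visible part of the proposal keeps only the option keys that start with `hbase.` (compared case-insensitively) and drops that 6-character prefix. A minimal Python sketch of the same idea follows. What the original does after `k.drop(6)` is cut off, so pairing each stripped key with its original value is an assumption here, and the function name and option keys are hypothetical.

```python
def extract_hbase_params(options: dict) -> dict:
    """Keep options whose key starts with 'hbase.' and strip that prefix."""
    prefix = "hbase."
    return {
        key[len(prefix):]: value  # strip the prefix, keep the key's own casing
        for key, value in options.items()
        if key.lower().startswith(prefix)  # case-insensitive prefix match
    }

options = {
    "hbase.catalog": "<catalog json>",     # hypothetical option keys
    "HBase.avroSchema": "<avro schema>",
    "checkpointLocation": "/tmp/checkpoint",
}
params = extract_hbase_params(options)
print(params)  # only the prefixed options remain, without the prefix
```

Passing every prefixed key through unchanged means a user can supply Avro schema keys, or any other key, without the sink needing to know about them in advance.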

Custom sink provider for structured streaming

Then,
-) the sink doesn't update Spark counters ... it should be done somewhere
-) the sink short name is not registered

And finally, the Avro type is not working because...

Custom sink provider for structured streaming

I just declared the Avro schema with its fields in alphabetical order of field names, and it works. I have no idea where the problem is, but it has to be fixed.

```
{"namespace":...
```
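
The alphabetical-order workaround can be applied mechanically to a schema string. The sketch below uses only Python's `json` module. The schema shown is a hypothetical stand-in for the truncated one, and `sort_schema_fields` is an illustrative helper, not part of SHC.

```python
import json

def sort_schema_fields(schema_json: str) -> str:
    """Return the record schema with its top-level fields sorted by name."""
    schema = json.loads(schema_json)
    schema["fields"] = sorted(schema["fields"], key=lambda f: f["name"])
    return json.dumps(schema)

# Hypothetical schema with fields declared out of alphabetical order.
schema = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "id", "type": "int"},
    ],
})

sorted_schema = json.loads(sort_schema_fields(schema))
field_names = [f["name"] for f in sorted_schema["fields"]]
print(field_names)  # field names in alphabetical order
```

Reordering fields changes the binary layout of the encoded records, so this only makes sense as a workaround if writers and readers all use the reordered schema.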

Custom sink provider for structured streaming

Hi, for counters:

```
@Override
public void onQueryProgress(QueryProgressEvent event) {
    log.info("QueryProgressEvent event :" + event.progress().numInputRows());
}
```

QueryProgressEvent contains some counters like numInputRows ... that are not updated. For Avro, I found...