sbarnoud

Results 25 comments of sbarnoud

Any news on that topic? We have been waiting for a solution for a while now, and nothing seems to have been announced ...

Hi, in my opinion you should replace .select('id', col('hbase_version').cast('integer')) with .selectExpr('id','cast(hbase_version as integer) as hbase_version') or use .alias(). Did you try with the latest HBaseSync from https://github.com/hortonworks-spark/shc/pull/238? I'm wondering .option('hbasecat',...

Everything seems correct. Just specify a formatter in your HBase scan so that integer values are printed as integers and not as bytes.

This test fails intentionally, to show that Avro serialization depends on the field order in the dataset, which in my opinion is not the expected behavior. I propose in https://github.com/hortonworks-spark/shc/pull/248 a...
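The order dependence described above can be illustrated with a toy example in plain Python. This is not SHC or Avro code and makes no claim about the actual cause of the bug; the function names `encode_positional` and `encode_by_name` are hypothetical. It only shows how pairing values with schema fields by position makes the result depend on column order, while a lookup by name does not.

```python
def encode_positional(schema_fields, row_values):
    """Pair schema field names with values purely by position (toy model)."""
    return dict(zip(schema_fields, row_values))


def encode_by_name(schema_fields, row):
    """Look each schema field up by name, so column order does not matter."""
    return {name: row[name] for name in schema_fields}


# Field order as declared in a (hypothetical) schema.
schema_fields = ["id", "name"]

# The same record, but with the dataset columns in a different order.
row = {"name": "alice", "id": 1}

# Positional pairing puts the values under the wrong field names.
positional = encode_positional(schema_fields, list(row.values()))

# Name-based lookup yields the intended mapping regardless of order.
by_name = encode_by_name(schema_fields, row)

if __name__ == "__main__":
    print("positional:", positional)
    print("by name:   ", by_name)
```

Under this toy model, declaring the schema fields in the same order as the dataset columns would hide the mismatch, which is why a test with a different column order is useful.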

May I have some feedback? Travis failed, but this is expected: the test is intended to fail, and a PR proposes a fix.

You forget that SHC supports Avro schemas. The user should be able to pass any key in options to define them.

For the options, I propose using:
```
class HBaseStreamSink(sqlContext: SQLContext, options: Map[String, String]) extends Sink {
  val defaultFormat = "org.apache.spark.sql.execution.datasources.hbase"
  val specifiedHBaseParams = options
    .keySet
    .filter(_.toLowerCase(Locale.ROOT).startsWith("hbase."))
    .map { k => k.drop(6).toString...
```
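The prefix-filtering idea in the proposal above can be sketched in plain Python, independent of Spark. This is only an illustration: the helper name `extract_hbase_params` and the sample keys are hypothetical, and pairing each stripped key with its original value is an assumption, since the snippet is cut off before that step.

```python
def extract_hbase_params(options):
    """Keep only options whose key starts with "hbase." (case-insensitive)
    and strip that prefix from the key.

    Hypothetical helper: mapping the stripped key to the original value
    is an assumption, not something shown in the truncated snippet.
    """
    prefix = "hbase."
    return {
        key[len(prefix):]: value
        for key, value in options.items()
        if key.lower().startswith(prefix)
    }


if __name__ == "__main__":
    # Sample keys are made up for the demonstration.
    opts = {
        "hbase.first": "1",    # forwarded as "first"
        "HBase.Second": "2",   # mixed-case prefix still matches
        "unrelated": "3",      # no prefix, so it is dropped
    }
    print(extract_hbase_params(opts))
```

With this approach the sink does not need to know the option names in advance: any key carrying the prefix is passed through, which is what would let users supply arbitrary Avro schema keys.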

Then,
-) the sink doesn't update Spark counters ... it should be done somewhere
-) the sink short name is not registered

And finally, the Avro type is not working because...

I just declared the Avro schema with the fields in alphabetical order of their names, and it works. No idea where the problem is, but it has to be fixed.
```
{"namespace":...
```

Hi,

For counters:
```
@Override
public void onQueryProgress(QueryProgressEvent event) {
    log.info("QueryProgressEvent event :" + event.progress().numInputRows());
}
```
QueryProgressEvent contains some counters, like numInputRows, that are not updated. For Avro, I found...