shc
Is SHC REALLY production ready?
Hi,
Looking at the trivial issues posted here and digging into the source code, it doesn't seem that SHC is production ready at all. It was really surprising, for example, to find out that SHC treats 'greater-or-equal' the same as 'greater'. This critical behavior is not documented anywhere except in a comment deep in the code. It could have led us to serious production issues had we deployed it as is. SHC seems to be a very good solution for bringing Spark SQL to HBase, and we'd really like to integrate it into our system, but the deeper we dig, the more we get the impression that substantial work remains before we can deploy it.
I'd really like to hear your opinion about this. Thanks a lot.
Hi @shay1bz
SHC has already been used in production. Do you mean treating 'greater-or-equal' the same as 'greater'? Would the serious production issues you mention be correctness issues or performance issues?
@weiqingy thank you for your response.
Yes, I am talking about the design above, which would lead to correctness issues for us.
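To make the correctness concern concrete, here is a toy Python sketch. It is not SHC or HBase code: the sorted `row_keys` list and both scan helpers are hypothetical stand-ins for a range scan over sorted row keys. It only shows that an inclusive and an exclusive lower bound differ by exactly the rows equal to the bound, which is what goes wrong if the two operators are conflated.

```python
from bisect import bisect_left, bisect_right

# Toy stand-in for row keys kept in sorted order (hypothetical data).
row_keys = [10, 20, 30, 40]


def scan_greater_or_equal(keys, bound):
    """Return keys with key >= bound (inclusive lower bound)."""
    return keys[bisect_left(keys, bound):]


def scan_greater(keys, bound):
    """Return keys with key > bound (exclusive lower bound)."""
    return keys[bisect_right(keys, bound):]


inclusive = scan_greater_or_equal(row_keys, 20)
exclusive = scan_greater(row_keys, 20)

# Rows that one operator returns and the other does not.
boundary_rows = [k for k in inclusive if k not in exclusive]

print(inclusive)      # [20, 30, 40]
print(exclusive)      # [30, 40]
print(boundary_rows)  # [20]
```

A check of this kind against a real table, querying with a bound that is known to exist as a key, is one way to see which semantics a connector actually applies.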
I am having an issue with the SHC jar; it fails with the error below: java.lang.UnsupportedOperationException: PrimitiveType coder: unsupported data type null
Could you please advise whether there is an issue with the jar version (com.hortonworks:shc-core:1.1.1-2.1-s_2.11)?
@kpspark I guess we need to handle nulls in the dataframe before writing it to the HBase table.
@spenumala This PR has handled "null": https://github.com/hortonworks-spark/shc/commit/7fa435994e8233f3891e8675a301cd6101ff10ed
@weiqingy I'm currently using version com.hortonworks:shc-core:1.1.1-2.1-s_2.11 and I'm still facing the null issue. Any pointers?