azure-tables-hadoop Performance against large Table Storage collection

Open claytonrothschild opened this issue 10 years ago • 1 comments

We are having trouble getting this library to perform against a Table Storage collection that has about 2 million records in it. Each record is approximately 4 KB.

For example, a simple SELECT ... LIMIT 10 statement times out on a 7-node HDInsight cluster. Has anyone tried using this library yet, and if so, are you seeing similar results? Perhaps we are not using it properly.

Thanks and regards, Clayton

Feb 19 '15 22:02 claytonrothschild

I think the performance issue is related to how DefaultTablePartitioner.java retrieves all of the partitions. Maybe you should rewrite that code based on the partitioning logic of your own table.

Jan 25 '16 22:01 caserzer
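
To make the suggestion above concrete, here is a minimal sketch of deriving query ranges from a known partition-key scheme instead of enumerating every partition. It assumes you already know a few sorted boundary values for your PartitionKey; the class and method names (PartitionRangeFilters, buildFilters, rangeFilter) are hypothetical and are not part of azure-tables-hadoop. Only the filter strings use real Table Storage OData syntax. How these filters are wired into the library's partitioner is not shown, since that depends on its actual interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper (not part of azure-tables-hadoop): builds one OData
 * filter per partition-key range, so that each input split could query a
 * bounded slice of the table rather than discovering all partitions first.
 */
public class PartitionRangeFilters {

    /**
     * boundaries must be sorted ascending, e.g. ["h", "p"].
     * Produces filters covering (-inf, h), [h, p), [p, +inf).
     * An empty boundary list yields a single empty filter (full scan).
     */
    public static List<String> buildFilters(List<String> boundaries) {
        List<String> filters = new ArrayList<>();
        String lower = null;
        for (String upper : boundaries) {
            filters.add(rangeFilter(lower, upper));
            lower = upper;
        }
        filters.add(rangeFilter(lower, null)); // open-ended last range
        return filters;
    }

    /** Builds "PartitionKey ge 'lower' and PartitionKey lt 'upper'"; null means unbounded. */
    static String rangeFilter(String lower, String upper) {
        List<String> clauses = new ArrayList<>();
        if (lower != null) {
            clauses.add("PartitionKey ge '" + escape(lower) + "'");
        }
        if (upper != null) {
            clauses.add("PartitionKey lt '" + escape(upper) + "'");
        }
        return String.join(" and ", clauses);
    }

    /** OData string literals escape a single quote by doubling it. */
    static String escape(String value) {
        return value.replace("'", "''");
    }

    public static void main(String[] args) {
        // Example boundaries are assumptions; choose them from your own key scheme.
        for (String filter : buildFilters(Arrays.asList("h", "p"))) {
            System.out.println(filter);
        }
    }
}
```

Each filter describes one contiguous range of partition keys, so the number of splits is set by the number of boundaries you choose rather than by the number of partitions in the table. Whether this helps in practice depends on how evenly your keys are distributed across those ranges.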
