myexcel
How can dynamic columns be handled when exporting large volumes of HBase data?
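Description: I am exporting data from HBase, on the order of millions of rows, and the export must split into sheets automatically. The trouble is that different HBase rows have different columns: HBase is a column-oriented store, so the set of columns can vary from row to row. For example, the first row might have columns A, B, and C, while the second row has only A and F. As a result, I ran into a problem with titles during the export.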
Reproduction example
List<String> bt = Arrays.asList("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K");

// Simulates HBase data whose columns are not fixed:
// page N yields rows with the first N columns of bt (capped at index 10).
List<Map> getRows(int page, int size) {
    List<Map> rows = new ArrayList<>();
    for (int i = 0; i < size; i++) {
        Map<String, Object> m = new HashMap<>();
        long k = 0;
        for (int j = 0; j < page; j++) {
            int t = j;
            if (t > 10) {
                t = 10;
            }
            String key = bt.get(t);
            String value = page + "_" + key + "_" + (i + 1) + "_" + (++k);
            m.put(key, value);
        }
        rows.add(m);
    }
    return rows;
}
@Test
void t001() throws IOException {
    List<String> titles = new ArrayList<>();
    DefaultStreamExcelBuilder<Map> streamExcelBuilder = DefaultStreamExcelBuilder.of(Map.class);
    streamExcelBuilder.noStyle();
    streamExcelBuilder.capacity(10000);
    streamExcelBuilder.titles(titles);
    streamExcelBuilder.start();
    for (int i = 0; i < 10; i++) {
        List<Map> rows = getRows(i, 10);
        for (Map row : rows) {
            for (Object key : row.keySet()) {
                if (!titles.contains(key.toString())) {
                    // Extend the header with any new column found in the returned rows
                    titles.add(key.toString());
                }
            }
        }
        streamExcelBuilder.append(rows);
    }
    Workbook workbook = streamExcelBuilder.build();
    FileExportUtil.export(workbook, new File("d:/tmp/1.xlsx"));
    streamExcelBuilder.close();
}
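Expected result: The export succeeds and the header is correct. The header can be the de-duplicated union of the columns of all rows.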
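To make the expected header concrete, here is a minimal sketch of computing "the de-duplicated union of the columns of all rows" in plain Java. It uses only `java.util` and is not part of myexcel: `HeaderUnion`, `collectTitles`, and `align` are hypothetical names introduced for illustration. It also assumes the data (or at least the column names) can be scanned once before any row is written, which may or may not be practical for a table with millions of rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not a myexcel class.
class HeaderUnion {

    /** Pass 1: collect the de-duplicated union of column names, in first-seen order. */
    static List<String> collectTitles(Iterable<? extends Map<String, ?>> rows) {
        Set<String> titles = new LinkedHashSet<>();
        for (Map<String, ?> row : rows) {
            titles.addAll(row.keySet());
        }
        return new ArrayList<>(titles);
    }

    /** Pass 2: align one row to the header; columns the row lacks become null. */
    static List<Object> align(Map<String, ?> row, List<String> titles) {
        List<Object> cells = new ArrayList<>(titles.size());
        for (String title : titles) {
            cells.add(row.get(title));
        }
        return cells;
    }

    public static void main(String[] args) {
        Map<String, Object> r1 = new LinkedHashMap<>();
        r1.put("A", 1);
        r1.put("B", 2);
        r1.put("C", 3);
        Map<String, Object> r2 = new LinkedHashMap<>();
        r2.put("A", 4);
        r2.put("F", 5);

        List<Map<String, Object>> rows = Arrays.asList(r1, r2);
        List<String> titles = collectTitles(rows);
        System.out.println(titles);            // [A, B, C, F]
        System.out.println(align(r2, titles)); // [4, null, null, 5]
    }
}
```

With the rows from the description (A, B, C and A, F), the header becomes A, B, C, F, and the second row is padded with empty cells for B and C. Whether the stream builder can accept a header that is fixed this way before `start()`, or one that keeps growing while rows are appended, is exactly the open question of this issue.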