easyexcel
easyexcel copied to clipboard
读取大文件耗时很久
建议先去看文档
异常代码
EasyExcel.read(file, new ReadListener() {
@Override
public void invoke(Object data, AnalysisContext context) {
System.out.println(data);
throw new ExcelAnalysisStopException();
}
@Override
public void doAfterAllAnalysed(AnalysisContext context) {
}
}).sheet().doRead();
异常提示
问题描述
就这段代码,读取一个160MB的xlsx,100W行、48列,耗时40秒。我只想读取第一行然后就结束读取,这耗时让我怀疑我是不是用错方式了
我用以下方法来读取就很快,2~4秒就能出结果,能说说差异在哪吗?
public static Map<String, String> firstMap(File file) {
OPCPackage pkg = null;
InputStream sheet = null;
final Map<String, String> titleMap = new HashMap<>();
try {
pkg = OPCPackage.open(file, PackageAccess.READ);
XSSFReader reader = new XSSFReader(pkg);
sheet = reader.getSheet("rId1");
XMLReader parser = XMLReaderFactory.createXMLReader();
ReadOnlySharedStringsTable rosst = new ReadOnlySharedStringsTable(pkg);
parser.setContentHandler(new XSSFSheetXMLHandler(reader.getStylesTable(),rosst,new XSSFSheetXMLHandler.SheetContentsHandler() {
@Override
public void cell(String index, String value, XSSFComment comment) {
titleMap.put(index, value);
}
@Override
public void startRow(int arg0) {
}
@Override
public void endRow(int arg0) {
throw new ArithmeticException();
}
},
false));
try {
parser.parse(new InputSource(sheet));
} catch(ArithmeticException e) {
}
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
sheet.close();
} catch (IOException e) {
}
try {
pkg.close();
} catch (IOException e) {
}
return titleMap;
}
}
因为他源码里在读取的时候在一开始就解析了完整的sharestring,你的代码是自己处理了sharestring
可以尝试换成EEC来处理这类问题, https://github.com/wangguanquan/eec 它支持filter, map, collect等标准的流式处理,可以像操作集合类一样读取Excel
try (ExcelReader reader = ExcelReader.read(Paths.get("/export/excel-1000k.xlsx"))) {
// 指定第1行为表头并获取表头
HeaderRow row = (HeaderRow) reader.sheet(0).header(1).getHeader();
System.out.println(row);
// 获取第100行的数据
Row row100 = reader.sheet(0).rows().filter(r -> r.getRowNum() == 100).findAny().get();
System.out.println(row100);
// 将行数据转换为指定对象
Item item = row100.to(Item.class);
}
感谢各位,我去研究研究