tools-ocr

Jar built on Linux crashes when run

Open nuclear06 opened this issue 2 years ago • 2 comments

  • OS info

    ❯ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description: Ubuntu 22.04.3 LTS
    Release: 22.04
    Codename: jammy

  • JDK

    ❯ java -version
    java version "1.8.0_351"
    Java(TM) SE Runtime Environment (build 1.8.0_351-b10)
    Java HotSpot(TM) 64-Bit Server VM (build 25.351-b10, mixed mode)

  • Steps to reproduce

    1. Build with the commands given in the README
    mkdir target\jfx\app
    cp -r models target\jfx\app
    mvn jfx:native -DskipTests -f pom.xml
    
    1. After the build, the directory looks like this (some paths omitted)
    .
    ├── app.properties
    ├── docs
    ├── LICENSE
    ├── models
    ├── package.json
    ├── pom.xml
    ├── src
    ├── target
    │   ├── jfx
    │   │   ├── app
    │   │   └── native
    │   └── tools-ocr-2.2.9.jar
    └── targetjfxapp
    
    1. Change into target/jfx/native/treehole/app/
    cd target/jfx/native/treehole/app/
    
    # Download the models with the provided commands
    wget https://github.com/litongjava/tools-ocr/releases/download/model-ppocr-v4/ch_PP-OCRv4_det_infer-onnx.zip
    wget https://github.com/litongjava/tools-ocr/releases/download/model-ppocr-v4/ch_PP-OCRv4_rec_infer-onnx.zip
    mkdir -p models/ch_PP-OCRv4_det_infer
    mkdir -p models/ch_PP-OCRv4_rec_infer
    unzip ch_PP-OCRv4_det_infer-onnx.zip -d models/ch_PP-OCRv4_det_infer
    unzip ch_PP-OCRv4_rec_infer-onnx.zip -d models/ch_PP-OCRv4_rec_infer
    
    1. Directory structure at this point
    ❯ tree -L 2
    .
    ├── lib
    │   ├── api-0.25.0.jar
    │   ├── basicdataset-0.25.0.jar
    │   ├── commons-compress-1.23.0.jar
    │   ├── commons-csv-1.10.0.jar
    │   ├── commons-logging-1.2.jar
    │   ├── fontbox-2.0.24.jar
    │   ├── gson-2.10.1.jar
    │   ├── hutool-all-5.8.11.jar
    │   ├── imgscalr-lib-4.2.jar
    │   ├── jna-5.13.0.jar
    │   ├── jnativehook-2.1.0.jar
    │   ├── logback-classic-1.2.3.jar
    │   ├── logback-core-1.2.3.jar
    │   ├── model-zoo-0.25.0.jar
    │   ├── onnxruntime-1.16.0.jar
    │   ├── onnxruntime-engine-0.25.0.jar
    │   ├── opencv-0.25.0.jar
    │   ├── opencv-4.7.0-0.jar
    │   ├── pdfbox-2.0.24.jar
    │   ├── pytorch-engine-0.25.0.jar
    │   └── slf4j-api-1.7.25.jar
    ├── models
    │   ├── ch_PP-OCRv4_det_infer
    │   └── ch_PP-OCRv4_rec_infer
    ├── tools-ocr-2.2.9-jfx.jar
    └── treehole.cfg
    
  1. Run tools-ocr-2.2.9-jfx.jar
Java output:

    ❯ java -jar tools-ocr-2.2.9-jfx.jar
    2024-01-03 00:24:03.259 [JavaFX-Launcher] WARN SimpleRepository.getMetadata:212 - Simple repository pointing to a non-archive file.
    Loading: 100% |████████████████████████████████████████|
    2024-01-03 00:24:03.480 [JavaFX-Launcher] WARN LibUtils.downloadPyTorch:444 - No matching cuda flavor for linux-x86_64 found: cu117.
    2024-01-03 00:24:03.817 [JavaFX-Launcher] INFO PtEngine.newInstance:67 - PyTorch graph executor optimizer is enabled, this may impact your inference latency and throughput. See: https://docs.djl.ai/docs/development/inference_performance_optimization.html#graph-executor-optimization
    2024-01-03 00:24:03.820 [JavaFX-Launcher] INFO PtEngine.newInstance:72 - Number of inter-op threads is 8
    2024-01-03 00:24:03.820 [JavaFX-Launcher] INFO PtEngine.newInstance:73 - Number of intra-op threads is 8
    2024-01-03 00:24:03.938 [JavaFX-Launcher] WARN SimpleRepository.getMetadata:212 - Simple repository pointing to a non-archive file.
    Loading: 100% |████████████████████████████████████████|
    2024-01-03 00:24:04.076 [JavaFX Application Thread] INFO MainForm.init:58 - primaryStage:javafx.stage.Stage@1c648f55
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00007fa4b4897ef4, pid=474539, tid=0x00007fa47fb20640
    #
    # JRE version: Java(TM) SE Runtime Environment (8.0_351-b10) (build 1.8.0_351-b10)
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.351-b10 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C [libc.so.6+0x97ef4] pthread_mutex_lock+0x4
    #
    # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
    #
    # An error report file with more information is saved as:
    # /home/saniter/my_project/tools-ocr/target/jfx/native/treehole/app/hs_err_pid474539.log
    #
    # If you would like to submit a bug report, please visit:
    # http://bugreport.java.com/bugreport/crash.jsp
    # The crash happened outside the Java Virtual Machine in native code.
    # See problematic frame for where to report the bug.
    #
    [1] 474539 IOT instruction (core dumped) java -jar tools-ocr-2.2.9-jfx.jar

  1. Contents of hs_err_pid474539.log: logs
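Since the crash is in native code, the most useful parts of an hs_err log are usually the problematic frame and the Java frames that led into it. The following is a minimal sketch of pulling those out; `crash_summary` is a hypothetical helper name, and the section headings it searches for are the ones HotSpot normally writes, which may be absent if the faulting thread was not a Java thread.

```shell
# Hypothetical helper: print the crash summary lines from a HotSpot hs_err log.
crash_summary() {
  log_file="$1"
  # "Problematic frame" names the native library and symbol where the fault occurred.
  grep -A 1 '^# Problematic frame' "$log_file"
  # The "Java frames" section, when present, shows which Java call entered native code.
  grep -A 10 '^Java frames' "$log_file" || true
}

# Optional: allow a core file to be written before reproducing the crash,
# as the JVM message suggests.
# ulimit -c unlimited
# java -jar tools-ocr-2.2.9-jfx.jar

# Usage, assuming the log from this report is in the current directory:
# crash_summary hs_err_pid474539.log
```

If the Java frames point into a particular library's classes, that library's native component is a reasonable first suspect.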

Did I get any of these steps wrong? Any guidance would be much appreciated!
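One detail in the build step is worth checking, although it may be unrelated to the crash. In bash and zsh an unquoted backslash escapes the next character, so `target\jfx\app` is read as `targetjfxapp`, which matches the extra `targetjfxapp` directory in the tree above. The sketch below reproduces this in a scratch directory and shows the forward-slash form; the empty `models` directory is only a stand-in.

```shell
# Demonstration in a scratch directory; the paths mirror the README commands.
cd "$(mktemp -d)"
mkdir models

# As written in the README (Windows-style separators): the shell drops the
# backslashes, so this creates one directory named "targetjfxapp".
mkdir target\jfx\app
cp -r models target\jfx\app

# Linux form: forward slashes, with -p to create the parent directories.
mkdir -p target/jfx/app
cp -r models target/jfx/app

# Both locations now contain a copy of models.
ls -d targetjfxapp/models target/jfx/app/models
```

With the backslash form, the models end up under `targetjfxapp/` rather than `target/jfx/app/`.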

nuclear06, Jan 02 '24 16:01

1. Please test with the latest version.
2. Use the Docker image referenced in this file: https://github.com/AnyListen/tools-ocr/blob/master/.github/workflows/build.yml

litongjava, Apr 28 '24 13:04

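TL;DR: Solved. The cause was a problem in upstream code, and I hope the author can update the code. While using the tool I also found a screenshot offset problem.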

Problems encountered

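  1. The new code uses the model ONNX_PPOCR_V4_SERVER, but the upstream project RapidOcr-Java rolled back that update: https://github.com/MyMonsterCat/RapidOcr-Java/issues/46. Based on that issue, I rebuilt rapidocr-onnx-models-1.2.2.jar and rapidocr-common-0.0.7.jar. PS: after building these manually, there is no need to download the model files to the specified path as build.yml does.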

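  2. After building the jars manually, the core dump still occurred. I suspected an upstream jnativehook problem: https://github.com/kwhat/jnativehook/issues/442. Suspecting a version issue, I tried different versions. After switching to version 2.0.2 the application runs normally; of the 4 versions in total, the build with 2.0.2 is the one that runs normally.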
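After changing the dependency version and rebuilding, it can help to confirm which jnativehook jar actually ended up in the packaged app, since the original listing contained jnativehook-2.1.0.jar. A minimal sketch follows; `bundled_jnativehook_version` is a hypothetical helper, and it assumes the jar keeps the `jnativehook-<version>.jar` naming seen in the `lib` directory above.

```shell
# Hypothetical helper: print the jnativehook version(s) found in a lib directory.
bundled_jnativehook_version() {
  lib_dir="${1:-lib}"
  for jar in "$lib_dir"/jnativehook-*.jar; do
    [ -e "$jar" ] || continue    # no match: the glob is left unexpanded
    name="${jar##*/}"            # strip the directory part
    name="${name#jnativehook-}"  # strip the artifact prefix
    printf '%s\n' "${name%.jar}" # strip the extension, leaving the version
  done
}

# Run from target/jfx/native/treehole/app/ after rebuilding.
bundled_jnativehook_version lib
```

If this still prints the old version after a rebuild, the packaged `lib` directory was probably not regenerated.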

Newly found problem

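When using screenshot OCR, the selection box does not match the actual content: it is shifted vertically downward. My guess is that this is caused by the global display scaling factor (screenshots: a, b).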

nuclear06, May 06 '24 12:05