pdf_translator
translate PDF files using GPT
PDF Translator
This repository offers a WebUI and an API endpoint that translate PDF files using OpenAI GPT while preserving the original layout.
Features
- Translate PDF files while preserving layout
- Translation engines:
  - Google Translate (default)
  - OpenAI (best)
- Layout recognition engines:
  - UniLM DiT
- OCR engines:
  - PaddleOCR
- Font recognition engines:
  - none yet
Installation
- Clone this repository
git clone https://github.com/ppisljar/pdf_translator.git
cd pdf_translator
- Edit config.yaml to use OpenAI: change type to 'openai' and enter your key under openai_api_key. If type is not changed, the translation engine defaults to Google Translate.
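The fallback described above can be sketched in a few lines of Python. This is only an illustration, not the code in server.py: the keys `type` and `openai_api_key` come from the step above, while the flat config layout, the engine name `google`, the helper name `pick_engine`, and the choice to fall back when the key is empty are all assumptions.

```python
from typing import Mapping


def pick_engine(config: Mapping[str, object]) -> str:
    """Return the translation engine implied by a parsed config.yaml.

    Hypothetical helper: assumes a flat mapping with the keys `type`
    and `openai_api_key`, and assumes the default engine is named 'google'.
    """
    engine = str(config.get("type", "google")).strip().lower()
    api_key = str(config.get("openai_api_key") or "").strip()

    # Use OpenAI only when it is selected AND a key is present
    # (treating an empty key as "not configured" is an assumption).
    if engine == "openai" and api_key:
        return "openai"

    # Anything else falls back to the default engine.
    return "google"
```

For example, `pick_engine({"type": "openai", "openai_api_key": "sk-..."})` selects OpenAI, while an empty mapping selects the default engine.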
docker installation
- Build the docker image via Makefile
make build
- Run the docker container via Makefile
make run
venv installation
- create a venv and activate it
prerequisites:
- ffmpeg, ... possibly more; check the Dockerfile if you run into issues
python3 -m venv .
source bin/activate
- install requirements
pip3 install -r requirements.txt
pip3 install "git+https://github.com/facebookresearch/detectron2.git"
- get models
make get_models
- run
python3 server.py
GUI Usage
Access the GUI via a browser:
http://localhost:8765
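If the page does not load, a quick script can tell you whether anything is answering on that address. This is a minimal sketch that uses only the Python standard library and requests the root URL. The helper name `gui_is_reachable` is hypothetical, and the check makes no assumption about the API routes the server exposes.

```python
import urllib.error
import urllib.request


def gui_is_reachable(url: str = "http://localhost:8765", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url` (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: nothing is listening.
        return False
```

Calling `gui_is_reachable()` after `make run` or `python3 server.py` returns False until the server has finished starting.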
Requirements
- NVIDIA GPU (currently the only supported hardware)
- Docker
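Before building, it can help to confirm that the required tools are on your PATH. This sketch assumes the NVIDIA driver utility `nvidia-smi` is an acceptable stand-in for "an NVIDIA GPU is present". It does not verify that Docker containers can actually access the GPU. The helper name `missing_tools` is hypothetical.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names in `tools` that cannot be found on PATH."""
    # shutil.which returns None when an executable is not found.
    return [name for name in tools if which(name) is None]
```

For example, `missing_tools(["docker", "nvidia-smi"])` returns an empty list when both commands are installed. For the venv installation, add `"ffmpeg"` to the list.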
License
This repository does not allow commercial use.
This repository is licensed under CC BY-NC 4.0. See LICENSE for more information.
TODOs
- [ ] Make it possible to highlight the translated text
- [ ] Support M1 Mac or CPU
- [ ] switch to VGT for layout detection
- [ ] add font detection (family/style/color/size/alignment)
- [ ] add support for translating lists
- [ ] add support for translating tables
- [ ] add support for translating text within images
References
- Based on https://github.com/discus0434/pdf-translator
- For PDF layout analysis, using DiT.
- For PDF to text conversion, using PaddlePaddle model.