Is it possible to export the original manga text recognized by OCR?

Open longfei796 opened this issue 1 year ago • 17 comments

The export only contains the translated text.

longfei796 avatar Sep 30 '23 08:09 longfei796

The software itself can't do this, but it can be done with a script. See here for details: https://github.com/dmMaze/BallonsTranslator/issues/197

lhj5426 avatar Sep 30 '23 10:09 lhj5426

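Do you mean a Word document? I can add that. If you need some other format, the project file imgtrans_xxx.json already contains the original text; you can write a script to extract it yourself or ask ChatGPT to help you.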

dmMaze avatar Oct 03 '23 14:10 dmMaze

Ideally it would export an Excel spreadsheet, haha: first column the original text, second column the OCR text.
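
A spreadsheet-style export can be sketched in a few lines of Python. This is only an illustration, not part of BallonsTranslator: it assumes the project JSON has a top-level "pages" mapping from image file names to lists of text blocks, each with a "text" list and a "translation" string, which is the layout the script posted later in this thread relies on. The function name `export_pairs_to_csv` and the column names are made up for this example, and pairing original text with translation is one plausible reading of the request. It writes CSV rather than a real .xlsx file, since CSV opens in Excel and needs no third-party package.

```python
import csv
import json
import os
import tempfile


def export_pairs_to_csv(json_path, csv_path):
    """Write one CSV row per text block: page, index, original, translation."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    rows = []
    # Assumed layout: data["pages"] maps an image name to a list of blocks.
    for page, blocks in data["pages"].items():
        for idx, block in enumerate(blocks, start=1):
            text = block.get("text") or []
            # "text" is assumed to be a list of lines; tolerate a plain string.
            original = text if isinstance(text, str) else "\n".join(text)
            translation = block.get("translation") or ""
            rows.append([page, idx, original, translation])

    # utf-8-sig writes a BOM, which helps Excel detect the encoding.
    with open(csv_path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["page", "index", "original", "translation"])
        writer.writerows(rows)
    return len(rows)


# Demonstration on a tiny made-up project file in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_json = os.path.join(tmp, "imgtrans_demo.json")
    demo_csv = os.path.join(tmp, "imgtrans_demo.csv")
    sample = {"pages": {"001.png": [{"text": ["line one", "line two"],
                                     "translation": "translated"}]}}
    with open(demo_json, "w", encoding="utf-8") as f:
        json.dump(sample, f, ensure_ascii=False)
    print(export_pairs_to_csv(demo_json, demo_csv), "row(s) written")
```

Unlike the script later in the thread, which keeps only the first entry of "text", this sketch joins every entry with a newline; that is a choice made for the example.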

lhj5426 avatar Oct 03 '23 15:10 lhj5426

If I haven't screwed up anywhere, here goes:

import json
import os

image_extensions = ['.jpg', '.jpeg', '.png', '.gif', '.bmp', '.webp', '.tiff']

def clean_path(path):
    return path.strip('"')

def is_image_file(filename):
    _, extension = os.path.splitext(filename)
    return extension.lower() in image_extensions

def parse_translation_file_without_original(translation_file_content):
    translations = {}
    current_page = None
    for line in translation_file_content.splitlines():
        line = line.strip()
        if is_image_file(line):
            current_page = line
            translations[current_page] = []
            continue
        if current_page and ". " in line:
            _, translation_text = line.split(". ", 1)
            translations[current_page].append(translation_text)
    return translations

def export_text_from_json(input_file_path):
    input_file_path = clean_path(input_file_path)
    with open(input_file_path, 'r', encoding='utf-8') as json_file:
        data = json.load(json_file)
    result_texts = []
    for page, items in data["pages"].items():
        if is_image_file(page):
            result_texts.append(page)
            for idx, item in enumerate(items, start=1):
                text_content = item["text"][0] if item["text"] else ""
                result_texts.append(f"{idx}. {text_content}")
            result_texts.append("")
    # Build the output name from the file stem so that only the extension changes
    output_file_path = os.path.splitext(input_file_path)[0] + "_exported.txt"
    with open(output_file_path, 'w', encoding='utf-8') as output_file:
        output_file.write('\n'.join(result_texts))
    return f"Text exported successfully to {output_file_path}"

def update_json_with_translation_without_original(json_file_path, translations):
    with open(json_file_path, 'r', encoding='utf-8') as json_file:
        data = json.load(json_file)
    for page, page_translations in translations.items():
        if page in data["pages"]:
            for idx, text_data in enumerate(data["pages"][page]):
                if idx < len(page_translations):
                    text_data["translation"] = page_translations[idx]
                    # Clearing the "rich_text"
                    text_data["rich_text"] = ""
    with open(json_file_path, 'w', encoding='utf-8') as json_file:
        json.dump(data, json_file, ensure_ascii=False, indent=4)
    return "JSON file updated successfully."

def export_translation_from_json(input_file_path):
    input_file_path = clean_path(input_file_path)
    with open(input_file_path, 'r', encoding='utf-8') as json_file:
        data = json.load(json_file)
    result_texts = []
    for page, items in data["pages"].items():
        if is_image_file(page):
            result_texts.append(page)
            for idx, item in enumerate(items, start=1):
                text_content = item["translation"] if "translation" in item and item["translation"] else ""
                result_texts.append(f"{idx}. {text_content}")
            result_texts.append("")
    # Build the output name from the file stem so that only the extension changes
    output_file_path = os.path.splitext(input_file_path)[0] + "_translation_exported.txt"
    with open(output_file_path, 'w', encoding='utf-8') as output_file:
        output_file.write('\n'.join(result_texts))
    return f"Translated text exported successfully to {output_file_path}"


def main():
    import sys
    operation = sys.argv[1]
    if operation == "export":
        json_path = sys.argv[2]
        print(export_text_from_json(json_path))
    elif operation == "export_translation":
        json_path = sys.argv[2]
        print(export_translation_from_json(json_path))
    elif operation == "import":
        translation_path = sys.argv[2]
        json_path = sys.argv[3]
        with open(translation_path, 'r', encoding='utf-8') as f:
            translation_content = f.read()
        translations = parse_translation_file_without_original(translation_content)
        print(update_json_with_translation_without_original(json_path, translations))
    else:
        print(f"Unknown operation: {operation}")


if __name__ == "__main__":
    main()

Batch launcher (it looks for the Python script above as extract_texts.py in its own folder):

@echo off
cd /d "%~dp0"

:: Set the path to the Python script
set script_path=%~dp0extract_texts.py

:: Check the number of arguments
if "%~1"=="" goto NoArguments
if "%~2"=="" goto SingleFile
if not "%~3"=="" goto end

:: Handle case with two files
set first_ext=%~x1
set second_ext=%~x2

if "%first_ext%"==".json" (
    if "%second_ext%"==".txt" (
        python "%script_path%" import "%~2" "%~1"
    )
) else if "%first_ext%"==".txt" (
    if "%second_ext%"==".json" (
        python "%script_path%" import "%~1" "%~2"
    )
)
goto end

:: Handle case with a single JSON file
:SingleFile
if "%~x1"==".json" (
    echo.
    echo Choose an export option:
    echo [1] Export Text
    echo [2] Export Translation
    echo.
    set /p choice="Enter your choice (1/2): "
)

if "%choice%"=="1" goto ExportSingle
if "%choice%"=="2" goto ExportTranslationSingle
goto end

:ExportSingle
echo python "%script_path%" export "%~1"
python "%script_path%" export "%~1"
goto end

:ExportTranslationSingle
echo python "%script_path%" export_translation "%~1"
python "%script_path%" export_translation "%~1"
goto end

:: Display choice menu when no arguments are provided
:NoArguments
echo.
echo Choose an option:
echo [1] Export Text
echo [2] Export Translation
echo [3] Import Text
echo.

set /p choice="Enter your choice (1/2/3): "
if "%choice%"=="1" goto ManualExport
if "%choice%"=="2" goto ManualExportTranslation
if "%choice%"=="3" goto ManualImport
goto end

:ManualExport
set /p json_path="Enter path to the JSON file: "
python "%script_path%" export %json_path%
goto end

:ManualExportTranslation
set /p json_path="Enter path to the JSON file: "
python "%script_path%" export_translation %json_path%
goto end

:ManualImport
set /p translation_path="Enter path to the translation text file: "
set /p json_path="Enter path to the JSON file: "
python "%script_path%" import %translation_path% %json_path%
goto end

:end
pause

bropines avatar Oct 03 '23 20:10 bropines

"How do I use the script you wrote? You didn't provide any instructions. Can you explain how to use it?"

lhj5426 avatar Oct 04 '23 02:10 lhj5426

Put the .bat file and the .py file in the same folder. Run the .bat file, choose the desired option, and enter the paths it asks for. Alternatively, drag a JSON file onto the .bat file and it will offer the export options. If you drag a JSON file and a text file onto it together, it imports the translation from the text file.
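
For anyone editing the exported text by hand before importing it again, it helps to see the file layout the import step expects: a line holding an image file name, followed by numbered lines of the form "N. text". The sketch below mirrors the parsing logic of the script above on an in-memory sample; the names `parse_translation_text` and `apply_translations` and the sample data are invented for this illustration, and it leaves out the "rich_text" reset that the original script performs.

```python
import os

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp", ".tiff"}


def parse_translation_text(content):
    """Group numbered lines under the most recent image-file-name line."""
    pages = {}
    current = None
    for line in content.splitlines():
        line = line.strip()
        if os.path.splitext(line)[1].lower() in IMAGE_EXTENSIONS:
            # A line that looks like an image file name starts a new page.
            current = line
            pages[current] = []
        elif current and ". " in line:
            # Drop the "N. " prefix and keep the rest as the translation.
            pages[current].append(line.split(". ", 1)[1])
    return pages


def apply_translations(data, pages):
    """Copy parsed lines into each block's "translation" field by position."""
    for page, lines in pages.items():
        for block, text in zip(data["pages"].get(page, []), lines):
            block["translation"] = text
    return data


sample_text = """001.png
1. Hello there.
2. Let's go!

002.png
1. Wait for me.
"""

# Made-up project data with the same shape the script above reads.
project = {"pages": {"001.png": [{"text": ["a"]}, {"text": ["b"]}],
                     "002.png": [{"text": ["c"]}]}}
apply_translations(project, parse_translation_text(sample_text))
print(project["pages"]["001.png"][1]["translation"])  # Let's go!
```

One thing to watch for: translations are matched to text blocks purely by their order on the page. A numbered line with nothing after the number (for example "3. ") is skipped once whitespace is stripped, so the lines after it shift up by one block. The sketch inherits this behaviour from the script above; writing a placeholder on such lines avoids the shift.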

bropines avatar Oct 04 '23 07:10 bropines

[two images attached]

lhj5426 avatar Oct 04 '23 12:10 lhj5426

你是指word文档吗, 那个我可以加 如果是什么别的格式要求, 那个项目文件imgtrans_xxx.json里面就有原文,你可以自己写个脚本提取一下或者让chatcpt帮你

If I haven't screwed up anywhere, here goes:

import json
import os

image_extensions = ['.jpg', '.jpeg', '.png', '.gif', '.bmp', '.webp', '.tiff']

def clean_path(path):
    return path.strip('"')

def is_image_file(filename):
    _, extension = os.path.splitext(filename)
    return extension.lower() in image_extensions

def parse_translation_file_without_original(translation_file_content):
    translations = {}
    current_page = None
    for line in translation_file_content.splitlines():
        line = line.strip()
        if is_image_file(line):
            current_page = line
            translations[current_page] = []
            continue
        if current_page and ". " in line:
            _, translation_text = line.split(". ", 1)
            translations[current_page].append(translation_text)
    return translations

def export_text_from_json(input_file_path):
    input_file_path = clean_path(input_file_path)
    with open(input_file_path, 'r', encoding='utf-8') as json_file:
        data = json.load(json_file)
    result_texts = []
    for page, items in data["pages"].items():
        if is_image_file(page):
            result_texts.append(page)
            for idx, item in enumerate(items, start=1):
                text_content = item["text"][0] if item["text"] else ""
                result_texts.append(f"{idx}. {text_content}")
            result_texts.append("")
    output_file_path = input_file_path.replace(".json", "_exported.txt")
    with open(output_file_path, 'w', encoding='utf-8') as output_file:
        output_file.write('\n'.join(result_texts))
    return f"Text exported successfully to {output_file_path}"

def update_json_with_translation_without_original(json_file_path, translations):
    with open(json_file_path, 'r', encoding='utf-8') as json_file:
        data = json.load(json_file)
    for page, page_translations in translations.items():
        if page in data["pages"]:
            for idx, text_data in enumerate(data["pages"][page]):
                if idx < len(page_translations):
                    text_data["translation"] = page_translations[idx]
                    # Clearing the "rich_text"
                    text_data["rich_text"] = ""
    with open(json_file_path, 'w', encoding='utf-8') as json_file:
        json.dump(data, json_file, ensure_ascii=False, indent=4)
    return "JSON file updated successfully."

def export_translation_from_json(input_file_path):
    input_file_path = clean_path(input_file_path)
    with open(input_file_path, 'r', encoding='utf-8') as json_file:
        data = json.load(json_file)
    result_texts = []
    for page, items in data["pages"].items():
        if is_image_file(page):
            result_texts.append(page)
            for idx, item in enumerate(items, start=1):
                text_content = item["translation"] if "translation" in item and item["translation"] else ""
                result_texts.append(f"{idx}. {text_content}")
            result_texts.append("")
    output_file_path = input_file_path.replace(".json", "_translation_exported.txt")
    with open(output_file_path, 'w', encoding='utf-8') as output_file:
        output_file.write('\n'.join(result_texts))
    return f"Translated text exported successfully to {output_file_path}"


def main():
    import sys
    operation = sys.argv[1]
    if operation == "export":
        json_path = sys.argv[2]
        print(export_text_from_json(json_path))
    elif operation == "export_translation":
        json_path = sys.argv[2]
        print(export_translation_from_json(json_path))
    elif operation == "import":
        translation_path = sys.argv[2]
        json_path = sys.argv[3]
        with open(translation_path, 'r', encoding='utf-8') as f:
            translation_content = f.read()
        translations = parse_translation_file_without_original(translation_content)
        print(update_json_with_translation_without_original(json_path, translations))
    else:
        print(f"Unknown operation: {{operation}}")


if __name__ == "__main__":
    main()
@echo off
cd %~dp0

:: Set the path to the Python script
set script_path=%~dp0extract_texts.py

:: Check the number of arguments
if "%~1"=="" goto NoArguments
if "%~2"=="" goto SingleFile
if not "%~3"=="" goto end

:: Handle case with two files
set first_ext=%~x1
set second_ext=%~x2

if "%first_ext%"==".json" (
    if "%second_ext%"==".txt" (
        python "%script_path%" import "%~2" "%~1"
    )
) else if "%first_ext%"==".txt" (
    if "%second_ext%"==".json" (
        python "%script_path%" import "%~1" "%~2"
    )
)
goto end

:: Handle case with a single JSON file
:SingleFile
if "%~x1"==".json" (
    echo.
    echo Choose an export option:
    echo [1] Export Text
    echo [2] Export Translation
    echo.
    set /p choice="Enter your choice (1/2): "
)

if "%choice%"=="1" goto ExportSingle
if "%choice%"=="2" goto ExportTranslationSingle
goto end

:ExportSingle
echo python "%script_path%" export "%~1"
python "%script_path%" export "%~1"
goto end

:ExportTranslationSingle
echo python "%script_path%" export_translation "%~1"
python "%script_path%" export_translation "%~1"
goto end

:: Display choice menu when no arguments are provided
:NoArguments
echo.
echo Choose an option:
echo [1] Export Text
echo [2] Export Translation
echo [3] Import Text
echo.

set /p choice="Enter your choice (1/2/3): "
if "%choice%"=="1" goto ManualExport
if "%choice%"=="2" goto ManualExportTranslation
if "%choice%"=="3" goto ManualImport
goto end

:ManualExport
set /p json_path="Enter path to the JSON file: "
python "%script_path%" export %json_path%
goto end

:ManualExportTranslation
set /p json_path="Enter path to the JSON file: "
python "%script_path%" export_translation %json_path%
goto end

:ManualImport
set /p translation_path="Enter path to the translation text file: "
set /p json_path="Enter path to the JSON file: "
python "%script_path%" import %translation_path% %json_path%
goto end

:end
pause

"How do I use the script you wrote? You didn't provide any instructions. Can you explain how to use it?"

Put the .bat file and the .py file in the same folder. Run the .bat file, choose the operation you want, and pass the necessary arguments. Or you can drag a JSON file onto the .bat script and it will offer to export. If you drop a JSON file and a text file onto it together, it imports the translation from the text file.
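
For anyone preparing a text file for the import step, here is a small sketch of the layout the parser seems to expect: a line with the page's image filename, then numbered lines of the form `N. text`. The function below mirrors the parsing logic from the script above; the name `parse_translation_text` and the sample pages are made up for illustration.

```python
import os

IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif', '.bmp', '.webp', '.tiff'}


def is_image_file(name):
    """Return True if the line looks like an image filename."""
    return os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS


def parse_translation_text(content):
    """Group 'N. text' lines under the image filename line that precedes them."""
    translations = {}
    current_page = None
    for line in content.splitlines():
        line = line.strip()
        if is_image_file(line):
            # A filename line starts a new page.
            current_page = line
            translations[current_page] = []
        elif current_page and ". " in line:
            # Drop the leading number, keep everything after the first ". ".
            translations[current_page].append(line.split(". ", 1)[1])
    return translations


# Hypothetical sample in the same layout the export step writes.
sample = """001.jpg
1. Hello there.
2. Where are you going?

002.png
1. Home.
"""

parsed = parse_translation_text(sample)
print(parsed)
```

Translations are matched to text blocks by position on each page, so the order of the numbered lines matters more than the numbers themselves.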


https://github.com/dmMaze/BallonsTranslator/assets/57861007/6d5e0c5f-ef61-4b98-8055-d19a28bdb434

bropines avatar Oct 04 '23 12:10 bropines

https://github.com/dmMaze/BallonsTranslator/assets/30548000/1d157077-50e8-45bb-92ea-afa3314fe7b6

T T

lhj5426 avatar Oct 04 '23 14:10 lhj5426

image extract_text.py.py

-_-

bropines avatar Oct 04 '23 15:10 bropines

"How should I name it?"

lhj5426 avatar Oct 04 '23 17:10 lhj5426

extract_texts.py

bropines avatar Oct 04 '23 17:10 bropines

YES YES. Hahahaha, I didn't even notice the duplicate file extension .py.py. Thank you very much.

lhj5426 avatar Oct 04 '23 17:10 lhj5426

I can package everything into a program with a simple GUI to make things easier.

bropines avatar Oct 05 '23 09:10 bropines

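Do you mean a Word document? I can add that. If you need some other format, the project file imgtrans_xxx.json already contains the source text; you can write a script to extract it yourself, or have ChatGPT help you.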

You overestimate me. Even the usage instructions you wrote on GitHub leave me completely lost. I'm a layman among laymen.
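
For readers in the same position, here is a minimal sketch of the "extract it yourself" idea, assuming the project file has the layout the script earlier in this thread relies on: a top-level `pages` mapping from image filename to a list of blocks, each with a `text` list and a `translation` string. The helper names and the demo data are hypothetical, and real project files may contain more fields.

```python
import csv
import io
import json


def project_to_rows(project):
    """Flatten a project dict into (page, index, source, translation) rows."""
    rows = []
    for page, blocks in project.get("pages", {}).items():
        for index, block in enumerate(blocks, start=1):
            text = block.get("text") or ""
            # Assumption: "text" is a list of lines; join them into one string.
            if isinstance(text, list):
                text = " ".join(text)
            rows.append((page, index, text, block.get("translation") or ""))
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["page", "index", "source", "translation"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical stand-in for the contents of an imgtrans_xxx.json file.
demo_project = json.loads(
    '{"pages": {"001.jpg": ['
    '{"text": ["Hello"], "translation": "Bonjour"}, '
    '{"text": [], "translation": ""}]}}'
)

rows = project_to_rows(demo_project)
print(rows_to_csv(rows))
```

To use it on a real project, load the file with `json.load(open(path, encoding="utf-8"))` in place of `demo_project` and write the CSV text to a file.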

longfei796 avatar Oct 26 '23 16:10 longfei796

bandicam.2023-10-04.22-25-07-530.mp4 T T

Is this Directory Opus with custom script buttons?

gsxgmpkt9020 avatar Nov 14 '23 07:11 gsxgmpkt9020

Yes, it's Directory Opus, a third-party file manager and a great tool for organizing files.

lhj5426 avatar Nov 14 '23 08:11 lhj5426