can you translate the epub directly? suggestion needed

Open raven-celestia opened this issue 10 months ago • 2 comments

As usual, I was downloading an EPUB from a Chinese website, in Chinese. After downloading, I usually convert it to a DOCX file, translate that to English, and then convert it back to EPUB. As you can see, this is extremely inefficient. I don't know how you all handle it, so I would like some suggestions.

After seeing issue #1654, I did some research (not much, as a layman) and found this line of code:

extractLanguage() { return "zh"; }

As you can see, extractLanguage specifies the language ("zh" for Chinese). So I was thinking it would be great if we could somehow translate the text to English. I did some more research (using GPT), and the answer seems to be yes: the text can be translated to English after some processing. I ended up with the code below. I don't know if it's feasible, so it's up to you to check it out and satisfy our curiosity.

"use strict";

const fs = require('fs');
const Epub = require('epub-gen'); // Install with: npm install epub-gen

parserFactory.register("novel543.com", () => new Novel543Parser());

class Novel543Parser extends Parser {
    constructor() {
        super();
    }

    async getChapterUrls(dom) {
        let tocUrl = dom.baseURI + "dir";
        let nextDom = (await HttpClient.wrapFetch(tocUrl)).responseXML;
        let menu = nextDom.querySelector("div.chaplist ul:nth-of-type(2)");
        return util.hyperlinksToChapterList(menu);
    }

    findContent(dom) {
        return dom.querySelector("div.chapter-content");
    }

    extractTitleImpl(dom) {
        return dom.querySelector("h1.title");
    }

    extractAuthor(dom) {
        let authorLabel = dom.querySelector("span.author");
        return authorLabel?.textContent ?? super.extractAuthor(dom);
    }

    extractLanguage() {
        return "zh";
    }

    findCoverImageUrl(dom) {
        return util.getFirstImgSrc(dom, "div.cover");
    }

    async fetchChapter(url) {
        const content = await super.fetchChapter(url); // Fetch the original content
        const translatedContent = await this.translateText(content); // Translate to English
        return translatedContent;
    }

    moreChapterTextUrl(dom) {
        let has2underscores = (s) => ((s.match(/_/g) || []).length === 2);
        let nextUrl = [...dom.querySelectorAll(".foot-nav a")].pop();
        return ((nextUrl != null) && has2underscores(nextUrl.href))
            ? nextUrl.href
            : null;
    }

    getInformationEpubItemChildNodes(dom) {
        return [...dom.querySelectorAll("div.intro")];
    }

    async translateText(text, targetLanguage = 'en') {
        const apiUrl = 'https://libretranslate.com/translate'; // Public LibreTranslate instance
        const payload = {
            q: text,
            source: 'zh', // Source language is always Chinese
            target: targetLanguage,
            format: 'text',
        };

        try {
            const response = await fetch(apiUrl, {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify(payload),
            });

            // fetch() only rejects on network errors, so check the HTTP status too
            if (!response.ok) {
                throw new Error(`Translation request failed: HTTP ${response.status}`);
            }

            const data = await response.json();
            return data.translatedText; // Return the translated text
        } catch (error) {
            console.error('Error translating text:', error);
            return text; // Return the original text if translation fails
        }
    }

    async generateEpub(title, author, chapters, outputFilePath) {
        const options = {
            title: title,
            author: author,
            content: chapters.map((chapter, index) => ({
                title: `Chapter ${index + 1}`,
                data: chapter,
            })),
        };

        await new Epub(options, outputFilePath).promise;
        console.log(`EPUB file saved to ${outputFilePath}`);
    }

    async downloadAndTranslateNovel(outputFilePath) {
        const title = "Translated Novel"; // Replace with actual title
        const author = "Unknown Author"; // Replace with actual author
        const chapters = [];

        // Fetch and translate each chapter
        const chapterUrls = await this.getChapterUrls(/* pass DOM here */);
        for (const url of chapterUrls) {
            const translatedContent = await this.fetchChapter(url);
            chapters.push(translatedContent);
        }

        // Generate EPUB
        await this.generateEpub(title, author, chapters, outputFilePath);
    }
}

// Usage
const parser = new Novel543Parser();
parser.downloadAndTranslateNovel('translated_novel.epub');
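
The translation step might be easier to check on its own, separate from the parser. The sketch below is only an assumption about how it could be structured, not WebToEpub code: batchParagraphs and translateParagraphs are hypothetical names, and translateBatch stands for whatever function actually calls the translation service. It groups paragraphs into small batches and keeps the original text for any batch that fails.

```javascript
"use strict";

// Group paragraphs into batches whose combined length does not exceed
// maxChars, so each request to the translation service stays small.
// A single paragraph longer than maxChars still gets a batch of its own.
function batchParagraphs(paragraphs, maxChars) {
    const batches = [];
    let current = [];
    let size = 0;
    for (const p of paragraphs) {
        if (current.length > 0 && size + p.length > maxChars) {
            batches.push(current);
            current = [];
            size = 0;
        }
        current.push(p);
        size += p.length;
    }
    if (current.length > 0) {
        batches.push(current);
    }
    return batches;
}

// Translate paragraphs batch by batch. translateBatch is a caller-supplied
// async function (hypothetical) that takes an array of strings and resolves
// to an array of translated strings of the same length.
// If a batch fails, its original text is kept so the chapter is not lost.
async function translateParagraphs(paragraphs, translateBatch, maxChars = 2000) {
    const result = [];
    for (const batch of batchParagraphs(paragraphs, maxChars)) {
        try {
            const translated = await translateBatch(batch);
            if (!Array.isArray(translated) || translated.length !== batch.length) {
                throw new Error("unexpected response shape");
            }
            result.push(...translated);
        } catch (error) {
            console.error("Translation failed, keeping original text:", error);
            result.push(...batch);
        }
    }
    return result;
}
```

Inside a parser, the paragraphs could come from something like [...content.querySelectorAll("p")].map(p => p.textContent), with each translated string written back to the matching element's textContent, so the chapter markup stays intact and only the text changes.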

raven-celestia, Feb 03 '25 11:02

I guess the problem is that a third-party site is used to translate the text, and we don't know what they do with your data. If something like this gets implemented, I would propose implementing it in https://github.com/dteviot/EpubEditor
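
One way to limit that exposure would be to make the translation endpoint configurable, so the text can go to a self-hosted instance rather than a public site. A minimal sketch, assuming a LibreTranslate-compatible server; the function names and the http://localhost:5000 address are illustrative assumptions, not part of WebToEpub or EpubEditor:

```javascript
"use strict";

// Build the URL and fetch options for a LibreTranslate-style /translate call.
// baseUrl is configurable so the request can go to a self-hosted instance
// instead of a public one.
function buildTranslateRequest(baseUrl, text, source = "zh", target = "en") {
    const base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
    const url = new URL("translate", base);
    return {
        url: url.href,
        options: {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            // Same payload fields as the code in the original post.
            body: JSON.stringify({ q: text, source, target, format: "text" }),
        },
    };
}

// Send the request and return the translated text; throws on an HTTP error
// so the caller can decide whether to keep the original text.
async function translateVia(baseUrl, text) {
    const { url, options } = buildTranslateRequest(baseUrl, text);
    const response = await fetch(url, options);
    if (!response.ok) {
        throw new Error(`Translation request failed: HTTP ${response.status}`);
    }
    return (await response.json()).translatedText;
}

// Example (assumed local address): translateVia("http://localhost:5000", "你好")
```

With this shape the chapter text is only sent to whatever server the user points it at.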

gamebeaker, Feb 03 '25 11:02

@raven-celestia I would use a Calibre plugin to translate the EPUB: https://github.com/bookfere/Ebook-Translator-Calibre-Plugin/wiki/English#installation. I haven't tested it.

gamebeaker, Feb 03 '25 13:02

Not doing it in WebToEpub.

dteviot, Feb 23 '25 01:02