Can you translate the EPUB directly? Suggestions needed
As usual, I was downloading an EPUB file from a Chinese-language website. After downloading it, I normally convert it to a DOCX file, translate that to English, and then convert it back to EPUB. As you can see, this is extremely inefficient. I don't know how you all handle this, so I would like some suggestions.
After seeing issue #1654, I did some research (not much, and only as a layman) and found this line of code:
extractLanguage() { return "zh"; }
As you can see, extractLanguage specifies the language ("zh" for Chinese). So I was thinking it would be great if we could somehow translate the text to English. I did some more research (using GPT) and got my answer: yes, it can be translated to English with some processing. I ended up with the code below, although I don't know if it's feasible, so it's up to you to check it out and satisfy our curiosity.
"use strict";
const fs = require('fs');
const Epub = require('epub-gen'); // Install with: npm install epub-gen

parserFactory.register("novel543.com", () => new Novel543Parser());

class Novel543Parser extends Parser {
    constructor() {
        super();
    }

    async getChapterUrls(dom) {
        let tocUrl = dom.baseURI + "dir";
        let nextDom = (await HttpClient.wrapFetch(tocUrl)).responseXML;
        let menu = nextDom.querySelector("div.chaplist ul:nth-of-type(2)");
        return util.hyperlinksToChapterList(menu);
    }

    findContent(dom) {
        return dom.querySelector("div.chapter-content");
    }

    extractTitleImpl(dom) {
        return dom.querySelector("h1.title");
    }

    extractAuthor(dom) {
        let authorLabel = dom.querySelector("span.author");
        return authorLabel?.textContent ?? super.extractAuthor(dom);
    }

    extractLanguage() {
        return "zh";
    }

    findCoverImageUrl(dom) {
        return util.getFirstImgSrc(dom, "div.cover");
    }

    async fetchChapter(url) {
        const content = await super.fetchChapter(url); // Fetch the original content
        const translatedContent = await this.translateText(content); // Translate to English
        return translatedContent;
    }

    moreChapterTextUrl(dom) {
        let has2underscores = (s) => ((s.match(/_/g) || []).length === 2);
        let nextUrl = [...dom.querySelectorAll(".foot-nav a")].pop();
        return ((nextUrl != null) && has2underscores(nextUrl.href))
            ? nextUrl.href
            : null;
    }

    getInformationEpubItemChildNodes(dom) {
        return [...dom.querySelectorAll("div.intro")];
    }

    async translateText(text, targetLanguage = 'en') {
        const apiUrl = 'https://libretranslate.com/translate'; // Public LibreTranslate instance
        const payload = {
            q: text,
            source: 'zh', // Source language is always Chinese
            target: targetLanguage,
            format: 'text',
        };
        try {
            const response = await fetch(apiUrl, {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify(payload),
            });
            // Treat HTTP errors (rate limits, missing API key) as failures
            if (!response.ok) {
                throw new Error(`Translation request failed: HTTP ${response.status}`);
            }
            const data = await response.json();
            return data.translatedText; // Return the translated text
        } catch (error) {
            console.error('Error translating text:', error);
            return text; // Return the original text if translation fails
        }
    }

    async generateEpub(title, author, chapters, outputFilePath) {
        const options = {
            title: title,
            author: author,
            content: chapters.map((chapter, index) => ({
                title: `Chapter ${index + 1}`,
                data: chapter,
            })),
        };
        await new Epub(options, outputFilePath).promise;
        console.log(`EPUB file saved to ${outputFilePath}`);
    }

    async downloadAndTranslateNovel(outputFilePath) {
        const title = "Translated Novel"; // Replace with actual title
        const author = "Unknown Author"; // Replace with actual author
        const chapters = [];
        // Fetch and translate each chapter
        const chapterUrls = await this.getChapterUrls(/* pass DOM here */);
        // Each list entry is a chapter object; its URL is in sourceUrl
        for (const chapter of chapterUrls) {
            const translatedContent = await this.fetchChapter(chapter.sourceUrl);
            chapters.push(translatedContent);
        }
        // Generate EPUB
        await this.generateEpub(title, author, chapters, outputFilePath);
    }
}

// Usage
const parser = new Novel543Parser();
parser.downloadAndTranslateNovel('translated_novel.epub');
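One thing to check in the fetchChapter override above: as far as I can tell, super.fetchChapter resolves to a DOM document rather than a string, so passing it straight to translateText would not send the chapter text. A translation step would more likely need to pull the paragraph text out of the chapter first and send it in pieces small enough for the service to accept. Here is a minimal sketch of that batching step only. The names batchParagraphs and translateParagraphs, the translateBatch callback, and the 2000-character limit are all assumptions made for illustration; none of them come from WebToEpub or LibreTranslate.

```javascript
// Hypothetical helper (not part of WebToEpub): groups paragraphs into batches
// whose combined length stays at or under maxChars, so each batch fits in one request.
// A single paragraph longer than maxChars ends up alone in its own batch.
function batchParagraphs(paragraphs, maxChars = 2000) {
    const batches = [];
    let current = [];
    let size = 0;
    for (const para of paragraphs) {
        // Start a new batch when adding this paragraph would exceed the limit.
        if (current.length > 0 && size + para.length > maxChars) {
            batches.push(current);
            current = [];
            size = 0;
        }
        current.push(para);
        size += para.length;
    }
    if (current.length > 0) {
        batches.push(current);
    }
    return batches;
}

// Translates batch by batch and returns one flat array in the original order.
// translateBatch is an assumed async callback: string[] in, string[] out.
async function translateParagraphs(paragraphs, translateBatch, maxChars = 2000) {
    const translated = [];
    for (const batch of batchParagraphs(paragraphs, maxChars)) {
        translated.push(...await translateBatch(batch));
    }
    return translated;
}
```

Keeping the translation call behind a callback means the batching can be tried out without hitting any real service, and the service can be swapped later.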
I guess the problem is that a third-party site is used to translate the text, and we don't know what they do with your data. If something like this gets implemented, I would propose implementing it in https://github.com/dteviot/EpubEditor
@raven-celestia I would use a Calibre plugin to translate the EPUB: https://github.com/bookfere/Ebook-Translator-Calibre-Plugin/wiki/English#installation I haven't tested it.
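On the data concern: LibreTranslate can be self-hosted, so one possible way around the third-party issue is to make the endpoint configurable instead of hard-coding the public instance. Below is a rough sketch that only builds the request. buildTranslateRequest is a hypothetical name, and the default base URL assumes a local instance listening on port 5000; adjust it to wherever your own instance runs.

```javascript
// Hypothetical helper: builds the fetch() arguments for a LibreTranslate-style
// POST /translate call. baseUrl and apiKey are supplied by the user; the default
// baseUrl is an assumption (a locally hosted instance).
function buildTranslateRequest(text, options = {}) {
    const {
        baseUrl = "http://localhost:5000",
        apiKey = null,
        source = "zh",
        target = "en",
    } = options;
    const payload = { q: text, source, target, format: "text" };
    // Only include the key when one was provided.
    if (apiKey) {
        payload.api_key = apiKey;
    }
    return {
        // Strip trailing slashes so the path is not doubled up.
        url: baseUrl.replace(/\/+$/, "") + "/translate",
        options: {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify(payload),
        },
    };
}
```

The result can be passed straight to fetch, for example `const req = buildTranslateRequest(text, { baseUrl: myUrl });` followed by `await fetch(req.url, req.options)`.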
Not doing it in WebToEpub.