
Space added after italicized words

Open TheMetalCenter opened this issue 5 years ago • 5 comments

Most of the time, but not always, a space is inserted after an italicized word that is followed by a comma.

Examples:
“I cannot mean otherwise , Elvera.”
Abigail , she told herself
“ Enough ,” (this one looks like it may have added a space before the word too)

Thanks for this awesome work!

TheMetalCenter avatar Apr 17 '19 04:04 TheMetalCenter

I've only ever noticed that on Practical Guide (which, indeed, this quote is referencing). It's something to do with how the whitespace is getting normalized, I think, but I haven't tracked down exactly what it is.

The relevant source, for later debugging reference: &#8220;I cannot mean <em>otherwise</em>, Elvera.&#8221; <em>Look calm, Abigail</em>, she told herself. <p>&#8220;<em>Enough</em>,&#8221; Razin Tanja hissed.</p>

kemayo avatar Apr 17 '19 04:04 kemayo

I actually just downloaded the new version (mine was a year old or so) and the problem no longer occurs. So it looks like you already fixed it!

Thanks again

TheMetalCenter avatar Apr 17 '19 04:04 TheMetalCenter

I was mistaken; this issue still occurs often. Reopening for awareness, although it isn't a huge issue.

TheMetalCenter avatar Apr 24 '20 01:04 TheMetalCenter

I poked around a little on this and compared examples from the website's HTML (before) with the generated ebook (after). The extra space occurs when the </em> is directly before a punctuation mark (commas, periods, question marks, semicolons, quotation marks, etc.).

EDIT: It also occurs if punctuation is directly before <em>, but in English this is mostly restricted to quotation marks or apostrophes.

Example from Practical Guide to Evil, Chapter 28: Win Condition

In the HTML of the website: Gods of my mother, take this offering and <em>grant me the wrath of Heaven</em>.”

In epub: Gods of my mother, take this offering and <em> grant me the wrath of Heaven </em> .”

A temporary fix is to open the book in calibre's Edit Book function, search for </em> . and replace it with </em>. and then repeat with the other marks (, ; ! ? " :). Such a search and replace could probably be implemented directly in the package, although there is a small chance it could break something in other books in some cases.

GitHub converted the line breaks into spaces in the code above (calibre's Ctrl+F search and replace handles spaces and line breaks differently), so here is a picture for clarity: [image]
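For anyone who would rather script it, the search and replace described above can be sketched in Python with the standard `re` module. This is only a minimal sketch, not Leech's actual code: the function name `tighten_em_punctuation` and the exact punctuation lists are assumptions of mine. It matches any whitespace (spaces or line breaks), and it deliberately leaves straight quotes out, since a straight " or ' after </em> may be opening a new quotation. It also does not touch whitespace inside the <em> tags.

```python
import re

# Whitespace (spaces or line breaks) between </em> and a following mark.
# Assumed list: the marks named above, with curly closing quotes in place
# of straight ones, which could just as well be opening a new quotation.
AFTER_EM = re.compile(r"</em>\s+(?=[,.;:!?\u201d\u2019])")

# Whitespace between a curly opening quote and a following <em>.
BEFORE_EM = re.compile(r"(?<=[\u201c\u2018])\s+(?=<em>)")


def tighten_em_punctuation(html: str) -> str:
    """Remove stray whitespace between <em> tags and adjacent punctuation."""
    html = AFTER_EM.sub("</em>", html)
    return BEFORE_EM.sub("", html)


if __name__ == "__main__":
    sample = "\u201cI cannot mean <em>otherwise</em>\n, Elvera.\u201d"
    print(tighten_em_punctuation(sample))
```

Ordinary spaces between an italicized word and the next word are left alone, because the pattern only fires when a listed punctuation mark follows the whitespace.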

TheMetalCenter avatar May 17 '20 18:05 TheMetalCenter

I devised an easier fix using regular expressions in calibre, for anyone else who is interested in fixing this in their own copies (some ebooks have hundreds of instances, so it can be an annoying issue).

You can either (1) edit the epub directly, or (2) have the changes applied during conversion (recommended, works with single or batch conversion).

(1) For direct editing, calibre allows you to import a JSON file for batch search/replace functions. I have attached my JSON below, which covers various punctuation marks. This feature is accessed in Edit Book > Search > Saved Search > Import

(2) I recommend applying this during conversion instead, so the original epub stays intact if anything gets messed up. Calibre allows you to load CSR files (fancy txt files) for batch search/replace. I have attached my code for this below.
This option is found in Convert books > choose single or bulk > Search & replace > Load

The files below have the wrong extension and probably need to be renamed before use (.json or .csr):

calibreleechsearchesconversion.txt
calibreleechsearches.txt
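To see how many instances a given epub has, before or after applying the searches, something like the following could be used. It is a rough sketch using only the Python standard library; the function name `count_em_gaps` is hypothetical. It assumes the book's text lives in .xhtml/.html/.htm files inside the epub (which is a zip archive), and it only counts whitespace between </em> and the marks , . ; : ! ?

```python
import re
import zipfile

# </em>, then whitespace (space or line break), then a punctuation mark.
GAP = re.compile(r"</em>\s+[,.;:!?]")


def count_em_gaps(epub_path):
    """Return {file name inside the epub: number of suspicious gaps}."""
    counts = {}
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            # Assumption: chapter text is stored in (X)HTML files.
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            text = book.read(name).decode("utf-8", errors="replace")
            hits = len(GAP.findall(text))
            if hits:
                counts[name] = hits
    return counts
```

Calling `count_em_gaps("book.epub")` (with your own file path) returns an empty dict when no gaps are found, which is a quick way to confirm the replacement worked.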

TheMetalCenter avatar May 18 '20 02:05 TheMetalCenter