HTML elements not rendered correctly in question statements

Open beniTrainor opened this issue 4 years ago • 5 comments

Please follow the template below to report your issue.

  • [x] Raw text
  • [ ] Error report provided by Anki
  • [x] Operating system
  • [x] Original file type

Please include the following information:

Any general information

Hi there! I want to be able to write the following question:

* How do you generate <li> elements with pug

But when I run org_to_anki the angle brackets ("<", ">") disappear and the line breaks into multiple lines, like so:

How do you generate
elements with pug

If instead I use &lt; and &gt; in the question statement it works fine. But the character sequences are fairly tedious to write. Is there any way around this?

I've also tried using backticks (`) to escape the characters, but that gives me the following:

How do you generate `
` elements with pug

Raw text of the file you tried to upload

* How do you generate <li> elements with pug

Error report from the popup

There's no error report from Anki.

What is your operating system

Ubuntu 16.04.

What was the original file type

Text file.

beniTrainor • Oct 04 '20 06:10

I've found a standard library function that does exactly that:

from html import escape
escape("<li>")

Output:

 '&lt;li&gt;'

I think it only works on certain versions of Python though, but I'm not completely sure. If you want more information, check out this StackOverflow question from which I found the answer.

Could this be included in org_to_anki to automatically encode HTML special characters?
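
For reference, a short sketch of how `html.escape` behaves. It has been in the standard library since Python 3.2. By default it also escapes both quote characters, which can be switched off with `quote=False`. The sample question string is made up for illustration.

```python
from html import escape, unescape

# Illustrative input only; any text with HTML special characters works.
question = 'How do you generate <li> elements with "pug" & friends'

# Default: &, <, > and both quote characters are escaped.
escaped = escape(question)
print(escaped)

# quote=False leaves " and ' untouched, which is enough for text that
# ends up in an element body rather than in an attribute value.
body_safe = escape(question, quote=False)
print(body_safe)

# unescape() reverses the encoding.
assert unescape(escaped) == question
```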

beniTrainor • Oct 04 '20 07:10

I've made a simple (maybe dirty) hack to get around this by modifying the _formatFile function located at src/org_to_anki/org_parser/parseData.py:

def _formatFile(filePath):# (filePath: str):

    from html import escape

    with open(filePath, mode="r", encoding="utf-8") as file:
        data = file.read().split('\n')

        # escape HTML special chars
        data = [escape(line) for line in data]

    return data

Now it works like a charm.

Let me know if you want me to create a PR or something to include this feature/fix. I haven't written tests or documentation though. It's just a little hack to get me going.

beniTrainor • Oct 04 '20 08:10

It turns out my previous answer created another problem. Now, the code blocks with < / > are escaped into &lt; / &gt;, respectively.
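
One plausible explanation, assuming code blocks are already escaped somewhere later in the pipeline: escaping every line of the file up front means those characters get encoded twice, so the entity itself shows up as text. A minimal illustration with plain `html.escape`, not org_to_anki's actual code path:

```python
from html import escape

once = escape("<li>")   # what a single pass produces
twice = escape(once)    # what a second pass does to the result

print(once)   # &lt;li&gt;
print(twice)  # &amp;lt;li&amp;gt;
```

A renderer decodes `&amp;lt;` back to the literal text `&lt;`, which would match what shows up in the code blocks.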

So, I've reverted the change made and added the following to src/org_to_anki/AnkiClasses/AnkiQuestionFactory.py:

class AnkiQuestionFactory:
    ...
    def buildQuestion(self):

        from html import escape

        ...

        for line in self.currentQuestions:
            line = self.utils.removeAsterisk(line)
            line = self.utils.formatLine(line)
+          # Escape HTML special characters
+          line = escape(line)
            line = self.utils.parseAnswerLine(line, self.filePath, newQuestion)
            newQuestion.addQuestion(line)
        ...

Now it does what I want.

I haven't really followed the project's conventions because I just wanted to get this to work. Maybe it would be better to move the changes into DeckBuilderUtils or somewhere else.

beniTrainor • Oct 05 '20 13:10

Sorry for the delay in getting back to you! Busy weekend.

I will have a look over it in the next few days; I would like to be able to support such functionality. I think your last comment with line = escape(line) looks like the right idea. There should be some way of encoding it so that it gets correctly escaped.

I will mark this as a bug and try to get a fix done this week.

c-okelly • Oct 05 '20 17:10

Ok, thanks! By the way, I've found another related issue. Even with this fix, if you write HTML characters in answer statements they don't get encoded correctly and end up breaking the lines.

Example file:

* Question <li>
** Answer <li>

Output (in Anki):

Question <li>
- Answer
- 

Note: the answer line breaks into two.

I think if you were to escape all text (questions and answers), except for code blocks (which already escape these characters) it would work.
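
A rough sketch of that idea, under some assumptions: `escape_outside_code_blocks` is a hypothetical helper (not part of org_to_anki), and code blocks are assumed to be delimited by org-mode `#+BEGIN_SRC` / `#+END_SRC` markers. Lines inside a block are passed through untouched so they are not escaped twice.

```python
from html import escape


def escape_outside_code_blocks(lines):
    """Escape HTML special characters in every line outside a code block.

    Hypothetical helper, not part of org_to_anki. Assumes code blocks are
    delimited by org-mode #+BEGIN_SRC / #+END_SRC markers.
    """
    result = []
    in_code_block = False
    for line in lines:
        marker = line.strip().upper()
        if marker.startswith("#+BEGIN_SRC"):
            in_code_block = True
            result.append(line)
        elif marker.startswith("#+END_SRC"):
            in_code_block = False
            result.append(line)
        elif in_code_block:
            # Left as-is; assumed to be escaped by the code block handling.
            result.append(line)
        else:
            result.append(escape(line, quote=False))
    return result


sample = [
    "* Question <li>",
    "** Answer <li>",
    "#+BEGIN_SRC html",
    "<li>item</li>",
    "#+END_SRC",
]
for line in escape_outside_code_blocks(sample):
    print(line)
```

Where such a step would belong in the project (for example next to `formatLine` in the utilities) is left open here.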

I don't know exactly what has to be done. I'll try to look into it if I have time this week.

beniTrainor • Oct 06 '20 08:10