Bitcoin sign (₿, U+20BF) doesn't render in PDF and some browsers

Open dhimmel opened this issue 7 years ago • 15 comments

As commented by @arielsvn in https://github.com/greenelab/scihub-manuscript/pull/51#issuecomment-362137397:

there seems to be an encoding issue with the bitcoin symbol on the Discussion section. I noticed it on the pdf, and the same happens with the markdown file, at least on my computer.

screenshot

This is likely because the Unicode character (₿, U+20BF) is a recent addition: it is part of Unicode 10.0, released June 2017. Note this release has other important symbols/emojis such as 🧟 (Zombie) and 🧖 (Person in Steamy Room).

For me, on Chrome on Ubuntu 17.10, the bitcoin sign renders in the HTML but not the PDF. I'm assuming the PDF gets a certain font embedded on Travis CI, which doesn't have the latest characters. Note that when I generate the PDF locally, the bitcoin signs do render.

So @arielsvn, I think we may want to look into the following solutions:

  • Updating the font used by the Travis CI build
  • Specifying a font to use that is up to date
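Either option comes down to whether the font in use maps U+20BF to a glyph. A rough sketch of how that could be checked, assuming the third-party fontTools package for reading real font files; `missing_codepoints` and `load_cmap` are helper names made up for illustration, the stand-in cmap is invented, and the font path in the final comment is hypothetical:

```python
def missing_codepoints(text, cmap):
    """Return the sorted code points in `text` that `cmap` has no glyph for."""
    return sorted({ord(ch) for ch in text if ord(ch) not in cmap})


def load_cmap(font_path):
    """Load a font's code point -> glyph name table (assumes fontTools)."""
    from fontTools.ttLib import TTFont  # third-party: pip install fonttools
    return TTFont(font_path).getBestCmap()


# Stand-in cmap for a font that predates Unicode 10.0: it covers
# printable ASCII and the euro sign, but not the bitcoin sign.
old_font_cmap = {cp: f"glyph{cp}" for cp in range(0x20, 0x7F)}
old_font_cmap[0x20AC] = "Euro"

missing = missing_codepoints("Donate 1 \u20bf or 5 \u20ac", old_font_cmap)
print([f"U+{cp:04X}" for cp in missing])  # ['U+20BF']

# With a real font file (hypothetical path):
# print(0x20BF in load_cmap("/path/to/font.ttf"))
```

Running the same check against the font available on the build machine and the one available locally would show whether the difference in PDF output comes from glyph coverage.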

@arielsvn you probably know best what to do here.

dhimmel avatar Feb 01 '18 11:02 dhimmel

Potentially relevant links:

  • http://unifoundry.com/unifont.html
  • https://github.com/eosrei/twemoji-color-font
  • https://askubuntu.com/a/984275
  • https://askubuntu.com/questions/689607
  • https://www.filamentgroup.com/lab/font-loading.html

dhimmel avatar Feb 01 '18 11:02 dhimmel

Thanks for the info @dhimmel, I'll look into it.

In my case, I was using Chrome on Ubuntu 16.04, and the bitcoin sign didn't render correctly in the HTML either.

arielsvn avatar Feb 01 '18 14:02 arielsvn

I'll look into it

Awesome. Do you think we should load a font or leave it up to viewers to have up-to-date fonts? For reference, the CSS we're using is github-pandoc.css.

dhimmel avatar Feb 01 '18 15:02 dhimmel

I think it is best to load a font. That way we would be certain that the generated document looks the same for anyone viewing it.

arielsvn avatar Feb 01 '18 15:02 arielsvn

I think it is best to load a font

Do you want to implement this? And if so, what's your timeline? I'm wondering whether we should wait to post to Sci-Hub Manuscript v3... I could always print to PDF as a workaround.

dhimmel avatar Feb 01 '18 15:02 dhimmel

Do you want to implement this? And if so, what's your timeline?

Sure! Let me give it a quick try now, and I'll let you know if I run into any issues that may require more time.

arielsvn avatar Feb 01 '18 15:02 arielsvn

@dhimmel do you have any preference for what font to use?

arielsvn avatar Feb 01 '18 16:02 arielsvn

Not really.

Preferably one that is openly licensed, although I don't know much about the interplay of fonts and copyright.

Perhaps something similar to what PeerJ uses (example)?

@agitter do you have any insight?

dhimmel avatar Feb 01 '18 17:02 dhimmel

PeerJ seems to be using Helvetica with fallback to Arial.

Another solution could be using the HTML entity ฿ for the bitcoin character. That way it gets rendered correctly, even on my computer where I haven't updated the font.

arielsvn avatar Feb 01 '18 18:02 arielsvn

do you have any insight?

Sorry, I have no ideas about the best font.

agitter avatar Feb 01 '18 20:02 agitter

Another solution could be using the HTML entity ฿ for the bitcoin character. That way it gets rendered correctly, even on my computer where I haven't updated the font.

I don't understand this? Why should how the character is represented in plain text affect whether your browser can display it?

dhimmel avatar Feb 02 '18 14:02 dhimmel

I don't understand this? Why should how the character is represented in plain text affect whether your browser can display it?

Me neither, I need to read more about it. On my computer I see they are rendered differently: (₿, U+20BF) vs (฿, ฿). My guess is that these are different characters.

₿ != ฿

Edit: Here's how I see these characters here on GitHub.

image

arielsvn avatar Feb 02 '18 15:02 arielsvn

₿ != ฿

Same behavior for me in both browsers I tested

agitter avatar Feb 02 '18 16:02 agitter

This is because ฿ encodes U+0E3F rather than U+20BF. ฿ is the Thai currency symbol Baht, not the bitcoin currency sign. ฿ has been around since Unicode 1.1.0 (June, 1993).
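The difference can be confirmed from the code points themselves. A small sketch using only the Python standard library; the entity strings `&#3647;` and `&#x20BF;` are assumptions for illustration, since the exact entity text pasted earlier in the thread is not preserved:

```python
import html
import unicodedata

# Look up the two characters by code point and print their Unicode names.
# The fallback string is used if the local Unicode database predates 10.0.
for ch in ("\u20bf", "\u0e3f"):
    name = unicodedata.name(ch, "<not in this Unicode database>")
    print(f"U+{ord(ch):04X}  {name}  {ch}")

# Numeric character references decode to whatever code point they name.
baht = html.unescape("&#3647;")      # decimal 3647 == hex 0E3F (assumed entity)
bitcoin = html.unescape("&#x20BF;")  # hex 20BF (assumed entity)

print(f"U+{ord(baht):04X}", f"U+{ord(bitcoin):04X}", baht == bitcoin)
```

So an entity only changes how the character is written in the source. It does not change which character the browser has to find a glyph for.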

dhimmel avatar Feb 02 '18 17:02 dhimmel

Oh, that explains the difference. I must have made a mistake when pasting the code ฿. I thought I had copied it from U+20BF, but I see now that U+20BF has a different hex code, ₿.

I'll try to look for a font that supports the latest Unicode characters and embed it with the build.

arielsvn avatar Feb 05 '18 22:02 arielsvn