serverless-chrome
Include Chinese/Japanese/Korean/more fonts in headless Chrome binary
From downstream issue in https://github.com/graphcool/chromeless/issues/43
Should include the relevant fonts for all scripts, not just kanji.
Resources:
- Use fontconfig? https://groups.google.com/a/chromium.org/forum/#!searchin/headless-dev/font/headless-dev/Pmnxb1lyDBg/DyNZYxMeBgAJ
- more hints in here? Browserless dockerfile
@adieuadieu we solved this for a subset of languages (CJK) in our lambda function that uses phantomjs today. We were able to do it by:
- packaging a .fonts directory with the TTF for NotoSansCJK-Regular.ttc and including that in our function upload zip.
- adding an environment variable in our lambda function console (on AWS) called HOME set to /var/task. This allows Qt to pick up the included font.
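For readers who prefer to set this in code rather than in the console, the two steps above might be sketched as follows. This is only a sketch under assumptions: the helper names `fontEnv` and `fontsDir` are hypothetical, and it assumes the browser is started as a child process that receives this environment. `LAMBDA_TASK_ROOT` is set by the Lambda runtime; `/var/task` is the value quoted above.

```javascript
const path = require('path');

// Hypothetical helper: copy the current environment and point HOME at the
// function's task root, so that a .fonts directory shipped in the zip ends up
// at $HOME/.fonts.
function fontEnv(baseEnv) {
  const source = baseEnv || process.env;
  const taskRoot = source.LAMBDA_TASK_ROOT || '/var/task';
  return Object.assign({}, source, { HOME: taskRoot });
}

// Hypothetical helper: where the bundled fonts are expected for a given env.
function fontsDir(env) {
  return path.posix.join(env.HOME, '.fonts');
}
```

The result of `fontEnv()` could then be passed as the `env` option of `child_process.spawn` when launching the browser binary.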
The liability here is that due to the 50MB size limit on Lambda function packages you have to choose which fonts to be included carefully. In our case the Noto Sans font solved our issues, but I'm sure other fonts will be needed for other purposes.
I'm relaying most of this second-hand because one of my colleagues did most of the work earlier today. I'm going to try digging in further on it to make sure I got that right but that might be a possible solution for this project as well. I can try taking a crack at it next week if that would help, but I better get acclimated with this project more before doing any work. Nice work on this BTW!
@toddwprice oh that's great news! I hope Chrome looks for fonts in the same place.
About the 50MB limit: if you deploy your Lambda function with the deployment package in S3, the package's size limit increases dramatically, technically to 250MB (realistically closer to 100MB when packaging less compressible data like executable binaries). Forgive me for linking to myself: I recently wrote more about it in this article.
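To see how much room a set of fonts takes before deciding which ones to ship, a small size check can help. The sketch below is an illustration only: `dirSize`, `fitsBudget` and `LIMITS` are hypothetical names, and the two limits are simply the figures quoted in this thread. It measures uncompressed bytes on disk, which is only a rough guide to the size of the zipped package.

```javascript
const fs = require('fs');
const path = require('path');

const MB = 1024 * 1024;
// Figures quoted in this thread; check the current Lambda limits before relying on them.
const LIMITS = { directUpload: 50 * MB, viaS3: 250 * MB };

// Recursively sum the sizes of all files below `dir` (symlinks are not followed).
function dirSize(dir) {
  return fs.readdirSync(dir).reduce((total, name) => {
    const full = path.join(dir, name);
    const stat = fs.lstatSync(full);
    return total + (stat.isDirectory() ? dirSize(full) : stat.size);
  }, 0);
}

// True when `bytes` is within the given limit.
function fitsBudget(bytes, limit) {
  return bytes <= limit;
}
```

For example, `fitsBudget(dirSize('.fonts'), LIMITS.directUpload)` gives a quick first answer on whether a font selection is plausible for a direct upload.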
@adieuadieu wow we were early adopters of Lambda but never questioned the 5B limit. Great article! I will see if I can include Noto Sans for starters and if that works then we could add other fonts to plug other common holes.
*50MB
@adieuadieu I'm trying to get going on the project but getting errors with some missing dependencies and files when running npm test. Let me know if you want me to post my errors here or ping you somewhere else. I'm using the develop branch by the way. Thanks.
@toddwprice Skip npm test. Which folder are you working in? packages/lambda may be the best place to play around in. There's a pseudo-integration test for Serverless there which you can use to deploy to Lambda. Run npm run build in packages/lambda, then create a symlink in packages/lambda/integration-test for a dist folder which points to the parent directory's dist folder (packages/lambda/dist).
My local setup (it's not so pretty..):
marco:integration-test marco$ pwd
/Users/marco/src/github/serverless-chrome/packages/lambda/integration-test
marco:integration-test marco$ ls -lhtra
total 260800
-rwxr-xr-x 1 marco 502 127M May 9 07:14 headless_shell
lrwxr-xr-x 1 marco 502 8B Jun 18 23:21 dist -> ../dist/
-rw-r--r-- 1 marco 502 463B Jul 10 18:48 serverless.yml
-rw-r--r-- 1 marco 502 789B Jul 10 18:48 handler.js
drwxr-xr-x 13 marco 502 442B Jul 10 18:48 ..
drwxr-xr-x 7 marco 502 238B Jul 10 18:48 .
@toddwprice Sure, would 18:30 CEST work? Could you DM me on Twitter or Gitter (@adieuadieu), or email (on my GitHub profile) so we can settle on a tool/service/share usernames to screen share?
Perfect. Sent you a message in gitter.
On Fri, Aug 4, 2017 at 9:53 AM, Marco Lüthy [email protected] wrote:
@toddwprice Sure, would 18:30 CEST work? Could you DM me on Twitter or Gitter (@adieuadieu), or email (on my GitHub profile) so we can settle on a tool/service to screen share?
--
Todd Price [email protected]
@adieuadieu I'm struggling to get a good test running in Lambda without adding too many other dependencies. My current approach is to use chrome-remote-interface directly inside the test handler. See this file: handler.js.zip.
Two problems so far:
- Chrome spins up fine the first time, but fails afterwards. The logic around recognizing a running instance when a container is re-used is either not working or I've configured it wrong.
- Screenshots are returning a blank page. I saw this behavior in the past when testing chrome --headless with chrome-remote-interface so it's likely something I'm doing wrong there.
Any pointers you could give me to get me on track with a valid test would be appreciated.
Hi @toddwprice thanks for the update. I would not worry too much about the first problem or adding too many dependencies. I would focus on just getting fonts working correctly with blatant disregard for anything else. Once fonts work, we'd have a proof-of-concept that it's possible. We can iterate from there to make it cleaner/easier.
With that in mind: the example handler in this repository should work for capturing a screenshot, at least on Lambda. You might need to wait for the page to load before taking the screenshot. For a simple, mostly static page without any ajax-y behavior which occurs after the DOMContentLoaded event fires, you can wait on CDP's Page.loadEventFired() Promise to resolve before doing Page.captureScreenshot().
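As a rough illustration of that ordering, here is a minimal sketch. It assumes `client` is an already connected chrome-remote-interface client (for example the result of `await CDP()`); the function name `captureAfterLoad` is hypothetical and this is not the repository's example handler. It uses async/await, so on older Node runtimes it would need to be transpiled.

```javascript
// Hypothetical helper: navigate, wait for the load event, then take a screenshot.
async function captureAfterLoad(client, url) {
  const { Page } = client;
  await Page.enable();                  // Page events are only delivered once the domain is enabled
  const loaded = Page.loadEventFired(); // subscribe before navigating so the event cannot be missed
  await Page.navigate({ url });
  await loaded;                         // resolves when the page's load event has fired
  const { data } = await Page.captureScreenshot({ format: 'png' });
  return Buffer.from(data, 'base64');   // CDP returns the image as a base64 string
}
```

Pages that keep rendering after the load event would still need an additional wait of some kind before the screenshot.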
Well this is probably no help since I don't even use AWS, but I'll share anyway. I run chromeless in a docker-container side by side with knqz/chrome-headless and to that I add:
ADD https://noto-website.storage.googleapis.com/pkgs/NotoSansCJKjp-hinted.zip /tmp
RUN unzip /tmp/NotoSansCJKjp-hinted.zip && \
mkdir -p /usr/share/fonts/noto && \
cp *.otf /usr/share/fonts/noto && \
chmod 644 -R /usr/share/fonts/noto/ && \
fc-cache -fv
(All of Noto is 120MB and only regular is 15MB.) After that, at least Japanese works fine. You probably have all that figured out already, so sorry for being noisy!
http://qiita.com/dd511805/items/dfe03c5486bf1421875a
Thank you for the tip, @kumorig!
Please use it as a reference http://fd0.hatenablog.jp/entry/2017/09/10/223042 (sorry, written in Japanese)
- use custom fontconfig
- use small size font
- strip chrome binary
Thanks @fd00, I followed the guide in your blog exactly, but it does not seem to work for me; I still get a lot of tofu after deploying to Lambda. Not sure if I missed anything :(. Edited: It works for the IPAexfont font, but not for Noto fonts. I managed to upload the Noto fonts in the package, but they do not seem to work properly with fontconfig. (Uploading via S3 allows you to deploy up to 250MB.)
Updated: OTF fonts from Google do not work for me, but TTC fonts from Google work well with fontconfig when following @fd00's guide. Thanks a lot, mate :).
TTC fonts can be downloaded via https://github.com/googlei18n/noto-cjk
Another update: since [email protected], with headless_shell changing to headless-chromium, the method of @fd00 stopped working and the tofus are now coming back :(
@nmqanh you might just need to change the name/paths in a few steps from headless_shell to headless-chrome. For example, in the article, in the Deploy section, there is reference to CHROME_PATH pointing at headless_shell. Change this to headless-chrome.
I have an implementation of font support in progress that I'll finish sometime over the next week or two which will close this Issue.
Just a quick update: I tried updating the CHROME_PATH and also re-built the font cache from step 0 as guided in the article and it does not work, tofus are still coming back with [email protected] and later. Thanks for the good news that the new release is gonna support CJK fonts by default in 1-2 weeks :). Would love to try it soon. Please let me know if there is anything I can help with.
@adieuadieu We have tried several times to add support for this, but keep hitting a dead end. Have you managed to figure it out? Can we assist with something? Thank you for your great work.
I tried many ways to bring CJK fonts back to headless but I could not :(. Was anyone here able to do that? Please help me, I'd appreciate it a lot, thanks. This only started breaking since [email protected], it works fine with [email protected] and lower versions.
Thanks all.
I got it to work for my own setup. I documented the process I used with a little more detail than the other blog post here: https://gist.github.com/nat-n/c3429d29f2478ccb3de243810bb12956
Thanks @nat-n, it works like a charm. The main reason was that from version 1.0.0-6 the symlink failed to run; it used to work with 1.0.0-5 and lower versions.
@nat-n I've been able to include the ipaexg font into the chromeless using this method.
Docker container created, once done rsync'd it into the chromeless path so it looks like the following...
chromeless/serverless/node_modules/@serverless-chrome/lambda/dist/fontconfig/etc/fonts
and within my serverless.yml
name: aws
runtime: nodejs6.10
stage: ${self:custom.stage}
region: eu-west-1
environment:
  DEBUG: ${self:custom.debug}
  AWS_IOT_HOST: ${self:custom.awsIotHost}
  FONTCONFIG_PATH: /var/task/node_modules/@serverless-chrome/lambda/dist/fontconfig/etc/fonts
  LD_LIBRARY_PATH: /var/task/node_modules/@serverless-chrome/lambda/dist/fontconfig/usr/lib
@nat-n I tried to follow your note, but I'm missing knowledge from "Configuring fontconfig" to the end. Could you please detail more on how to do it or give links to learn what I'm missing. Thanks
@luminous8 I can try to help, but I'm not sure what you're missing. The general idea is that the fontconfig built inside the container also exists under /tmp outside the container, and so you can make some required changes to it there, before running some commands from inside the container to complete the setup. I've just fixed a formatting issue that might have made a part of it less clear, but I'm afraid I can't make the instructions too concrete without making them too specific to a particular setup (which may be different from your own).
For anyone stumbling on this at some point in the future, I just wanted to mention that what @toddwprice did:
- Upload fonts in a .fonts directory.
- Set $HOME env var to /var/task.
Worked just fine without the need to build fontconfig or the other extra steps.
Agree with @arikfr. Shipped the following with our λ.
$ tree -la
.
├── chromium
└── .fonts
    ├── NotoColorEmoji.ttf
    ├── NotoEmoji-Regular.ttf
    ├── NotoSansArabic-Bold.ttf
    ├── NotoSansArabic-Regular.ttf
    ├── NotoSansCJKjp-Bold.otf
    ├── NotoSansCJKjp-Regular.otf
    ├── NotoSansCJKkr-Bold.otf
    ├── NotoSansCJKkr-Regular.otf
    ├── NotoSansCJKsc-Bold.otf
    ├── NotoSansCJKsc-Regular.otf
    ├── NotoSansCJKtc-Bold.otf
    ├── NotoSansCJKtc-Regular.otf
    ├── NotoSansHebrew-Bold.ttf
    ├── NotoSansHebrew-Regular.ttf
    ├── NotoSansMongolian-Regular.ttf
    ├── NotoSansThai-Bold.ttf
    └── NotoSansThai-Regular.ttf
Unpacked as is to /var/task. After setting $HOME to /var/task we were able to confirm CJK characters rendered.
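When a layout like this does not seem to take effect, it can help to log what the function actually sees at runtime. The following is a small diagnostic sketch, not part of serverless-chrome: `listFonts` is a hypothetical helper that looks for a .fonts directory under the given home directory (defaulting to `$HOME`) and reports the font files in it.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical diagnostic: report whether <home>/.fonts exists and which
// font files (.ttf, .otf, .ttc) it contains.
function listFonts(home) {
  const base = home || process.env.HOME || '';
  const dir = path.join(base, '.fonts');
  if (!fs.existsSync(dir)) {
    return { dir, exists: false, files: [] };
  }
  const files = fs.readdirSync(dir)
    .filter((name) => /\.(ttf|otf|ttc)$/i.test(name))
    .sort();
  return { dir, exists: true, files };
}
```

Calling `console.log(listFonts())` at the start of the handler would show in the function's logs whether the directory was unpacked where expected.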
I followed @toddwprice's suggestion and it works locally but not in Lambda. Then I tried an *.otf file instead of the *.ttc file, as @abargnesi suggests, and it works both locally and in Lambda. The font I used is NotoSansCJKtc-Black.otf.
So you may want to try both font file types and see if either of them works.
In case this saves someone else some work.
I got fonts working by putting them in .fonts and setting HOME=/var/task, however it didn't work for me until I made the font files have permission 644 (-rw-r--r--).
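The permission fix can be applied as a packaging step before the zip is built. The sketch below is an assumption about how one might do that in Node: `makeFontsReadable` is a hypothetical helper, and it only helps if the tool that creates the zip preserves file modes. It is meant to run on the build machine, since /var/task is read-only inside Lambda.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical build-time helper: set every regular file directly inside
// `fontsDir` to mode 644 (-rw-r--r--) and return the names that were touched.
function makeFontsReadable(fontsDir) {
  const changed = [];
  for (const name of fs.readdirSync(fontsDir)) {
    const full = path.join(fontsDir, name);
    if (fs.statSync(full).isFile()) {
      fs.chmodSync(full, 0o644);
      changed.push(name);
    }
  }
  return changed.sort();
}
```

For example, `makeFontsReadable('.fonts')` could be run from the project root just before packaging.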
@NickBlow, @abargnesi
I'm still unable to get these characters to render. Can somebody clarify this for me please - where does this .fonts directory go?
I added the same fonts as listed in @abargnesi 's comment, changed their perms to 644, and tried:
- adding this directory to the chrome/chrome-headless-lambda-linux-x64.tar.gz archive
- adding this directory at the top level of the serverless-chrome git repo root
- adding it to the .serverless/serverless-chrome.zip archive
and deploying, but my characters are rendered as empty space (not even squares).
Any ideas?
Thank you .