Cache assets locally to enable working offline

Open mk opened this issue 2 years ago • 34 comments

mk avatar Mar 16 '22 10:03 mk

@mk I think this PR is ready for another review. A brief explanation:

  • Call clojure -X nextjournal.clerk/cache-assets! while you still have a connection, e.g. on the airport wifi.
  • After that, all development can happen without an internet connection, e.g. on an airplane.
  • All assets are served through the /assets endpoint. Assets that weren't previously cached via clojure -X nextjournal.clerk/cache-assets! are still cached locally on first use, so if you forgot to run that function but have already developed while online, you will have those files locally anyway.
  • New assets need to be added in the bb script, and the bb task hash-assets needs to be run to update resources/asset_manifest.edn.
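
For illustration only, the "cache on first use" behaviour described above could be sketched roughly like this. This is not the code in the PR: the function name cached-asset, the shape of the manifest (a plain map from URL to local file name) and the injected fetch-bytes function are all assumptions made for the example.

```clojure
(require '[clojure.java.io :as io])

(defn cached-asset
  "Returns the local file for `url`, or nil when `url` is not in `manifest`.
  `manifest` is assumed to be a map of URL -> file name (hypothetical shape).
  `fetch-bytes` is a hypothetical function from a URL string to a byte array;
  it is only called when no cached copy exists yet."
  [cache-dir manifest fetch-bytes url]
  (when-let [file-name (get manifest url)]
    (let [f (io/file cache-dir file-name)]
      (when-not (.exists f)
        (io/make-parents f)
        ;; write raw bytes so binary assets such as fonts stay intact
        (with-open [out (io/output-stream f)]
          (.write out ^bytes (fetch-bytes url))))
      f)))
```

With something like this, a pre-caching step would amount to calling cached-asset once for every URL in the manifest, and the /assets handler could call the same function per request.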

borkdude avatar Mar 21 '22 20:03 borkdude

@borkdude one thing I forgot to mention on our call is if we can incorporate our learnings from https://github.com/nextjournal/dejavu/issues/4 here.

mk avatar Mar 22 '22 16:03 mk

@mk Isn't that automatically solved when using the new dejavu?

borkdude avatar Mar 22 '22 16:03 borkdude

@mk I think all things have been addressed that we discussed yesterday. I will ask in #30 if people want to test this PR.

borkdude avatar Mar 23 '22 11:03 borkdude

Tested this locally using clerk as a git dep and it worked great.

(ns user
  (:require [nextjournal.clerk :as clerk]))

(comment
  (nextjournal.clerk/cache-assets! {})
  (clerk/serve! {})
  (clerk/show! "notebooks/dice.clj")
  )

We could of course also decide to run cache-assets! automatically when you call clerk/serve! or so.

borkdude avatar Mar 29 '22 15:03 borkdude

It works well for me once I have requested a notebook while online. When only calling nextjournal.clerk/cache-assets! without ever serving a notebook, however, we seem to be missing a bunch of files:

$ ls .clerk/.cache/*/*
.clerk/.cache/assets/36q3Vv52BMdq6KArT8NxBBcm5xXwGTQ4tXkFwjKDkSZx8TJL47KhhHDadZkLVLZSqXGuSLeoL9Mm1Em87cusoqVN?name=tailwind.css
.clerk/.cache/assets/3uEQGQ9cBUjLVQBay2VqQsMkT8RmYjctuV356d82PSAH8qX7eQsXXC4FXzSz3K3YSWFs9ZVw2fBdBZD3r8oeCYBj?name=Fira+Code.css
.clerk/.cache/lookup/3fXMVxKEk34PRw2hKjU7NRy9o5vJ

After showing a notebook on the first request, they're there:

$ ls .clerk/.cache/*/*
.clerk/.cache/assets/2wrujFq9FbBNiC7dyA6ESW9vvp5Nx3EstRjvfZwpy1Ra1PHW1hdVc5sXwhp4rJF9Xksu1R3BaPgQEevvx8acCKZu
.clerk/.cache/assets/36q3Vv52BMdq6KArT8NxBBcm5xXwGTQ4tXkFwjKDkSZx8TJL47KhhHDadZkLVLZSqXGuSLeoL9Mm1Em87cusoqVN
.clerk/.cache/assets/36q3Vv52BMdq6KArT8NxBBcm5xXwGTQ4tXkFwjKDkSZx8TJL47KhhHDadZkLVLZSqXGuSLeoL9Mm1Em87cusoqVN?name=tailwind.css
.clerk/.cache/assets/3UUKLrYWC5yYHepSePNSjneRgKkeupqXVR9Qu5sMRh31S6nJDmAusn1AP9r9hzoX4zPQNzDF59iPnNykSiNjt9fA
.clerk/.cache/assets/3uEQGQ9cBUjLVQBay2VqQsMkT8RmYjctuV356d82PSAH8qX7eQsXXC4FXzSz3K3YSWFs9ZVw2fBdBZD3r8oeCYBj
.clerk/.cache/assets/3uEQGQ9cBUjLVQBay2VqQsMkT8RmYjctuV356d82PSAH8qX7eQsXXC4FXzSz3K3YSWFs9ZVw2fBdBZD3r8oeCYBj?name=Fira+Code.css
.clerk/.cache/assets/4R867PoGjDfokDKtsdXkQymyqPFFjMecXgkaoGGbKssw9CScH28Ro8soEXyEJVF9YPh4p1oB6EGXcEbvu5QrxRTb
.clerk/.cache/assets/4dy4nMTjQ9by3rB5rb8jroM3FvUqLm2Dboqent2P43MAyntTD1wkUCBAjNzHMCpybgyMrMhN1sjXkGsdkao3uTKF
.clerk/.cache/lookup/3fXMVxKEk34PRw2hKjU7NRy9o5vJ

Another thing I'm noticing is that the css files exist with and without the ?name param:

36q3Vv52BMdq6KArT8NxBBcm5xXwGTQ4tXkFwjKDkSZx8TJL47KhhHDadZkLVLZSqXGuSLeoL9Mm1Em87cusoqVN
36q3Vv52BMdq6KArT8NxBBcm5xXwGTQ4tXkFwjKDkSZx8TJL47KhhHDadZkLVLZSqXGuSLeoL9Mm1Em87cusoqVN?name=tailwind.css
3uEQGQ9cBUjLVQBay2VqQsMkT8RmYjctuV356d82PSAH8qX7eQsXXC4FXzSz3K3YSWFs9ZVw2fBdBZD3r8oeCYBj
3uEQGQ9cBUjLVQBay2VqQsMkT8RmYjctuV356d82PSAH8qX7eQsXXC4FXzSz3K3YSWFs9ZVw2fBdBZD3r8oeCYBj?name=Fira+Code.css

mk avatar Mar 30 '22 06:03 mk

Also noticing now that we're serving assets with a content-type of application/octet-stream:

$ curl -I "https://storage.googleapis.com/nextjournal-cas-eu/data/3uEQGQ9cBUjLVQBay2VqQsMkT8RmYjctuV356d82PSAH8qX7eQsXXC4FXzSz3K3YSWFs9ZVw2fBdBZD3r8oeCYBj?name=Fira+Code.css"
HTTP/2 200 
x-guploader-uploadid: ADPycdvvRcwz-TQ0DS1GieFWn7QRMCnyNXk1_XDBjWRX_iINMr0AOhkQ4A0_eMKyjhxuzrh5W5GWhVf-2i8ZNQ0Dqu0Xw0uq6g
x-goog-generation: 1648033471505151
x-goog-metageneration: 1
x-goog-stored-content-encoding: identity
x-goog-stored-content-length: 3792
content-language: en
x-goog-hash: crc32c=DvM9AQ==
x-goog-hash: md5=gImMytMCOi6K6vf3N9ScIA==
x-goog-storage-class: STANDARD
accept-ranges: bytes
content-length: 3792
server: UploadServer
date: Wed, 30 Mar 2022 06:24:12 GMT
expires: Wed, 30 Mar 2022 07:24:12 GMT
cache-control: public, max-age=3600
last-modified: Wed, 23 Mar 2022 11:04:31 GMT
etag: "80898ccad3023a2e8aeaf7f737d49c20"
content-type: application/octet-stream
age: 2134
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
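
One way to avoid passing application/octet-stream through when serving locally would be to derive the content type from the file extension in the ?name parameter. A minimal sketch, assuming a hand-written lookup table; the names ext->content-type and content-type-for are made up for this example and are not part of Clerk:

```clojure
(require '[clojure.string :as str])

;; Hypothetical lookup table covering only the asset kinds seen in this thread.
(def ext->content-type
  {"css"   "text/css"
   "js"    "text/javascript"
   "ttf"   "font/ttf"
   "woff2" "font/woff2"})

(defn content-type-for
  "Guesses a content type from the extension of `file-name`,
  falling back to application/octet-stream when the extension is unknown."
  [file-name]
  (let [idx (str/last-index-of file-name ".")
        ext (when idx (str/lower-case (subs file-name (inc idx))))]
    (get ext->content-type ext "application/octet-stream")))

(content-type-for "Fira+Code.css")
```

The fallback keeps the current behaviour for anything the table doesn't know about.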

mk avatar Mar 30 '22 07:03 mk

TODO:

  • [ ] solve merge conflicts
  • [x] attach mime type or use extension to fix octet-stream mime type into something more appropriate: the octet-stream mime type is coming from the .ttf fonts. It's not wrong I believe?
  • [x] store cached files in a different directory, so they don't get wiped
  • [x] only two files get cached when you run cache-assets!
  • [x] Another thing I'm noticing is that the css files exist with and without the ?name param: can't reproduce, maybe from a previous run?
  • [ ] One thing we're still missing for complete offline support seems to be Vega and Plotly (try (clerk/show! "notebooks/viewer_api.clj"))

borkdude avatar Apr 04 '22 11:04 borkdude

@mk See above TODOs, most should be solved. About vega/plotly, I see this is coming from viewers:

borkdude@MBP2019 ~/dev/viewers (main) $ rg "vega-embed"
resources/css/viewer.css
349:.vega-embed .chart-wrapper { @apply h-auto !important; }

modules/ui/src/nextjournal/ui/components/d3_require.cljs
22: [with {:package ["vega-embed@<version>"]}

modules/viewer/src/nextjournal/viewer/vega_lite.cljs
8:    [d3-require/with {:package ["vega-embed@<version>"]}

Is it realistic/expected that caching should take into account dynamically loaded assets? Any ideas how to approach that?

borkdude avatar Apr 04 '22 13:04 borkdude

@borkdude Just tried this out locally and the fonts were missing (i.e. the CSS referred to fonts that were not served via /assets). Also, my log was spammed with these two warnings, up to the point where I had to kill the Clerk server:

[clerk] WARNING - url does not exist in manifest:  https://cdn.jsdelivr.net/npm/katex@<version>/dist/katex.min.css
[clerk] WARNING - uncached url: https://cdn.jsdelivr.net/npm/katex@<version>/dist/katex.min.css

philippamarkovics avatar Apr 08 '22 10:04 philippamarkovics

@joe-loco Which commit did you try out? Were they missing as in a cache miss or did they not load at all?

borkdude avatar Apr 08 '22 10:04 borkdude

@borkdude I tried the latest commit in this branch. The fonts did not load at all.

philippamarkovics avatar Apr 08 '22 10:04 philippamarkovics

Thanks for testing. I'll take a look after my vacation (back after next week).

borkdude avatar Apr 08 '22 10:04 borkdude

@joe-loco Can you please specify which notebook you were viewing?

borkdude avatar Apr 19 '22 11:04 borkdude

I think it was the Rule 30 one but I also believe I looked at multiple and the issues were everywhere (I might be wrong about that last part though).

philippamarkovics avatar Apr 19 '22 11:04 philippamarkovics

@joe-loco I stared at this for too long. I even preserved the names of the fonts in the CSS in this commit and don't see anything loading from external sources, except the katex.min.css which we decided to leave external for now. Yet the font doesn't render as it normally does.

Screenshot 2022-04-19 at 16 23 08

borkdude avatar Apr 19 '22 14:04 borkdude

The CSS is loaded from here:

http://localhost:7777/assets/36rHBCKZuQoGayvSbwoGsc4RKUf3zSe58S6etnk6EgCBLfEsBsNpRXnPMiUjP3EekgSDY4kxaaxqzu7P9jiXL8k4?name=Fira+Code.css

and looks like this:

@font-face {
  font-family: 'Fira Code';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url(/assets/uU9eCBsR6Z2vfE9aq3bL0fxyUs4tcw4W_D1sFVc.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Code';
  font-style: normal;
  font-weight: 700;
  font-display: swap;
  src: url(/assets/uU9eCBsR6Z2vfE9aq3bL0fxyUs4tcw4W_NprFVc.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Mono';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url(/assets/N0bX2SlFPv1weGeLZDtQIQ.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Mono';
  font-style: normal;
  font-weight: 700;
  font-display: swap;
  src: url(/assets/N0bS2SlFPv1weGeLZDtondv3mQ.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Sans';
  font-style: italic;
  font-weight: 400;
  font-display: swap;
  src: url(/assets/va9C4kDNxMZdWfMOD5VvkojO.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Sans';
  font-style: italic;
  font-weight: 500;
  font-display: swap;
  src: url(/assets/va9f4kDNxMZdWfMOD5VvkrA6Qhf_.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Sans';
  font-style: italic;
  font-weight: 700;
  font-display: swap;
  src: url(/assets/va9f4kDNxMZdWfMOD5VvkrByRBf_.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Sans';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url(/assets/va9E4kDNxMZdWfMOD5VfkA.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Sans';
  font-style: normal;
  font-weight: 500;
  font-display: swap;
  src: url(/assets/va9B4kDNxMZdWfMOD5VnZKvuQQ.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Sans';
  font-style: normal;
  font-weight: 700;
  font-display: swap;
  src: url(/assets/va9B4kDNxMZdWfMOD5VnLK3uQQ.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Sans Condensed';
  font-style: italic;
  font-weight: 700;
  font-display: swap;
  src: url(/assets/wEOuEADFm8hSaQTFG18FErVhsC9x-tarUfPVFMZ0dw.ttf) format('truetype');
}
@font-face {
  font-family: 'Fira Sans Condensed';
  font-style: normal;
  font-weight: 700;
  font-display: swap;
  src: url(/assets/wEOsEADFm8hSaQTFG18FErVhsC9x-tarWU3IiMM.ttf) format('truetype');
}
@font-face {
  font-family: 'PT Serif';
  font-style: italic;
  font-weight: 400;
  font-display: swap;
  src: url(/assets/EJRTQgYoZZY2vCFuvAFTzro.ttf) format('truetype');
}
@font-face {
  font-family: 'PT Serif';
  font-style: italic;
  font-weight: 700;
  font-display: swap;
  src: url(/assets/EJRQQgYoZZY2vCFuvAFT9gaQVy4.ttf) format('truetype');
}
@font-face {
  font-family: 'PT Serif';
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url(/assets/EJRVQgYoZZY2vCFuvDFR.ttf) format('truetype');
}
@font-face {
  font-family: 'PT Serif';
  font-style: normal;
  font-weight: 700;
  font-display: swap;
  src: url(/assets/EJRSQgYoZZY2vCFuvAnt65qV.ttf) format('truetype');
}

borkdude avatar Apr 19 '22 14:04 borkdude

Even when I put an incorrect URL in the CSS (xassets instead of assets), I don't see a 404 anywhere in my browser.

Screenshot 2022-04-19 at 16 36 29

borkdude avatar Apr 19 '22 14:04 borkdude

Also, the server doesn't seem to be hit with these URLs at all, otherwise an exception would be logged.

borkdude avatar Apr 19 '22 14:04 borkdude

When I save the CSS locally as foo.css and make an index.html like:

<html>
  <head>
    <link rel="stylesheet" href="foo.css">
  </head>
  <p style="font-family: 'Fira Code';">Hello</p>
</html>

then I'm able to see an error. Why am I not seeing a single error in the clerk context?

borkdude avatar Apr 19 '22 15:04 borkdude

When I insert that HTML in view.clj, I see no 404:

Screenshot 2022-04-19 at 17 31 46

borkdude avatar Apr 19 '22 15:04 borkdude

Progress:

When I rename assets to xassets and put this in the case:

"xassets" {:status 404}

then I see that the fonts are being requested and a 404 in the browser:
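
To make the experiment above concrete: a handler that dispatches on the first path segment with case, plus an explicit 404 branch, might look roughly like the sketch below. The function name route, the response bodies and the catch-all default branch are assumptions for illustration, not Clerk's actual handler.

```clojure
(require '[clojure.string :as str])

(defn route
  "Hypothetical handler: dispatches on the first path segment of the request URI."
  [{:keys [uri]}]
  (case (second (str/split uri #"/"))
    "assets"  {:status 200 :body "asset bytes would go here"}
    "xassets" {:status 404}                      ; the explicit branch from the experiment
    {:status 200 :body "notebook page"}))        ; assumed catch-all default

(route {:uri "/xassets/N0bX2SlFPv1weGeLZDtQIQ.ttf"})
```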

Screenshot 2022-04-19 at 18 14 49

borkdude avatar Apr 19 '22 16:04 borkdude

When correcting xassets back to assets the 404s are gone. So it seems the fonts are being loaded. Then why am I not seeing those fonts being rendered?

Screenshot 2022-04-19 at 18 19 55

borkdude avatar Apr 19 '22 16:04 borkdude

@joe-loco @mk I finally (sorry it took me so long) found the remaining issue with those fonts: binary files were downloaded and stored as text, which explains the difference in byte size. Can you give this another go?
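
To show why storing a binary download as text breaks it, here is a small self-contained sketch (not the code from the branch; the names font-like-bytes, round-trip-as-text and round-trip-as-bytes are made up for the example). Decoding arbitrary bytes as UTF-8 replaces invalid sequences, so the bytes written back differ from the original, whereas copying between byte streams preserves them.

```clojure
(require '[clojure.java.io :as io])

;; A few bytes that are not valid UTF-8, standing in for font data.
(def font-like-bytes
  (byte-array (map unchecked-byte [0x00 0x01 0xFF 0xFE 0x80 0x7F])))

(defn round-trip-as-text
  "Decodes the bytes as a UTF-8 string and encodes them again,
  which is roughly what happens when a download is handled as text."
  [^bytes bs]
  (.getBytes (String. bs "UTF-8") "UTF-8"))

(defn round-trip-as-bytes
  "Copies the bytes from an input stream to an output stream unchanged."
  [^bytes bs]
  (let [out (java.io.ByteArrayOutputStream.)]
    (io/copy (java.io.ByteArrayInputStream. bs) out)
    (.toByteArray out)))

;; Compare both round trips with the original bytes.
(println "text round trip intact? "
         (= (seq font-like-bytes) (seq (round-trip-as-text font-like-bytes))))
(println "byte round trip intact? "
         (= (seq font-like-bytes) (seq (round-trip-as-bytes font-like-bytes))))
```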

In the branch:

  • rm -rf .clerk/assets (to ensure you're starting from scratch)
  • clojure -X nextjournal.clerk/cache-assets! to pre-download all necessary assets (before hopping on an airplane)
  • Then develop as usual, you should be able to do this without an internet connection
  • katex.min.css has been left out of scope for now

borkdude avatar Apr 25 '22 11:04 borkdude

Ping @mk @joe-loco - can you test this again? I made some improvements a few weeks ago :-D

borkdude avatar May 16 '22 18:05 borkdude

I tried this out (018bdd534cf820fe529ba65cfec912a062684ab1), and it's not working for me on Linux (Pop! OS 22.04, based on Ubuntu 22.04) with either Clojure 1.10.3 or Clojure 1.11.1:

  • Online, it gives me the error: "unusable viewer: :clerk/notebook".
  • Offline, the page is empty and I get a console error nextjournal is not defined.

Whether offline or online, Firefox is also complaining that downloadable fonts are being rejected by the sanitizer. Chromium reports a similar issue, but says "Size of decompressed WOFF 2.0 is less than compressed size".

alysbrooks avatar Jun 16 '22 21:06 alysbrooks

@alysbrooks Can you specify all steps you did to try this out? Did you run clojure -X nextjournal.clerk/cache-assets!? Which notebook were you viewing? Etc.

Thanks for trying this out!

borkdude avatar Jun 16 '22 21:06 borkdude

You're welcome! Here's what I tried:

  • Set my clerk dependency to 018bdd534cf820fe529ba65cfec912a062684ab1
  • Ran rm -rf .clerk/assets
  • Ran clojure -X nextjournal.clerk/cache-assets!

It doesn't seem to depend on the notebook. The error happens when I just call clerk/serve!, it happens on my own notebooks, and it happens on a variety of the official demo notebooks, including readme.clj and markdown.clj.

The error seems to occur somewhere between evaluation and the browser rendering the result. I get the "clerk rendered" message in my output when I evaluate clerk/show!. Also, when I tried to open one of the official demo notebooks that I don't have the correct dependencies for, it showed a stacktrace with FileNotFoundException.

I can provide my whole deps.edn if that would help?

alysbrooks avatar Jun 16 '22 21:06 alysbrooks

@alysbrooks If your code happens to be open source, you could maybe push a branch I could check out locally? That would be super helpful.

borkdude avatar Jun 16 '22 21:06 borkdude

@borkdude It's not open source because I haven't bothered to make it into a proper repository yet.

Actually, while I was trying to make a minimal reproduction, it stopped happening. A lot of the files I excluded aren't necessary for me and were leftovers from cloning the clerk repository, so it might not be an issue for my actual use.

alysbrooks avatar Jun 16 '22 22:06 alysbrooks