
How to find available bundles?

Open hugobuddel opened this issue 4 years ago • 14 comments

Different bundles can be specified with -w or -b, but the documentation does not say where to find these bundles.

According to https://tectonic-typesetting.github.io/en-US/ :

The underlying “bundle” technology allows for completely reproducible document compiles. Thanks to the Dataverse Project for hosting the large LaTeX resource files!

But I can't find the previous bundles, so my documents are not reproducible :cry:. How can we find which bundles are available?

After some investigation, I've found these 3 urls:

  • https://tectonic.newton.cx/bundles/tlextras-2016.0r4/bundle.tar
  • https://tectonic.newton.cx/bundles/tlextras-2018.1r0/bundle.tar
  • https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar

The first two redirect to the Dataverse Project, and indeed can be found there:

  • https://dataverse.harvard.edu/dataverse/harvard?q=tlextras

This is enough for me now, and I'll make sure to store these bundles locally. But maybe we can formalize this a bit and add these references to the documentation.


For what it's worth, the specific problem I'm trying to solve: The TeXLive 2020.0 bundle (#669) produces a biblatex control file with a version that is newer than my biber can handle (see also #35, #53):

ERROR - Error: Found biblatex control file version 3.7, expected version 3.5.
This means that your biber (2.12) and biblatex (3.14) versions are incompatible.

I install tectonic through conda and there is no newer biber version on conda.

hugobuddel, Nov 20 '20 13:11

Yeah, it wouldn't hurt to document this somewhere.

For the record, my current vision is to make compilation more cargo-like than rustc-like, if you will, and introduce a Tectonic.toml file defining how a document is compiled, and a Tectonic.lock file recording the various pieces of state needed to yield reproducible builds. For each document, the lockfile would record the resolved bundle URL (among other things), ensuring reproducibility even as the default bundle gets upgraded over time.

pkgw, Nov 20 '20 19:11

@pkgw Why are these bundles so large? (For example, the 2020.0 bundle is a huge 2.6G.)

A 2.6G file may not be affordable in some countries, where bandwidth is expensive.

Is there a minimal bundle suitable for bootstrapping basic TeX (for example, \begin{document} hello world! \end{document})?

faywong, Feb 08 '21 07:02

One nice thing about tectonic is that it is not necessary to download the bundles. Write that document and compile with tectonic and it should fetch just the packages you need. (I guess that's why they are tar files and not tar.gz files, because it would allow better indexing.)

Tectonic caches those packages, so you only have to download them once. When you include a new package in your document, it will be fetched. So tectonic as-is should be very well suited for your goals.

Downloading these bundles manually is not the normal way to use tectonic. I created this issue because I had some problems with the latest bundle, so wanted to see whether the problems disappeared with the older bundles. (Answer: well, kinda but not really and it wasn't a problem with the bundle per se.)

(Side note: if you install LaTeX through your linux distribution, you should not install the documentation. The documentation seems to be much larger than the packages themselves. Which is a good thing, but not when you are on expensive bandwidth.)

Edit: I did some tests: the document you describe puts about 40 megabytes in ~/.cache/Tectonic.

hugobuddel, Feb 08 '21 08:02

One nice thing about tectonic is that it is not necessary to download the bundles. Write that document and compile with tectonic and it should fetch just the packages you need. (I guess that's why they are tar files and not tar.gz files, because it would allow better indexing.)

I ran a test to verify this (the conclusion above does not hold on my side):

  1. Create a doc.tex with the following content (it doesn't use any extra packages):
\documentclass{article}
\begin{document}
	\paragraph{hello, world}
\end{document}
  2. Compile it with tectonic:
tectonic -X compile doc.tex
note: "version 2" Tectonic command-line interface activated
note: connecting to https://archive.org/services/purl/net/pkgwpub/tectonic-default
  3. As I need a proxy to reach the website, I can't download the bundle from the CLI, so I tried to download it in the Chrome browser (it says I need another 21h to download the 2.6G bundle).

So on my side, the 2.6G bundle is a must to bootstrap tectonic for the first time.

faywong, Feb 09 '21 03:02

OK, that's a bummer. Tectonic.zip (7MB) is a zip file of my ~/.cache/Tectonic directory that was sufficient for me to compile that test document without an internet connection.

You could try to bypass the bundle altogether by unpacking that file and placing the contents in your ~/.cache directory, such that you have ~/.cache/Tectonic/files/00/c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf files etc. The file is actually a tar-gz-zip file (because apparently that leads to the smallest file size, and github only accepts zip files), so you first need to unzip, and then untar.

However, even the smallest change to the document can lead tectonic to fetch more files. E.g. adding a \section requires it to download extra fonts (I've included those). So using the attached tarball is not a proper solution and I don't recommend it. However, given this bootstrap you might be able to figure out yourself how to manually download and add those needed files. Because it seems the files are just sorted by their sha256sum:

~/.cache/Tectonic/files/00$ sha256sum c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf 
00c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf  c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf

So if tectonic needs a file, you could try to find that file manually yourself and then put it in the right place with the right name. I cannot really help you further, but I know the frustration of having bad internet, so maybe you can get started with this. (I'm just a tectonic user like you.)

The best way for you forward would perhaps be to figure out a way to let tectonic on the command line use the same proxy that chrome uses. But I wouldn't know how to get started with that. (Hmm, since tectonic is mostly rust, wouldn't it be cool to get it running in the browser directly? E.g. #166.)

Other than that it seems that tectonic might not be a good fit for your situation. You would probably do better by finding a local proxy with a recent texlive distribution that you can download. Maybe there are linux installation dvd's that you can order. (Or well, maybe find someone that can mail you a DVD / usb stick with the tectonic bundle on it, it is free software.)

hugobuddel, Feb 09 '21 08:02

To dig a bit deeper on how to add files yourself. Say tectonic complains that it needs loadhyph-pl.tex (I'm not sure whether it does that though). Then you can look in ~/.cache/Tectonic/indexes/5131b19b08f5628f7a5ccfb7d408f43dc8265c9c50eeb9686c43657265a2f4e4.txt to find this line:

loadhyph-pl.tex 2764703744 1160

The first number is the start of that file in the bundle, the second number is its length in bytes. HTTP byte ranges are inclusive, so add those and subtract 1 to get the last byte: 2764703744 + 1160 - 1 = 2764704903. Then download only that part of the bundle and pipe it into the file:

curl -r 2764703744-2764704903 https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar > loadhyph-pl.tex

(Adapt curl to use your proxy, I don't know how to do that. Maybe you can even do it in chrome directly or with a plugin, I don't know.)
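
For anyone who would rather script this than use curl, here is a minimal Python sketch of the same range arithmetic. It is not Tectonic's own code: the function names are made up for illustration, and it assumes every index line has the "name offset length" shape shown above.

```python
import urllib.request


def parse_index_line(line):
    """Split an index line of the assumed form 'name offset length'."""
    name, offset, length = line.rsplit(None, 2)
    return name, int(offset), int(length)


def byte_range(offset, length):
    """Return the inclusive (first, last) byte positions for an HTTP Range header."""
    return offset, offset + length - 1


def fetch_from_bundle(bundle_url, index_line):
    """Download only the bytes of one file from the bundle (hypothetical helper)."""
    name, offset, length = parse_index_line(index_line)
    first, last = byte_range(offset, length)
    request = urllib.request.Request(
        bundle_url, headers={"Range": f"bytes={first}-{last}"}
    )
    with urllib.request.urlopen(request) as response:
        # 206 means the server honoured the range; anything else would be the
        # whole 2.6G file, so stop before reading the body.
        if response.status != 206:
            raise RuntimeError(f"expected 206 Partial Content, got {response.status}")
        data = response.read()
    return name, data
```

Calling byte_range(2764703744, 1160) gives the same 2764703744-2764704903 range as the curl command above.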

Then use sha256sum to find the directory and file name you should use:

$ sha256sum loadhyph-pl.tex 
00c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf  loadhyph-pl.tex

Take the first two characters for the directory, and the rest as the filename, so loadhyph-pl.tex would become

~/.cache/Tectonic/files/00/c1e9c387b218901ebe736f6b06f8505669ed714934482d9c08cfc6f74c4caf
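
The same naming rule as a small Python sketch, assuming the cache layout really is files/<first two hex characters>/<remaining characters> as observed above. The function name is hypothetical and the cache root is whatever directory your installation uses.

```python
import hashlib
from pathlib import Path


def cache_path_for(data, cache_root):
    """Return where a file with these contents would live in the cache.

    Assumes the layout observed above: the SHA-256 hex digest is split into
    a two-character directory name and a file name made of the rest.
    """
    digest = hashlib.sha256(data).hexdigest()
    return Path(cache_root) / "files" / digest[:2] / digest[2:]
```

For example, cache_path_for(open("loadhyph-pl.tex", "rb").read(), Path.home() / ".cache" / "Tectonic") gives the path to copy the downloaded file to.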

Hope you can use this information to your advantage.

hugobuddel, Feb 09 '21 08:02

Tectonic does not directly tell you what files you need, but you can still figure it out from the logs.

E.g. adding a \tiny to the document will give this error without internet:

$ tectonic --keep-logs test.tex 
note: this is a BETA release; ask questions and report bugs at https://tectonic.newton.cx/
Running TeX ...
note: downloading SHA256SUM
warning: failure requesting "SHA256SUM" from network
caused by: https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar: error trying to connect: failed to lookup address information: Temporary failure in name resolution
...
note: connecting to https://archive.org/services/purl/net/pkgwpub/tectonic-default
error: test.tex:3: Font TU/lmr/m/n/5=[lmroman5-regular]:mapping=tex-text; at 5.0pt not loadable: Metric (TFM) file or installed font not found
Writing `test.log` (2.04 KiB)
error: something bad happened inside TeX; its output follows:

===============================================================================
(test.tex
LaTeX2e <2020-02-02> patch level 5
L3 programming layer <2020-03-06> (article.cls
Document Class: article 2019/12/20 v1.4l Standard LaTeX document class
(size10.clo)) (l3backend-xdvipdfmx.def)
No file test.aux.
(ts1cmr.fd)
! Font TU/lmr/m/n/5=[lmroman5-regular]:mapping=tex-text; at 5.0pt not loadable:
 Metric (TFM) file or installed font not found.
<to be read again> 
                   relax 
l.3 \tiny
         
No pages of output.
Transcript written on test.log.
===============================================================================
error: the TeX engine had an unrecoverable error
caused by: halted on potentially-recoverable error as specified

From this you can extract that it is looking for lmroman5-regular, and lmroman5-regular.otf is in the index file, so that's the file you need. Hope this is useful.
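
A rough Python sketch of that lookup, in case you need to do it more than once. It only handles the "Font ...=[name]" error shape from the log above, the function names are invented for illustration, and it assumes index lines start with the file name.

```python
import re

# Matches e.g. "Font TU/lmr/m/n/5=[lmroman5-regular]:mapping=tex-text;"
FONT_ERROR = re.compile(r"Font [^=\s]+=\[([^\]]+)\]")


def missing_font_stem(log_text):
    """Return the font name inside [...] from the first font error, or None."""
    match = FONT_ERROR.search(log_text)
    return match.group(1) if match else None


def index_candidates(stem, index_lines):
    """Return index file names whose name without extension equals stem."""
    names = []
    for line in index_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        if name.rsplit(".", 1)[0] == stem:
            names.append(name)
    return names
```

Feed missing_font_stem the contents of test.log, then pass the result and the lines of the index file to index_candidates to see which bundle entries match.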

hugobuddel, Feb 09 '21 08:02

Fortunately I can access https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar without any proxy and bootstrap tectonic (I assume so; I will verify it when I'm at home :)). Also, many thanks to @hugobuddel for providing so many details about how the tectonic bundle works.

I think the index file works like metadata (just like a map in real life, or the package metadata list of an rpm/deb repo for Red Hat/Debian Linux), and its size is more acceptable for most people. On my side it is:

-rw-r--r--@ 1 faywong staff 4.8M 2 8 16:11 indexes/5131b19b08f5628f7a5ccfb7d408f43dc8265c9c50eeb9686c43657265a2f4e4.txt

This index file should be distributed separately from the overall bundle to reduce the bootstrap cost of tectonic.

Tectonic can also be used as a library, so an initial 2.6G download is a huge cost for a standalone application that integrates it. In my opinion, a more flexible package download strategy would be a good improvement.

faywong, Feb 09 '21 10:02

For clarity: normally tectonic does not download the full 2.6GB file. It only downloads the parts of the file it actually needs, probably using the mechanism I described, but automatically instead of manually.

From your comments it seems that https://archive.org/services/purl/net/pkgwpub/tectonic-default is blocked by your proxy, but https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar is not (they are the same). If so, then you can manually specify to use the latter:

tectonic -w https://ttassets.z13.web.core.windows.net/tlextras-2020.0r0.tar test.tex
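
If you drive tectonic from a script, the same override can be built programmatically. This is only a sketch around the command line shown above: the helper names are made up, and it assumes tectonic is on your PATH.

```python
import subprocess


def tectonic_command(tex_file, bundle_url=None):
    """Build the argument list, adding -w only when a bundle URL is given."""
    command = ["tectonic"]
    if bundle_url is not None:
        command += ["-w", bundle_url]
    command.append(tex_file)
    return command


def compile_with_bundle(tex_file, bundle_url=None):
    """Run tectonic and return its exit code (hypothetical wrapper)."""
    return subprocess.run(tectonic_command(tex_file, bundle_url)).returncode
```

Pinning the URL in a script like this also keeps the build on one known bundle even if the default one is upgraded later.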

Maybe this would be the best option for you.

@pkgw it seems that the internet archive is blocked in China. So a billion people cannot use tectonic with the default settings. Maybe it would be possible to have several default url's to fetch the packages?

hugobuddel, Feb 09 '21 10:02

@hugobuddel It's a good point. Better than having multiple default URLs, it might be good to change to one that is simply more globally accessible. The only reason I use archive.org is that their PURL service gives me an easily reprogrammable HTTP 302 redirection, without the sysadmining having to be my responsibility. When I've looked in the past, it has been surprisingly difficult to find such a service, at least for free. But at this point I'm OK with non-free services so there are probably more options now.

(I say that this is better than having multiple default URLs because with multiple options, there are issues of making sure to update both of them at once, etc. Seeing as it's a simple redirection service I think it is probably better to have a single source of truth, even if it is a single point of failure as well.)

Anyway, this is all a bit outside of the main topic of this issue; discussion of this topic should probably go to a new one.

pkgw, Feb 09 '21 13:02

For clarity: normally tectonic does not download the full 2.6GB file. It only downloads the parts of the file it actually needs, probably using the mechanism I described, but automatically instead of manually.

@hugobuddel Got it, thanks for such a good demonstration of the bundle mechanism tectonic uses.

(I say that this is better than having multiple default URLs because with multiple options, there are issues of making sure to update both of them at once, etc. Seeing as it's a simple redirection service I think it is probably better to have a single source of truth, even if it is a single point of failure as well.)

@pkgw I agree with you. Multiple default URLs bring in more complicated behavior (failures and version mismatches become harder to track down); they reduce the possibility of failure but are not a root-cause solution.

faywong, Feb 10 '21 02:02

Also, I noticed that the CTAN archive is mirrored across Chinese educational institutions (for example, the Tsinghua University mirror).

If the tectonic bundle could be accepted into the CTAN archive, it would be mirrored and synced on more servers. I mention this only out of concern for the CDN expense of hosting the bundles. :)

faywong, Feb 10 '21 02:02

@faywong The core hosting of the bundle files actually isn't, or at least shouldn't be, an issue at the moment. The issue here is having a reliable service that redirects a globally stable URL (embedded in the source code and distributed executables) to the latest version of those bundle files. This is basically the functionality of URL shortener services, which are of course quite common, but I've had trouble finding something that meets all of the requirements that I'd like to uphold for Tectonic: mainly that the ownership is transferrable, that I can update the redirection target, and that (ideally) the service is free.

(I've also had issues with the hosting in the past, because it can be tricky to find a place that hosts large files and supports HTTP Range requests, but I decided to bite the bullet and start paying to host on an Azure Storage service, and that should solve that problem until/unless the costs grow hugely.)
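
To illustrate what "supports HTTP Range requests" means in practice, here is a small Python sketch that judges a response to a ranged GET. The function name is invented, and it only checks the status code and Content-Range header that HTTP defines for partial responses; it says nothing about how Tectonic itself checks this.

```python
def honours_byte_range(status, headers, first, last):
    """Return True if a response to 'Range: bytes=first-last' is a matching
    206 Partial Content reply, judged by its Content-Range header."""
    normalized = {key.lower(): value for key, value in headers.items()}
    content_range = normalized.get("content-range", "")
    return status == 206 and content_range.startswith(f"bytes {first}-{last}/")
```

A host that ignores the Range header typically answers 200 with the full file, which is exactly the case this returns False for.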

pkgw, Feb 12 '21 12:02

@pkgw It would be ideal if the LaTeX packages (.cls, .sty files, etc.) could be downloaded from a user-specified CTAN mirror.

mirrorinf, Apr 25 '21 13:04