
[Question] What is needed out of biber?

Open saona-raimundo opened this issue 3 years ago • 6 comments

Following #35, in a post from 2017, it was mentioned that, to include biber support, it would be nice to have biber as a Rust crate.

Currently, biber is written in Perl and is the default tool for handling bibliographies with Unicode support in the TeX world.

Since biber is a big project with more than 4000 commits as of today, I wanted to ask: What is actually needed from biber to achieve the goals of tectonic with both .tex and .bib files?

(I am fluent in Rust but a novice in TeX internals.)

saona-raimundo avatar Feb 26 '22 19:02 saona-raimundo

I’m not quite sure how everything works either, but my naive assumption is that not much is needed. I think

  • BibLaTeX is a file format that’s an extension of BibTeX-the-file-format with more entry types and Unicode support, and
  • the program Biber is a replacement for the BibTeX-the-program that supports BibLaTeX

So if tectonic comes with its own bibtex replacement, it would probably “just” have to support the BibLaTeX format.

But as I said: I'm very unsure about anything here. Maybe just tag and ask some biber devs.

flying-sheep avatar Mar 18 '22 09:03 flying-sheep

I'm not 100% sure that I understand what you're asking, so apologies if I'm not quite answering the right question. But basically, Tectonic needs the output files that biber creates when it's invoked to process a bibliography. The problem (for us) is that the output of biber is much more complex and configurable than the output of plain bibtex. I'm not fully familiar with how it works, but it has many options for sorting and filtering, and it creates a variety of output files that are then tightly integrated with the behavior of the biblatex TeX package.

The most basic invocations of biber might only do very simple things not far from what bibtex can do, but much more complicated uses are possible. You could imagine that maybe it would be feasible to provide a sort of biber-lite that can be self-contained, at the cost of not providing any of the fancy features, but that might just make things confusing for people without actually meeting many users' needs.

Do you feel like I answered your question helpfully here?

pkgw avatar Mar 22 '22 03:03 pkgw

Thank you! You certainly answered my question! :)

Looking at the biber documentation, it was not clear to me whether a biber-lite makes sense at all. If you could ask for a minimal bibliography processor (like "biber-lite"), what would it need to do so that the goals of tectonic are met?

For example, the same as bibtex but with Unicode support? (This is the naive me asking, without being fully aware of how much work that entails.)

saona-raimundo avatar Mar 22 '22 03:03 saona-raimundo

I'm not familiar enough with biber to say, but I'm pretty sure that its outputs are pretty different, so the implementation would presumably have to be very different as well. One of the big question marks would be the styling language: the BibTeX style format is really its own esoteric programming language. I don't know how biber's works at all, but there has to be some fancy kind of templating in there, and that kind of thing is never trivial to implement.

I think I read once that there was a variant of bibtex called bibtex8 that handled Unicode inputs. For a long time I've been meaning to check whether it would make sense and/or be feasible to update Tectonic's built-in bibtex to that instead.

pkgw avatar Mar 22 '22 04:03 pkgw

If anybody is interested and wants to join, this is my attempt to oxidize biber: https://github.com/burrbull/biber/tree/rust cc @plk
It's only a concept so far, though, and nothing works yet.

burrbull avatar Jul 28 '22 19:07 burrbull

I'm interested in this. I like Rust, but when I naïvely looked into a Rust port, I couldn't convince myself that the Unicode CLDR support was there (but then I believe I saw that there is a full ICU port?). There is also the issue that the Perl version uses a binary library, "btparse", which does the rapid low-level parsing of .bib files. That would have to be the main initial focus of any port. "btparse" is ancient C and isn't ideal at all, but we have to have 99% bibtex compatibility in the .bib parsing. When I last looked, I couldn't find any reasonable-looking bibtex parsers for Rust.
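
To give a feel for what the low-level .bib parsing step involves, here is a minimal sketch in Rust using only the standard library. It is not how btparse or biber work, and it is nowhere near 99% bibtex compatibility: the `Entry` struct and the `parse_entry`, `read_braced`, and `read_quoted` functions are hypothetical names made up for this illustration. It handles one `@type{key, field = value, ...}` entry with brace-delimited, quote-delimited, or bare values, and returns `None` for everything else (string macros, `#` concatenation, `@comment`, `@preamble`, parenthesised entries, entries without fields).

```rust
use std::collections::BTreeMap;

/// One parsed entry (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct Entry {
    entry_type: String,
    key: String,
    fields: BTreeMap<String, String>,
}

/// Read a `{...}` value; the opening brace has already been consumed.
/// Nested braces are kept verbatim in the returned text.
fn read_braced(chars: &mut impl Iterator<Item = char>) -> Option<String> {
    let mut depth = 1;
    let mut out = String::new();
    for c in chars {
        match c {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(out);
                }
            }
            _ => {}
        }
        out.push(c);
    }
    None // unbalanced braces
}

/// Read a `"..."` value; the opening quote has already been consumed.
/// A quote inside braces does not terminate the value.
fn read_quoted(chars: &mut impl Iterator<Item = char>) -> Option<String> {
    let mut depth = 0;
    let mut out = String::new();
    for c in chars {
        match c {
            '"' if depth == 0 => return Some(out),
            '{' => depth += 1,
            '}' => depth -= 1,
            _ => {}
        }
        out.push(c);
    }
    None // missing closing quote
}

/// Parse a single entry; returns None on anything this sketch does not handle.
fn parse_entry(src: &str) -> Option<Entry> {
    let rest = src.trim().strip_prefix('@')?;
    let open = rest.find('{')?;
    let entry_type = rest[..open].trim().to_lowercase();
    let body = rest[open + 1..].strip_suffix('}')?;
    let comma = body.find(',')?;
    let key = body[..comma].trim().to_string();

    let mut fields = BTreeMap::new();
    let mut chars = body[comma + 1..].chars().peekable();
    loop {
        // Skip whitespace and separating commas between fields.
        while matches!(chars.peek(), Some(c) if c.is_whitespace() || *c == ',') {
            chars.next();
        }
        if chars.peek().is_none() {
            break;
        }
        // Field name: everything up to '='.
        let mut name = String::new();
        loop {
            match chars.next()? {
                '=' => break,
                c => name.push(c),
            }
        }
        while matches!(chars.peek(), Some(c) if c.is_whitespace()) {
            chars.next();
        }
        let value = match chars.next()? {
            '{' => read_braced(&mut chars)?,
            '"' => read_quoted(&mut chars)?,
            first => {
                // Bare value such as a number: read up to the next comma.
                let mut v = String::from(first);
                while let Some(&c) = chars.peek() {
                    if c == ',' {
                        break;
                    }
                    v.push(c);
                    chars.next();
                }
                v.trim().to_string()
            }
        };
        fields.insert(name.trim().to_lowercase(), value);
    }
    Some(Entry { entry_type, key, fields })
}

fn main() {
    let src = r#"@Article{knuth84,
  author = {Donald E. Knuth},
  title  = "Literate {P}rogramming",
  year   = 1984,
}"#;
    println!("{:#?}", parse_entry(src));
}
```

Even this toy version has to track brace depth inside quoted values, which hints at why full compatibility is the hard part.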

plk avatar Jul 28 '22 19:07 plk

I would love for someone to write a high-quality Rust bibtex parser! I have some personal bibliography automation tools written in Python that I'd love to oxidize, and I can think of some features that I'd like to have in Tectonic that would require bibtex parsing as well.

As for ICU, google/rust_icu is the most careful-looking Rustification that I'm aware of. My understanding is that basically the ICU library represents many many many thousands of hours of careful engineering so, for the foreseeable future, the only responsible thing to do is wrap it rather than port/reimplement it. Which is a shame, since it's a pain to depend on ICU as an external library, but there we are.
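
As a small illustration of why collation support matters for a bibliography tool, the sketch below sorts author names with the Rust standard library alone. Comparing `str` values in Rust is a plain byte-order comparison of the UTF-8 encoding, so an accented capital sorts after every ASCII letter; a CLDR-aware collator, such as the one ICU provides, would place "Émile" next to the other E names. The function name `byte_order_sorted` is made up for this example, and the snippet says nothing about how bibtex or biber actually sort.

```rust
/// Sort names by raw byte order, which is what `Ord` for `str` does.
/// (Hypothetical helper for illustration only.)
fn byte_order_sorted<'a>(names: &[&'a str]) -> Vec<&'a str> {
    let mut sorted = names.to_vec();
    sorted.sort();
    sorted
}

fn main() {
    let authors = ["Zola", "Émile", "Adams"];
    // "É" is encoded as the bytes 0xC3 0x89, which compare greater than
    // any ASCII byte, so "Émile" ends up after "Zola".
    println!("{:?}", byte_order_sorted(&authors));
}
```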

pkgw avatar Sep 10 '22 20:09 pkgw

I don't see a problem with relying on ICU - it's basically an industry standard these days and many applications that need Unicode just link to ICU.

plk avatar Sep 10 '22 20:09 plk

Yeah, definitely, it's just a perennial hassle when I try to produce static builds or builds for unusual architectures.

pkgw avatar Sep 10 '22 20:09 pkgw

"I would love for someone to write a high-quality Rust bibtex parser!"

Have you seen https://crates.io/crates/biblatex?

burrbull avatar Sep 11 '22 04:09 burrbull

@burrbull No! Good to know about that.

Oh, I'm going to convert this into a Discussion, now that we have that option.

pkgw avatar Sep 11 '22 13:09 pkgw