[Feature Request] summarizing file formats
It would be great if a book that is available in multiple file formats showed up only once and offered the different file formats as a dropdown and in the details section. :)
P.S. Thanks for this awesome piece of codeart! It's a pure joy to use!
Thanks for the suggestion! This is a great idea and would definitely improve usability. However, it’s a bit tricky to implement reliably since we currently have no guaranteed way to know if multiple files belong to the same book, unless they have the exact same filename with different extensions.
If you have any thoughts on how we could reliably match such files, feel free to share!
I guess Calibre solved this by sorting the files of the same book into one folder, which BookLore does not do. It's definitely a bit tricky, since there are a lot of books with the same name. As an idea (without hassling with the file system), you could use the ISBN as the match key. That implies that we need to scrape the metadata for the books, but an ISBN is unique, isn't it?
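A minimal sketch of the ISBN idea, assuming each file's ISBN has already been extracted from its metadata (the `group_by_isbn` helper and its input mapping are hypothetical, not part of BookLore). ISBN-10 values are converted to ISBN-13 so that both notations of the same number compare as equal:

```python
from collections import defaultdict
from typing import Optional


def _isbn13_check_digit(first12: str) -> str:
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ...
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw: str) -> Optional[str]:
    """Return the ISBN as 13 digits, or None if it is not a valid ISBN-10/13."""
    s = "".join(c for c in raw.upper() if c.isdigit() or c == "X")
    if len(s) == 13 and s.isdigit():
        return s if _isbn13_check_digit(s[:12]) == s[12] else None
    if len(s) == 10 and s[:9].isdigit():
        # ISBN-10 checksum: weights 10..1, total must be divisible by 11
        total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
        if total % 11 != 0:
            return None
        core = "978" + s[:9]
        return core + _isbn13_check_digit(core)
    return None


def group_by_isbn(isbn_by_file: dict) -> dict:
    """Group file paths by normalized ISBN; files without a valid ISBN are skipped."""
    groups = defaultdict(list)
    for path, raw in isbn_by_file.items():
        isbn = normalize_isbn(raw)
        if isbn is not None:
            groups[isbn].append(path)
    return dict(groups)
```

One caveat with this approach: publishers commonly assign a separate ISBN to each format of a title, so matching on ISBN alone would probably only catch some of the duplicates.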
See here as well: #221.
@adityachandelgit @esclear @werwolfby and all others who are interested in this topic, what do you think of the following approach?
In short
- Different book formats have the same file name
- Additional files relating to a book are stored in a folder with the same name as the book's file(s)
In detail
- Only keep one type of library
- Treat files (.pdf, .epub, .cbz, ...) with the same file name (within the same folder?) as being the same book
- Consider the additional files for a book to be in a folder with the same name as the book's file name - placed in the same folder as the book's file(s)
- Allow multiple books (of different types) to be merged
  - Allow choosing which book the metadata should be taken from
    - As a bonus: allow adjusting the metadata before applying it
  - Give all the files the same file name as the book from which the metadata was taken
- In the "Book Details" view, allow uploading a file (of a different file type) as an "alternative format"
  - Give this file the same name as the file name of the book it is attached to
- In the "Book Details" view, allow uploading files as "additional files"
  - Put the files in a folder with the same name as the book's file name (within the same folder)
  - Keep the file names
    - As a bonus: allow renaming the files in Booklore
- In the "Book Details" view
  - Show all available formats under "File Type"
  - For "Read" and "Quick Send", allow choosing the file type
  - For "Download", allow choosing from the available file types and the additional files (and maybe "all"?)
- When changes are made to the metadata:
  - Write the metadata to the book file(s) (if enabled in the settings)
  - Rename (and move, if necessary) the book file(s) and the folder with the additional files (if enabled in the settings)
- When a book is moved to another library, all files as well as the folder for the additional files are moved
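The two naming rules above could be sketched roughly as follows. This is only an illustration under assumptions: `BOOK_EXTENSIONS`, `BookGroup`, and `group_library` are made-up names, and the function works on plain path strings rather than scanning a real library.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Assumed set of extensions that count as "book formats"
BOOK_EXTENSIONS = {".pdf", ".epub", ".cbz"}


@dataclass
class BookGroup:
    formats: list = field(default_factory=list)
    additional_files: list = field(default_factory=list)


def group_library(paths):
    """Group book files by folder + file name (without extension).

    Files inside a folder that is named like a book are attached to
    that book as additional files.
    """
    candidates = [PurePosixPath(p) for p in paths]
    groups = defaultdict(BookGroup)

    # Rule 1: same folder and same file name means same book
    for p in candidates:
        if p.suffix.lower() in BOOK_EXTENSIONS:
            groups[p.with_suffix("")].formats.append(p)

    # Rule 2: non-book files below "<folder>/<book name>/" belong to that book
    for p in candidates:
        if p.suffix.lower() in BOOK_EXTENSIONS:
            continue
        for parent in p.parents:
            if parent in groups:
                groups[parent].additional_files.append(p)
                break

    return dict(groups)
```

With the paths `lib/Dune.epub`, `lib/Dune.pdf`, and `lib/Dune/map.png`, this yields a single book keyed by `lib/Dune` with two formats and one additional file.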
Drawbacks of this approach
- The two rules mentioned above are enforced for all users (although minimal and completely acceptable IMHO 😝):
  - If you want to put each book into its own folder (as done in calibre), you will end up with an additional folder for the extra files
- A book cannot have multiple "alternative formats" of the same file format - the only reason I can think of for wanting this is to have high-resolution and low-resolution versions of the book
Edit: Restructured my lengthy discourse a little. Edit 2: Added an additional point above
My proposal in #1145 focuses on doing all of this in the database using (and slightly adjusting) the tables that are already there:
- Currently, the main file for a book is stored in the `books` table, in the `file_name` column.
- Additional files are stored in the `book_additional_file` table.
My concrete proposal is this:
- Remove `file_name`, `file_sub_path`, and `book_type` from the `book` table.
- Rename `book_additional_file` to `book_file`, as well as `additional_file_type` to `file_type`.
- Remove the `file_type` of `ALTERNATE_FORMAT`, and allow other formats as well. In the end this may just be an "is a book" flag, that's true for ebooks/pdfs and false for any supplementary files.
- Fix the parts of the front and back end that relied on the main book file (in order of the difficulty I expect this to be).
  - Simple things like the "File Type" field on the "book details" page
  - Change the behavior of the "Read" and "Download" buttons to allow choosing the file
    - This might require small adjustments to the API endpoints that provide the files, in order to be able to select the concrete format to deliver.
  - Change the upload form to be able to handle additional formats (focusing on the one on the book page, but being able to group uploads for processing would also be good)
  - Modify templates to operate on the `book_file` table
  - A way to merge multiple (formats of the same) book(s) into one
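To make the table changes more concrete, here is a rough sketch of the data migration using an in-memory SQLite database. SQLite is used only for illustration. The table and column names come from the proposal, but the remaining columns (`book_id`, `title`), the `is_book` column name, and deriving `file_type` from the file extension are assumptions, not the actual BookLore schema.

```python
import sqlite3
from pathlib import PurePosixPath

# Simplified stand-in for the current layout (columns partly assumed)
OLD_SCHEMA = """
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title TEXT,
    file_name TEXT,
    file_sub_path TEXT,
    book_type TEXT
);
CREATE TABLE book_additional_file (
    id INTEGER PRIMARY KEY,
    book_id INTEGER NOT NULL REFERENCES book(id),
    file_name TEXT,
    file_sub_path TEXT,
    additional_file_type TEXT
);
"""

NEW_TABLE = """
CREATE TABLE book_file (
    id INTEGER PRIMARY KEY,
    book_id INTEGER NOT NULL REFERENCES book(id),
    file_name TEXT NOT NULL,
    file_sub_path TEXT,
    file_type TEXT,            -- e.g. EPUB or PDF; NULL for supplementary files
    is_book INTEGER NOT NULL   -- the "is a book" flag
);
"""

INSERT_FILE = (
    "INSERT INTO book_file (book_id, file_name, file_sub_path, file_type, is_book) "
    "VALUES (?, ?, ?, ?, ?)"
)


def migrate(conn):
    """Copy main and additional files into one book_file table."""
    conn.execute(NEW_TABLE)
    # The former main file of each book becomes an ordinary book_file row
    conn.execute(
        "INSERT INTO book_file (book_id, file_name, file_sub_path, file_type, is_book) "
        "SELECT id, file_name, file_sub_path, book_type, 1 FROM book"
    )
    rows = conn.execute(
        "SELECT book_id, file_name, file_sub_path, additional_file_type "
        "FROM book_additional_file ORDER BY id"
    ).fetchall()
    for book_id, name, sub_path, old_type in rows:
        is_book = old_type == "ALTERNATE_FORMAT"
        # Assumption: the concrete format is taken from the file extension
        file_type = PurePosixPath(name).suffix.lstrip(".").upper() if is_book else None
        conn.execute(INSERT_FILE, (book_id, name, sub_path, file_type, int(is_book)))
    conn.execute("DROP TABLE book_additional_file")
    # Dropping file_name, file_sub_path and book_type from book is omitted here
    conn.commit()
```

After the migration, every file of a book is a row in `book_file`, and the "Read"/"Download" code paths would query that table instead of the columns on `book`.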
Note that I didn't make any assumptions or enforce any restrictions as to what files are called.
I think that would best be left to File Naming Patterns at the discretion of the user.
Personally, I'm already using {authors}/<{series}/><{seriesIndex} - >{title}/{title}, which puts multiple files of the same book into the same directory, and with the same name.
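As an illustration of why such a pattern places all formats of a book side by side, here is a small resolver sketch. It assumes that a `<...>` block is optional and dropped when a placeholder inside it has no value; `resolve_pattern` is a hypothetical helper, not BookLore's actual implementation.

```python
import re

OPTIONAL_BLOCK = re.compile(r"<([^<>]*)>")
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def resolve_pattern(pattern, metadata, extension):
    """Expand a naming pattern into a relative path for one file."""

    def fill(text):
        # Replace each {name} with its metadata value (empty if missing)
        return PLACEHOLDER.sub(lambda m: str(metadata.get(m.group(1), "")), text)

    def expand_optional(match):
        block = match.group(1)
        names = PLACEHOLDER.findall(block)
        # Keep the block only if every placeholder in it has a value
        return fill(block) if all(metadata.get(n) for n in names) else ""

    path = OPTIONAL_BLOCK.sub(expand_optional, pattern)
    return fill(path) + extension
```

For metadata with authors "Frank Herbert", series "Dune", series index "1", and title "Dune", the `.epub` and `.pdf` files both resolve to `Frank Herbert/Dune/1 - Dune/Dune` plus their extension, so they end up in the same directory with the same base name.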
Ideally, there would be some sort of duplicate detection, especially for an initial import, but I don't think that this would be required to land this feature, which would already be a major improvement without duplicate detection. In fact, if your books are already properly tagged, you should be able to sort them by title and then have different formats of the same book next to each other, ready to merge them.
I think there are some details left to be resolved, e.g. how one should deal with OPDS. A first iteration here would be to just serve all of the files; an improvement on that might be to define a ranking of formats and serve the best format available.
Technically, we could also ask whether we'd want to store separate reading progress for different formats of the same book, but I don't see how this would be sensible.
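The "ranking of formats" idea could look something like this. The order in `FORMAT_RANKING` is an arbitrary assumption and would probably have to be configurable.

```python
# Assumed preference order, best first
FORMAT_RANKING = ["epub", "pdf", "cbz"]


def rank_formats(available, ranking=FORMAT_RANKING):
    """Sort formats by preference; unknown formats go last, keeping their order."""
    position = {fmt: i for i, fmt in enumerate(ranking)}
    return sorted(available, key=lambda f: position.get(f.lower(), len(ranking)))


def best_format(available, ranking=FORMAT_RANKING):
    """Return the most preferred available format, or None if there is none."""
    ranked = rank_formats(available, ranking)
    return ranked[0] if ranked else None
```

A book available as PDF and EPUB would then be served as EPUB, while a client that can handle several formats could still be offered the full ranked list.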
@esclear Where would you store supplementary files when uploading them via booklore?
> I think that would best be left to File Naming Patterns at the discretion of the user.

Yes, that's how I see it as well. The "File Naming Patterns" will lead to different book formats having the same file name and location anyway.
> how one should deal with OPDS

I'll have a look at how this is done in Calibre, CWA etc.
OPDS by CWA returns the following for a book with two formats (see right at the bottom):
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/terms/" xmlns:dcterms="http://purl.org/dc/terms/">
<icon>/static/favicon.ico</icon>
<id>urn:uuid:2853dacf-ed79-42f5-8e8a-a7bb3d1ae6a2</id>
<updated>2025-11-09T15:12:24+00:00</updated>
<link rel="self"
href="/opds/author/455?"
type="application/atom+xml;profile=opds-catalog;type=feed;kind=navigation"/>
<link rel="start"
href="/opds"
type="application/atom+xml;profile=opds-catalog;type=feed;kind=navigation"/>
<link rel="up"
href="/opds"
type="application/atom+xml;profile=opds-catalog;type=feed;kind=navigation"/>
<link rel="search"
href="/opds/osd"
type="application/opensearchdescription+xml"/>
<link type="application/atom+xml" rel="search" title="Search" href="/opds/search/{searchTerms}" />
<title>Calibre-Web</title>
<author>
<name>Calibre-Web</name>
<uri>https://github.com/crocodilestick/calibre-web-automated</uri>
</author>
<entry>
<title>The TCP/IP Guide</title>
<id>urn:uuid:1d0ed31a-da00-4ad0-977c-756abcc95975</id>
<updated>2022-08-05T18:39:50+00:00</updated>
<author>
<name>Charles M. Kozierok</name>
</author>
<publisher>
<name>No Starch Press</name>
</publisher>
<published>2005-10-03T22:00:00+00:00</published>
<dcterms:language>eng</dcterms:language>
<category scheme="http://www.bisg.org/standards/bisac_subject/index.html"
term="_Sachbücher"
label="_Sachbücher"/>
<category scheme="http://www.bisg.org/standards/bisac_subject/index.html"
term="Informatik"
label="Informatik"/>
<summary>The TCP/IP Guide is both an encyclopedic and comprehensible guide to the TCP/IP protocol suite. Its personal, easy-going writing style lets anyone understand the dozens of protocols and technologies that run the Internet, with full coverage of PPP, ARP, IP, IPv6, IP NAT, IPSec, Mobile IP, ICMP, RIP, BGP, TCP, UDP, DNS, DHCP, SNMP, FTP, SMTP, NNTP, HTTP, Telnet and much more.</summary>
<link type="image/jpeg" href="/opds/cover/542" rel="http://opds-spec.org/image"/>
<link type="image/jpeg" href="/opds/cover/542" rel="http://opds-spec.org/image/thumbnail"/>
<link rel="http://opds-spec.org/acquisition" href="/opds/download/542/epub/"
length="16088827" mtime="2022-08-05T18:39:50+00:00" type="application/epub+zip"/>
<link rel="http://opds-spec.org/acquisition" href="/opds/download/542/pdf/"
length="56438017" mtime="2022-08-05T18:39:50+00:00" type="application/pdf"/>
</entry>
</feed>
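Following the feed above, an entry with one acquisition link per format could be generated roughly like this. The `rel` value and MIME types are taken from the CWA output; the `build_entry` helper and the `/opds/download/...` URL layout are only assumptions that mirror that output, not an existing BookLore endpoint.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ACQUISITION_REL = "http://opds-spec.org/acquisition"
MIME_TYPES = {
    "epub": "application/epub+zip",
    "pdf": "application/pdf",
}


def build_entry(title, book_id, formats):
    """Build an Atom <entry> with one acquisition link per available format."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    for fmt in formats:
        ET.SubElement(
            entry,
            f"{{{ATOM_NS}}}link",
            {
                "rel": ACQUISITION_REL,
                # Hypothetical URL layout, modeled on the feed above
                "href": f"/opds/download/{book_id}/{fmt}/",
                "type": MIME_TYPES[fmt],
            },
        )
    return ET.tostring(entry, encoding="unicode")
```

Clients that understand multiple acquisition links can then let the user pick a format, which matches the "just serve all of the files" first iteration.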
> Where would you store supplementary files when uploading them via booklore?

In the same table. There is/needs to be a column for the file format / book type anyways. If I were to design this, then I'd have NULL be a supplementary file, probably.
I began implementing all of this, starting at the database (migrations) and the backend code. However, I decided to migrate my library to another tool, so I'll not pursue this further.
> In the same table. There is/needs to be a column for the file format / book type anyways. If I were to design this, then I'd have NULL be a supplementary file, probably.

I mean, where do you store the actual file(s) on the file system? Books are stored according to the "File Naming Pattern", but where should auxiliary files be stored?
> I began implementing all of this, starting at the database (migrations) and the backend code. However, I decided to migrate my library to another tool, so I'll not pursue this further.

What did you end up using?