feature request: minimum chapter length

Open gazpachoking opened this issue 4 years ago • 6 comments

I have had a few books with a large number of very short chapters (a few seconds each; for example, the narrator saying 'chapter 1' is its own chapter). My watch becomes unusable for playing the book when so many files are created, so I had to resort to specifying a fixed chapter length to cut the book into more reasonable chunks. A --min-chapter-length option would be great, so that I could come up with one setting to run on every book that splits it into reasonably sized files while still avoiding chopping chapters mid-sentence/word.

gazpachoking, Aug 23 '21 17:08

m4b-tool tries to keep tracks as chapters by default. So if there are lots of very small tracks, it might lead to this issue.

Actually, it is possible to set chapters within a time window, but it still tries to keep track chapters. The reason is that in nearly all cases the original track chapters are there for a reason. Often they are shorter than my personal minimum limit (300s / 5 mins), and enforcing a minimum would shift later chapter marks to positions where they don't belong.

I experimented with that, but I think it should be possible to provide an extra option like --ignore-track-markers to prevent preferring track-chapters over minimum length.

So what you could try is to export the chapters after the merge:

m4b-tool meta my-book.m4b --export-chapters

remove the unwanted chapters from .chapters.txt manually and then

m4b-tool meta my-book.m4b --import-chapters

Feature request:

  • [ ] add new option --ignore-track-markers

sandreas, Aug 24 '21 04:08

> Actually, it is possible to set chapters within a time window, but it still tries to keep track chapters.

This is exactly my desired behavior. I want to set --min-chapter-length and --max-chapter-length. The issue is that there is currently no --min-chapter-length option.

> I experimented with that, but I think it should be possible to provide an extra option like --ignore-track-markers to prevent preferring track-chapters over minimum length.

Are you saying it would ignore the built-in chapters entirely with the --ignore-track-markers flag? That would work, and would be a bit better for me than just fixed-length chapters, but I think it would be nicer if it worked more like --max-chapter-length: use the built-in chapter if possible, but when a chapter is shorter than the minimum length you defined, ignore only the current chapter marker and look for the next one.

gazpachoking, Aug 24 '21 04:08

A desired chapter length window is already possible by providing two values to --max-chapter-length, e.g. --max-chapter-length=300,900, where 300 is the desired length and 900 is the max length. The desired length is the length you'd prefer if there is NO track chapter before. So if there were an --ignore-track-chapters option, the desired length would automatically become the minimum length, and chapters would be merged as long as the minimum length is not met.

I think this would be the exact behaviour you've been asking for.
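
The merge rule described above (fold a chapter into its predecessor until the minimum length is met) can be sketched roughly as follows. This is only an illustration under stated assumptions, not m4b-tool's code: the function name `merge_short_chapters` is hypothetical, and chapter starts are passed as whole seconds in ascending order.

```bash
#!/bin/bash
# Hypothetical helper, not part of m4b-tool: keep a chapter marker only if it
# starts at least MIN seconds after the previously kept marker.
# Usage: merge_short_chapters MIN START [START...]
merge_short_chapters() {
  local min=$1; shift
  local last="" kept=() start
  for start in "$@"; do
    # The first marker is always kept; later ones must be far enough
    # away from the last kept marker.
    if [ -z "$last" ] || [ $((start - last)) -ge "$min" ]; then
      kept+=("$start")
      last=$start
    fi
  done
  echo "${kept[*]}"
}

# Markers at 0s, 5s, 310s, 400s and 900s with a 300s minimum
merge_short_chapters 300 0 5 310 400 900   # prints: 0 310 900
```

Note that a rule like this only looks at start positions, so the final chapter can still end up shorter than the minimum unless the total duration is also taken into account.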

sandreas, Aug 24 '21 04:08

Ahh, I think you are right, that is what I'm looking for. I guess the terminology of ignore-track-chapters is confusing to me, because with your suggested solution they would still be used when the chapter length falls between the two numbers supplied to max-chapter-length?

gazpachoking, Aug 24 '21 04:08

> I guess the terminology of ignore-track-chapters is confusing

Yes, it's just a working title. Maybe I'll call it --minimum-chapter-length and merge the option values internally. This would be more reasonable and easier to understand.

sandreas, Aug 24 '21 04:08

Sounds perfect. 👍 I'm happy with any name of options as long as I can figure out how to combine them to get what I'm after. Hoping to set up some automation to whip all my books into shape so they are playable by my watch, but still easy to sync back and forth to my other players. Thanks for the great app!

gazpachoking, Aug 24 '21 04:08

@gazpachoking Hey, thanks for sponsoring me. So did you solve the --minimum-chapter-length issue with the provided information?

Otherwise maybe I can do something here to support you, or maybe a little code extension is required.

NOTE: I already implemented something like this for myself, but I noticed that if the track chapter length is ignored, the REAL chapter marks often get skipped in favour of a longer AUTO chapter. This solution was not very satisfactory, so I discarded it. The only way I could solve this was to export chapters.txt, walk through the chapters manually, listen to the marker positions, write the REAL chapter markers to a chapters.txt and then rerun m4b-tool merge --max-chapter-length=300,900 ....

Since this is a lot of work, it's not a real solution to your problem - but unfortunately there is no possibility to "auto-detect" the REAL chapters. I can guess it, if I have the positions from an ebook or something, but full automation is very very hard to do.

But if you would like to have it, I could implement a --minimum-chapter-length option pretty easily, since I already had it.

sandreas, Mar 12 '23 05:03

@sandreas No problem. Just a small thank you for a really useful piece of software.

I have a fairly good automated workflow now which I use no matter the source of my book. It goes sorta like this:

m4b-tool merge --max-chapter-length
m4b-tool meta --export-chapters
# run an external script that removes short chapters from the chapters.txt if there are an excessive number of chapters in the book
# (garmin watch book player gets really slow when you load a book with a whole crapton of chapters)
m4b-tool meta --import-chapters
m4b-tool split --max-chapter-length  # for some reason the max-chapter-length doesn't always work if it's only on the initial merge or this split. Can't remember the exact configuration that's a problem, so I just call it both times.

The only problem I've had is a couple of books that end up as one big file after this workflow despite the max-chapter-length; I'm not totally certain what causes that. I haven't looked into it deeply enough to make a bug report. It works on almost everything though, so I'm not too concerned.

Thanks again!

gazpachoking, Mar 12 '23 18:03

> It goes sorta like this:

Looks interesting. The export and import could probably be replaced with tone and its custom taggers feature. Would you mind showing me the script that removes the short chapters? Just interested :-)

sandreas, Mar 12 '23 20:03

Yeah, here's my remove short chapters script. Actually told ChatGPT to write it at first because bash sucks, then kept tweaking it till it actually did what I wanted. 😝

#!/bin/bash

# Check if a file name was passed as the first argument
if [ $# -lt 1 ]; then
  echo "Error: Missing file name argument."
  echo "Usage: $0 <file_name>"
  exit 1
fi

# Store the file name passed as the first argument
file_name=$1

# Get the number of lines in the file
line_count=$(wc -l < "$file_name")

# Abort the script if the line count is less than 75
if [ "$line_count" -lt 75 ]; then
  echo "Not editing chapters. File contains less than 75 lines ($line_count)."
  exit 0
fi

# Create a temporary file to store the filtered chapters
temp_file=$(mktemp)

# Initialize the last position to negative so the first line is always kept (00:00:00)
last_position=-100

# Initialize the line number as 1
line_number=1

# Loop through each line in the specified file
while IFS= read -r line || [ -n "$line" ]; do

  if [ "${line:0:1}" == "#" ]; then
    echo "$line" >> "$temp_file"
    continue
  fi
  # Extract the chapter position from the line
  position=$(echo "$line" | awk '{print $1}')

  # Parse the hours, minutes, and seconds from the position string
  hours=$(echo $position | awk -F: '{print $1}')
  minutes=$(echo $position | awk -F: '{print $2}')
  seconds=$(echo $position | awk -F: '{print $3}')

  # Convert the hours, minutes, and seconds to whole minutes
  # (bc truncates the division with its default scale of 0)
  total_minutes=$(echo "$hours * 60 + $minutes + $seconds / 60" | bc)

  # Calculate the duration of the chapter by subtracting the last position
  duration=$(echo "$total_minutes - $last_position" | bc)

  # Check if the chapter duration is greater than or equal to 10 minutes
  if [ "$duration" -ge 10 ]; then
    # Write the position and a running chapter number to the temporary file
    # (the original chapter title is replaced by that number)
    echo "$position $line_number" >> "$temp_file"

    # Update the last position to the current position
    last_position=$total_minutes

    line_number=$((line_number + 1))
  fi
done < "$file_name"

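# Replace the original file with the temporary file only if chapters were removed
new_line_count=$(wc -l < "$temp_file")
if [ "$new_line_count" -lt "$line_count" ]; then
  echo "Removing short chapters to limit chapter count."
  mv "$temp_file" "$file_name"
else
  # Nothing was removed, so discard the temporary file
  rm -f "$temp_file"
fi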
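
For comparison, here is a sketch of the same kind of filter written with awk. It compares positions in seconds rather than whole minutes and keeps each chapter's original title instead of renumbering. It is an illustration only: the function name `filter_chapters` and the 600-second default are hypothetical, and it assumes each chapter line looks like `HH:MM:SS.mmm Title`.

```bash
#!/bin/bash
# Hypothetical variant of the script above: reads a chapters file on stdin and
# prints the filtered result on stdout. The argument is the minimum gap in seconds.
filter_chapters() {
  local min=${1:-600}
  awk -v min="$min" '
    /^#/ || NF == 0 { print; next }       # pass comments and blank lines through
    {
      split($1, t, ":")                   # t[1]=hours, t[2]=minutes, t[3]=seconds
      pos = t[1] * 3600 + t[2] * 60 + t[3]
      if (!seen || pos - last >= min) {   # always keep the first chapter
        print
        last = pos
        seen = 1
      }
    }'
}

filter_chapters 600 <<'EOF'
00:00:00.000 Intro
00:00:05.000 Chapter 1
00:12:30.000 Chapter 2
00:15:00.000 Chapter 3
00:25:00.000 Chapter 4
EOF
```

With a 600-second minimum, the demo keeps the Intro, Chapter 2 and Chapter 4 lines, because each of those starts at least ten minutes after the previously kept marker.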

gazpachoking, Mar 13 '23 02:03

@gazpachoking Thank you for providing this. Another user reported the same issue, so it seems that you are not the only person finding this feature useful :-)

Are you interested in using tone for your chapter removal? If so, I could help you develop a custom JavaScript tagger for tone, which everyone would benefit from, and I wouldn't need to create an expensive feature for m4b-tool...

I would go for slightly more sophisticated logic though, one that takes chapter names into account. This way, you could remove repetitive chapters that are too short, but keep the REAL chapter positions.

Example:

00:00 Intro
00:05 Vanishing window (1/3)
01:18 Vanishing window (2/3)
04:22 Vanishing window (3/3)
06:48 Yellow car (1/2)
08:24 Yellow car (2/2)

Would result in:

00:00 Intro
00:05 Vanishing window (1/3)
06:48 Yellow car (1/2)

Maybe I could also remove the unnecessary parentheses if there is only one chapter with a similar name.
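
The name-based idea in the example above could be sketched like this. It is only an illustration, not tone's tagger API: the function name `collapse_named_chapters` is hypothetical, and it assumes that the parts of one chapter share a title that differs only in a trailing `(n/m)` counter. It does not apply any length check.

```bash
#!/bin/bash
# Hypothetical sketch: keep only the first chapter of each run of titles that
# differ only in a trailing "(n/m)" counter. Reads "POSITION Title" lines on stdin.
collapse_named_chapters() {
  awk '
    {
      title = $0
      sub(/^[^ ]+ +/, "", title)                  # drop the leading position
      sub(/ *\([0-9]+\/[0-9]+\)$/, "", title)     # drop a trailing "(n/m)"
      if (NR == 1 || title != previous) print     # first line of a new run
      previous = title
    }'
}

collapse_named_chapters <<'EOF'
00:00 Intro
00:05 Vanishing window (1/3)
01:18 Vanishing window (2/3)
04:22 Vanishing window (3/3)
06:48 Yellow car (1/2)
08:24 Yellow car (2/2)
EOF
```

Run on the example list, this prints the three lines shown under "Would result in".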

sandreas, Mar 28 '23 13:03

Sure, using tone sounds like it could simplify the workflow a bit: I wouldn't have to export->edit->import the chapters, it could all be done in one step. In my case it's not repetitive chapter names I'm worried about, just reducing the overall number of chapters in certain books that have too many for my player. I'll have a look at the scripting in tone.

gazpachoking, Mar 29 '23 16:03

@gazpachoking As a start you could take a look at this script: https://github.com/sandreas/tone/issues/40

sandreas, Mar 29 '23 16:03

Fixed in the latest code.

Minimum chapter length in seconds. It's also possible to provide the indexes of chapters to keep even though they are shorter, e.g. --min-chapter-length=2.5[0,1,-1] to set the minimum chapter length to 2.5 seconds and keep the first, second and last chapter even when they are shorter.
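
To illustrate the value syntax, here is a sketch that splits such a value into the threshold and the protected chapter indexes, resolving negative indexes from the end of the chapter list. It is not the tool's own parser: the function name `parse_min_chapter_length` and its output format are hypothetical, and zero-based indexes are assumed from the "first, second and last" wording.

```bash
#!/bin/bash
# Hypothetical parser for values like "2.5[0,1,-1]".
# Usage: parse_min_chapter_length VALUE CHAPTER_COUNT
# Prints the threshold on the first line and the resolved indexes on the second.
parse_min_chapter_length() {
  local value=$1 count=$2
  local threshold=${value%%\[*}      # text before the first "["
  local list="" resolved=() index
  if [[ $value == *\[*\] ]]; then
    list=${value#*\[}                # text after the first "["
    list=${list%\]}                  # without the closing "]"
  fi
  local IFS=,
  for index in $list; do
    # A negative index counts from the end: -1 is the last chapter
    if [ "$index" -lt 0 ]; then index=$((count + index)); fi
    resolved+=("$index")
  done
  echo "$threshold"
  echo "${resolved[*]}"
}

# For a book with 10 chapters this prints "2.5" and then "0,1,9"
parse_min_chapter_length "2.5[0,1,-1]" 10
```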

sandreas, Nov 06 '23 18:11