podcast-dl
--episode-template regex options
I'd love to see an option that lets you use regular expressions to parse out elements of the episode title, and then create an episode-template based on the selected regex capture-group.
This idea comes out of the need to handle certain podcast feeds, such as the Command Control Power podcast, which appears to limit access to only the 100 most recent episodes. When I specify an --episode-template of {{episode_num}}-{{title}}_{{release_date}}, I end up with filenames that look like this:
Original episode title:
Best Of CCP - 467: Interview with Brian Best from BestMacs and Mac-MSP Gruntwork
Downloaded file name:
_0100-Best Of CCP - 467_ Interview with Brian Best from BestMacs and Mac-MSP Gruntwork_20240528_.mp3
So in this case the episode_num is actually incorrect because while this episode may be the 100th episode listed in the feed, it's definitely not the actual episode number itself!
To handle edge-case feeds like this (and also for more granular control over file naming), it would be nice if you could use a regular expression to parse the current episode title, and then map the regex capture groups to the actual podcast-dl episode_template keywords.
For example, if I could "pre-filter" the episode title using the regex \d+(?=:.*), I could extract the actual episode number from the episode title (the number that appears before the first colon character in the title name), and then use a special keyword like episode_num_1 to tell the template to use the value from regex capture group \1 as the episode number.
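As a rough illustration of that pre-filter idea, the sketch below applies the same lookahead regex to the example title in plain JavaScript. The helper name extractEpisodeNumber is hypothetical and is not part of podcast-dl. Note that this regex has no capture group, so the value comes from the whole match.

```javascript
// Hypothetical helper, not part of podcast-dl: returns the first run of
// digits in the title that is directly followed by a colon.
function extractEpisodeNumber(title) {
  // \d+ matches a run of digits; (?=:.*) requires a colon right after them
  // without making the colon part of the match.
  const match = title.match(/\d+(?=:.*)/);
  return match ? match[0] : null;
}

const title =
  "Best Of CCP - 467: Interview with Brian Best from BestMacs and Mac-MSP Gruntwork";
console.log(extractEpisodeNumber(title)); // 467
```

A title with no "digits then colon" pattern returns null, so a caller could fall back to {{episode_num}} in that case.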
Hey! Thanks for taking the time to explain the issue in excellent detail.
I think this is doable and could be quite powerful! Let me noodle on this for a couple of days and get back to you.
@melmatsuoka Apologies for the delay!
I've opened a PR to add episode-custom-template-options here. It required updating the option parsing library, so I'm going to take some time to make sure the update didn't cause any regressions. Please let me know if you have any thoughts on the API!
npx podcast-dl --url "https://cmdctrlpwr.libsyn.com/rss" --episode-custom-template-options "\d+(?=:.*)" --limit 1 --episode-template "{{custom_0}}-{{title}}"
@lightpohl This is fantastic, thanks for implementing this!
The example you posted works great for parsing out the episode number embedded within the
For example, in the same cmdctrlpower RSS feed, the episode with the
593: Navigating IT's Past and Future with Tim Nyberg of The MacGuys+ will download as _ Navigating IT's Past and Future with Tim Nyberg of The MacGuys+.mp3 if I use : (.*) as the custom template option, and {{custom_0}} as the episode template.
It almost seems like the colon in the episode title is being "sanitized" into an underscore before the regex defined in episode-custom-template-options ever gets a chance to parse the title.
So if I wanted to parse out the episode number from the
(\d+): (.*) as the custom template option, and "{{custom_0}}-{{custom_1}}" as the episode template, the resulting file ends up looking like this:
593_ Navigating IT's Past and Future with Tim Nyberg of The MacGuys+-{{custom_1}}.mp3
When I would expect it to look like this:
593-Navigating IT's Past and Future with Tim Nyberg of The MacGuys+.mp3
Seems like the custom template option should operate on the raw
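One possible reading of the two results above, offered as a guess rather than a description of podcast-dl's actual code: each expression contributes its whole match (not its capture groups) as a single custom_N value, and the colon is only replaced afterwards, when the filename is sanitized. The sketch below reproduces both observed filenames under that assumption. The function renderFilename and its sanitizing rule are hypothetical.

```javascript
// Hypothetical model, NOT podcast-dl's implementation. Assumptions:
//  1. each expression yields one value: its whole match (match[0]);
//  2. ":" is replaced with "_" only after the template has been filled in.
function renderFilename(title, expressions, template) {
  let name = template;
  expressions.forEach((source, index) => {
    const match = title.match(new RegExp(source));
    if (match) {
      // split/join replaces the placeholder literally, with no "$" handling.
      name = name.split(`{{custom_${index}}}`).join(match[0]);
    }
  });
  return name.replace(/:/g, "_") + ".mp3";
}

const rawTitle =
  "593: Navigating IT's Past and Future with Tim Nyberg of The MacGuys+";

console.log(renderFilename(rawTitle, [": (.*)"], "{{custom_0}}"));
// _ Navigating IT's Past and Future with Tim Nyberg of The MacGuys+.mp3

console.log(
  renderFilename(rawTitle, ["(\\d+): (.*)"], "{{custom_0}}-{{custom_1}}")
);
// 593_ Navigating IT's Past and Future with Tim Nyberg of The MacGuys+-{{custom_1}}.mp3
```

If this model is right, the regex is already seeing the raw title; the stray underscore comes from the colon being part of the whole match, and {{custom_1}} stays unfilled because only one expression was supplied.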
As a related aside, I noticed that the <itunes:title> tag in RSS feeds contains the episode titles without the episode number embedded in it, which I guess is one of Apple's requirements for including a podcast feed in the Apple Podcasts app? It would be great if podcast-dl could use that tag as one of the template options!
Hey @melmatsuoka! It took me a few tries, but I think I was able to get what you're looking for with a tweak to how the expressions are being passed in and changing up the second expression a bit. Passing regex in via the command line is a bit of a pain!
npx podcast-dl --url "https://cmdctrlpwr.libsyn.com/rss" --episode-custom-template-options "(\d+)" "(?<=: ).*" --limit 1 --episode-template "{{custom_0}}-{{custom_1}}"
> 594-Navigating Apple's Changing Ecosystem and the Future of Tech Support.mp3
Let me know if that helps!
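For anyone wanting to try the two expressions outside the CLI first, here is a small sketch that runs them against a raw title in plain JavaScript. It assumes the expressions are evaluated as JavaScript regular expressions and that each one contributes its whole match. The function splitTitle is a hypothetical name used only for illustration.

```javascript
// Hypothetical helper for experimenting with the two expressions;
// not part of podcast-dl.
function splitTitle(title) {
  // First expression: the first run of digits in the title.
  const number = title.match(/(\d+)/);
  // Second expression: (?<=: ) is a lookbehind, so the match starts right
  // after ": " and the colon itself is not included in the match.
  const name = title.match(/(?<=: ).*/);
  return {
    custom_0: number ? number[0] : null,
    custom_1: name ? name[0] : null,
  };
}

const parts = splitTitle(
  "593: Navigating IT's Past and Future with Tim Nyberg of The MacGuys+"
);
console.log(`${parts.custom_0}-${parts.custom_1}`);
// 593-Navigating IT's Past and Future with Tim Nyberg of The MacGuys+
```

Because the lookbehind keeps the colon out of the match, nothing is left over to be turned into an underscore in the filename.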