Allow Greg Built-In Handler to Use a Filename Template
This PR partially addresses #83 (it does not attempt to add a template option for the entry ID) by converting the `file_to_tag` option to `filename_template` and using it in `download_handler` to set `placeholders.filename`.
It also adds the following minor things:
- `get_fullpath` to create the full path, preventing the need to call `os.path.join` whenever a full path is needed (see the sketch after this list).
- `get_extension` to get the extension of the filename.
- `get_file_basename` to get the filename without the extension.
- Add an `extras_require` section for optional dependencies.
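For reference, here is a minimal sketch of what the three helpers could look like; the names come from the PR description, but the bodies shown here are assumptions rather than the actual diff.

```python
import os

def get_extension(filename):
    # Extension of the filename, including the leading dot (e.g. ".mp3").
    return os.path.splitext(filename)[1]

def get_file_basename(filename):
    # Filename with its extension stripped.
    return os.path.splitext(filename)[0]

def get_fullpath(directory, filename):
    # Full path to the file, so callers no longer need os.path.join directly.
    return os.path.join(directory, filename)
```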
Hi, thanks for this! This would break current behavior; could you please provide a bit of rationale as to the advantages of moving to `filename_template`?
Hi @manolomartinez, I got a bit caught up in making the name for the new functionality make sense. How about I make sure that the tagging section checks for `file_to_tag` but gets overwritten by `filename_template` if it exists?
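A rough sketch of the backwards-compatible lookup proposed here; the option names come from the discussion, but the dictionary-style access to the feed's settings is an assumption, not greg's actual config API.

```python
def resolve_filename_template(feed_options):
    # Honour the legacy option first, then let the new one take precedence.
    template = feed_options.get("file_to_tag")
    if "filename_template" in feed_options:
        template = feed_options["filename_template"]
    return template
```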
> How about I make sure that the tagging section checks for `file_to_tag` but gets overwritten by `filename_template` if it exists?
That would be best if we implement this, I think. But before that, could you please provide a use case that the PR would permit which the current greg doesn't?
The use case here is allowing configuration of the filenames that the built-in download handler uses when it actually downloads an episode. This allows customization of filenames without using an external download handler, and also helps with issues such as the one described in a comment on #83, where every episode originally has the filename `audio.mp3`.
The change from `file_to_tag` to `filename_template` is purely to avoid adding an option specific to how a file is named when it is downloaded by the built-in download handler.
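As a rough illustration of the mechanism (not the actual PR code), the built-in download handler could expand the template into `placeholders.filename` along these lines; the placeholder attributes and the `str.format` substitution are assumptions.

```python
def apply_filename_template(placeholders, template):
    # Without a template, keep the feed-provided name (e.g. "audio.mp3" for
    # every episode); with a template, build a distinct name per episode.
    if template is None:
        return placeholders.filename
    return template.format(
        title=placeholders.title,
        date=placeholders.date,
    )
```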
Thanks. I agree that a modicum of configurability for the built-in downloader sounds good in principle. On the other hand, I emphatically do not want to replicate much of the functionality of curl or wget from within greg. "Do one thing well" and all that.
Other things in your PR (such as the `extras_require` bit) are just obviously correct, and I'll incorporate them, with thanks. But I need to think a bit more about the main functionality that you propose.
Thanks again, Eugene.
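For the packaging side, the extras section mentioned above would look roughly like this in setup.py; the extra's name and the stagger dependency are assumptions, not necessarily what the PR ships.

```python
from setuptools import setup

setup(
    name="greg",
    # ... other metadata unchanged ...
    extras_require={
        # Optional tagging support, installed with `pip install greg[tagging]`.
        "tagging": ["stagger"],
    },
)
```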
"Do one thing well" and all that.
I would suggest that here this means either removing the download functionality entirely or updating it to cleanly handle feeds with repeating filenames; appending underscores to the end to work around a relatively common feed behavior doesn't really fit that, unfortunately.
For some further details on my use case and why I avoided calling out to an external client: I packaged this up in a distroless/python3 container, which makes using typical C-based download clients like curl or wget rather more difficult.
Thanks for the explanation about your use case, Eugene. I think I agree that the situation right now is kind of unstable; we have a vanilla downloader, but for anything a bit more involved (not just repeating filenames, but also following redirects, etc.) I guess I expected users to resort to an external program.
If we want a bit more functionality in the in-built downloader, though, I think we should think it through a bit more carefully, and perhaps use an external library, such as requests.
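If greg did move to requests, the core of the built-in downloader could be as small as the following sketch; the function name and parameters are hypothetical, only the requests calls themselves are real API.

```python
import requests

def download_episode(url, fullpath):
    # Stream to disk so large episodes are not held in memory, and follow
    # redirects (one of the cases mentioned above).
    with requests.get(url, stream=True, allow_redirects=True, timeout=30) as response:
        response.raise_for_status()
        with open(fullpath, "wb") as outfile:
            for chunk in response.iter_content(chunk_size=8192):
                outfile.write(chunk)
```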
Just a short remark from me: extending the downloading capabilities would be nice, but I'm not sure whether it would be worth it.
@eugene-davis: I let greg run in a python:3-alpine Docker container where I installed greg, stagger, and youtube-dl with pip. Somehow I also added ffmpeg via apk add, though I'm not sure why. Anyhow, this is still a small container, so adding wget would not be a problem if necessary.
@manolomartinez: one feature I am missing in greg is a kind of "download manager". I have the issue that some feeds, e.g. ones coming from YouTube, use features like live streams and releases planned for the future. As I don't want live streams to be downloaded, I just skip those via youtube-dl itself, no big deal. When it comes to a release planned for the future, though, that would block the whole download pipeline. Therefore I wrote a crude, hacky, but (for over a year now) quite functional Python script which just tries to perform download tasks listed in a file; the tasks are put there beforehand by greg. This works quite well for me, even if a download "in the middle" briefly fails. I let greg run with that script unattended in a Docker container in the background, polling the RSS feeds roughly every hour and performing one run through the download backlog.
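For what it's worth, the kind of backlog script described here could be as simple as the following sketch; the file path, task format, and the plain youtube-dl invocation are all assumptions about @AlmightyFrog's setup.

```python
import subprocess

BACKLOG = "download-backlog.txt"  # one URL per line, appended by greg beforehand

def run_backlog():
    with open(BACKLOG) as backlog:
        urls = [line.strip() for line in backlog if line.strip()]
    remaining = []
    for url in urls:
        # Try each queued download; keep it in the backlog if it fails so the
        # next hourly run can retry it without blocking the other tasks.
        if subprocess.run(["youtube-dl", url]).returncode != 0:
            remaining.append(url)
    with open(BACKLOG, "w") as backlog:
        backlog.write("\n".join(remaining) + ("\n" if remaining else ""))
```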
@AlmightyFrog I didn't quite get the feature you describe. If it is unrelated to this PR, I'd encourage you to open a new issue so that you can explain it to me in more detail.
I'd be happy to look at moving it to the requests library in the future when I find some time. It shouldn't be too much effort to add, and it could be left as an optional dependency as well, I suspect. Feel free to close this PR if you like, as I am not sure when I will get to the update.