I want to implement adding a progress bar when downloading the file, but I'm having problems getting result.Info.FileSize

Open allanpk716 opened this issue 1 year ago • 16 comments

I want to implement adding a progress bar when downloading the file, but I'm having problems getting result.Info.FileSize.

This is my test of youtube video address: https://www.youtube.com/watch?v=MpYy6wwqxoo&ab_channel=THEFIRSTTAKE

I used the approach from the example program to get the result information:

https://github.com/wader/goutubedl/blob/2ec70c51c91b02a02e4b7ce118a0a2d2f38e0cf3/cmd/example/main.go#L13

In practice I ask Download for the best quality, and I need the file size of that specific download beforehand, but it is currently not available. Do you know how I should solve this?

https://github.com/wader/goutubedl/blob/2ec70c51c91b02a02e4b7ce118a0a2d2f38e0cf3/cmd/example/main.go#L17

If possible, I would like to get the size of the file to be downloaded from downloadResult so that I can drive the progress bar.
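
For readers following along, here is a minimal sketch of why the size matters: a progress bar over a streamed download usually wraps the reader, counts bytes, and divides by the expected total. It assumes only that the download is available as an `io.Reader`; the names `progressReader` and `percent` are hypothetical and not part of goutubedl.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// progressReader wraps an io.Reader and reports how many bytes have been read.
// total is the expected size in bytes; zero or less means "unknown".
type progressReader struct {
	r          io.Reader
	total      int64
	read       int64
	onProgress func(read, total int64)
}

func (p *progressReader) Read(b []byte) (int, error) {
	n, err := p.r.Read(b)
	p.read += int64(n)
	if p.onProgress != nil {
		p.onProgress(p.read, p.total)
	}
	return n, err
}

// percent returns progress in the range 0..100, or -1 when the total is unknown.
func percent(read, total int64) float64 {
	if total <= 0 {
		return -1
	}
	pct := float64(read) / float64(total) * 100
	if pct > 100 {
		pct = 100 // an approximate total can be smaller than the real size
	}
	return pct
}

func main() {
	// A stand-in for the download stream: 1000 bytes of dummy data.
	src := strings.NewReader(strings.Repeat("x", 1000))
	pr := &progressReader{
		r:     src,
		total: 1000,
		onProgress: func(read, total int64) {
			fmt.Printf("\r%5.1f%%", percent(read, total))
		},
	}
	if _, err := io.Copy(io.Discard, pr); err != nil {
		fmt.Println("copy failed:", err)
		return
	}
	fmt.Println()
}
```

Without a total, `percent` returns -1, so the caller can only show a byte counter or a spinner. That is the situation described in this issue.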

allanpk716, Jun 12 '23 09:06

Hi, looking at the info JSON that yt-dlp provides, it looks like it only sometimes knows the size, and sometimes only an approximate one; see the output below.

Note that when using yt-dlp via the goutubedl API it's probably not a good idea to give a filter like 1+2 that causes yt-dlp to mux, because then it can't stream: it has to download the separate parts and mux them first, which probably defeats the progress bar idea in this case.

So I think in your case you might want to select a format by iterating over the formats and picking one that has muxed audio and video; then you hopefully also know the file size. After that you can download that format. Otherwise you will have to do the muxing yourself somehow, which is quite tricky. It's what https://github.com/wader/ydls does, but then it might be hard to know the file size.

Hope that helps

$ yt-dlp -J https://www.youtube.com/watch\?v\=MpYy6wwqxoo\&ab_channel\=THEFIRSTTAKE | jq '{format_id,filesize,filesize_approx,formats:(.formats | map({format_id,filesize,filesize_approx}))}'
{
  "format_id": "313+251",
  "filesize": null,
  "filesize_approx": 424837469,
  "formats": [
    {
      "format_id": "sb2",
      "filesize": null,
      "filesize_approx": null
    },
    {
      "format_id": "sb1",
      "filesize": null,
      "filesize_approx": null
    },
    {
      "format_id": "sb0",
      "filesize": null,
      "filesize_approx": null
    },
    {
      "format_id": "599",
      "filesize": 1272329,
      "filesize_approx": null
    },
    {
      "format_id": "600",
      "filesize": 1492321,
      "filesize_approx": null
    },
    {
      "format_id": "139",
      "filesize": 2015879,
      "filesize_approx": null
    },
    {
      "format_id": "249",
      "filesize": 2169167,
      "filesize_approx": null
    },
    {
      "format_id": "250",
      "filesize": 2830288,
      "filesize_approx": null
    },
    {
      "format_id": "140",
      "filesize": 5347871,
      "filesize_approx": null
    },
    {
      "format_id": "251",
      "filesize": 5519385,
      "filesize_approx": null
    },
    {
      "format_id": "17",
      "filesize": 3294191,
      "filesize_approx": null
    },
    {
      "format_id": "597",
      "filesize": 1356583,
      "filesize_approx": null
    },
    {
      "format_id": "598",
      "filesize": 948497,
      "filesize_approx": null
    },
    {
      "format_id": "394",
      "filesize": 2653083,
      "filesize_approx": null
    },
    {
      "format_id": "160",
      "filesize": 1765272,
      "filesize_approx": null
    },
    {
      "format_id": "278",
      "filesize": 3166438,
      "filesize_approx": null
    },
    {
      "format_id": "395",
      "filesize": 4513299,
      "filesize_approx": null
    },
    {
      "format_id": "133",
      "filesize": 3454630,
      "filesize_approx": null
    },
    {
      "format_id": "242",
      "filesize": 5031795,
      "filesize_approx": null
    },
    {
      "format_id": "396",
      "filesize": 8330723,
      "filesize_approx": null
    },
    {
      "format_id": "134",
      "filesize": 6156917,
      "filesize_approx": null
    },
    {
      "format_id": "18",
      "filesize": null,
      "filesize_approx": 11753998
    },
    {
      "format_id": "243",
      "filesize": 10200698,
      "filesize_approx": null
    },
    {
      "format_id": "397",
      "filesize": 14241733,
      "filesize_approx": null
    },
    {
      "format_id": "135",
      "filesize": 10596948,
      "filesize_approx": null
    },
    {
      "format_id": "244",
      "filesize": 16762054,
      "filesize_approx": null
    },
    {
      "format_id": "22",
      "filesize": null,
      "filesize_approx": 24748289
    },
    {
      "format_id": "398",
      "filesize": 27543338,
      "filesize_approx": null
    },
    {
      "format_id": "136",
      "filesize": 18861303,
      "filesize_approx": null
    },
    {
      "format_id": "247",
      "filesize": 29638844,
      "filesize_approx": null
    },
    {
      "format_id": "399",
      "filesize": 48721378,
      "filesize_approx": null
    },
    {
      "format_id": "137",
      "filesize": 54772033,
      "filesize_approx": null
    },
    {
      "format_id": "248",
      "filesize": 53387418,
      "filesize_approx": null
    },
    {
      "format_id": "400",
      "filesize": 153662759,
      "filesize_approx": null
    },
    {
      "format_id": "271",
      "filesize": 151665058,
      "filesize_approx": null
    },
    {
      "format_id": "401",
      "filesize": 308066452,
      "filesize_approx": null
    },
    {
      "format_id": "313",
      "filesize": 419318084,
      "filesize_approx": null
    }
  ]
}
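
The selection described above (iterate the formats, keep a muxed one, read its size) could be sketched roughly as follows. This decodes the yt-dlp JSON directly rather than going through goutubedl, so every type and function name here is hypothetical. The format ids and sizes in the sample are taken from the output above; the `acodec`/`vcodec` values are assumed for illustration, relying on yt-dlp reporting `"none"` for a missing stream. Picking the largest muxed format is only a rough stand-in for "best quality".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// format holds the few fields used here from one entry of the "formats" array.
// Pointers keep a JSON null distinguishable from 0.
type format struct {
	FormatID       string   `json:"format_id"`
	Filesize       *float64 `json:"filesize"`
	FilesizeApprox *float64 `json:"filesize_approx"`
	ACodec         string   `json:"acodec"`
	VCodec         string   `json:"vcodec"`
}

type info struct {
	Formats []format `json:"formats"`
}

// Sample data: ids and sizes from the output above, codec values assumed.
const sampleJSON = `{"formats": [
  {"format_id": "251", "filesize": 5519385, "filesize_approx": null, "acodec": "opus", "vcodec": "none"},
  {"format_id": "18", "filesize": null, "filesize_approx": 11753998, "acodec": "mp4a.40.2", "vcodec": "avc1.42001E"},
  {"format_id": "22", "filesize": null, "filesize_approx": 24748289, "acodec": "mp4a.40.2", "vcodec": "avc1.64001F"},
  {"format_id": "313", "filesize": 419318084, "filesize_approx": null, "acodec": "none", "vcodec": "vp9"}
]}`

func parseInfo(data string) (info, error) {
	var parsed info
	err := json.Unmarshal([]byte(data), &parsed)
	return parsed, err
}

// muxed reports whether the format carries both audio and video.
func (f format) muxed() bool {
	return f.ACodec != "" && f.ACodec != "none" && f.VCodec != "" && f.VCodec != "none"
}

// size returns the exact size when known, otherwise the approximate one.
func (f format) size() (n float64, exact bool, ok bool) {
	if f.Filesize != nil {
		return *f.Filesize, true, true
	}
	if f.FilesizeApprox != nil {
		return *f.FilesizeApprox, false, true
	}
	return 0, false, false
}

// pickMuxed returns the largest muxed format that reports any size at all.
func pickMuxed(formats []format) (format, bool) {
	var best format
	var bestSize float64
	found := false
	for _, f := range formats {
		s, _, ok := f.size()
		if !f.muxed() || !ok {
			continue
		}
		if !found || s > bestSize {
			best, bestSize, found = f, s, true
		}
	}
	return best, found
}

func main() {
	parsed, err := parseInfo(sampleJSON)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	if f, ok := pickMuxed(parsed.Formats); ok {
		s, exact, _ := f.size()
		fmt.Printf("format %s, %.0f bytes, exact=%v\n", f.FormatID, s, exact)
	}
}
```

With the sample data this selects format 22 with an approximate size of 24748289 bytes. Because the size is only approximate, a progress bar built on it should tolerate the stream ending slightly before or after 100%.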

wader, Jun 12 '23 11:06

When I tried forcing -f best in infoFromURL, the FilesizeApprox I got was consistent with the size of the video downloaded later with the best filter passed to Download.

But this code is just a test. The change fits my own application's needs, but I have not figured out how to expose it as a setting for other people, who may not have the same download requirements. The yt-dlp -f parameter is quite complicated, and I just need best.

To change it to fit my requirements, I would pass the filter through Options when calling New and not pass a filter when calling Download. That would be incompatible with how the methods are used today, which does not feel great. Do you have any good ideas?

allanpk716, Jun 12 '23 11:06

Aha, now I see. Yes, the current goutubedl API is designed so that you download the info JSON using New, then inspect the .Formats array, pick the id(s) you want, and use Download(id); you don't really care about the "root" format (which is "best" by default, I guess?).

I'm wondering if we could add some additional methods to support that. But how does Download behave for you when using "best" which could end up being a combination audio+video id:s that needs to be muxed?

wader, Jun 12 '23 12:06

> I'm wondering if we could add some additional methods to support that. But how does Download behave for you when using "best" which could end up being a combination audio+video id:s that needs to be muxed?

For now I have temporarily modified the code to meet my needs. I hope you can come up with a good solution and mention it in this issue.

allanpk716, Jun 13 '23 03:06

Thinking about how to make it nice without breaking the current API: maybe have a NewWithFilter, but then it's a bit confusing that Download takes a filter argument; it should have been called DownloadWithFilter. We could make an empty filter string mean "use the default", but that is a bit ugly too.

BTW, about combining formats like 1+2 when using stdout (like goutubedl does): it seems to work much better now than it used to, I think, but I found this in the https://github.com/yt-dlp/yt-dlp README:

> Similarly, if ffmpeg is unavailable, or if you use yt-dlp to stream to stdout (-o -), the default becomes -f best/bestvideo+bestaudio.

So using goutubedl vs yt-dlp CLI (without -o -) might choose different formats.

wader, Jun 13 '23 09:06

In fact, when New is a definite Download task, then the parameter passing of Filter can be passed from Download to New. This way Filter can be passed as a goutubedl.Options{}, for New is compatible with the previous interface, and as you mentioned, Download is passed with Filter empty which is a bit weird. But at least it's compatible.

Otherwise, it may be necessary to find the format parsing part from the yt-dlp code and rewrite it to golang, which can also be satisfied if parameters such as -f best are specifically linked to elements of the formats. But the feeling is complicated.

allanpk716, Jun 13 '23 09:06

> In fact, when New is a definite Download task, then the parameter passing of Filter can be passed from Download to New. This way Filter can be passed as a goutubedl.Options{}, for New is compatible with the previous interface, and as you mentioned, Download is passed with Filter empty which is a bit weird. But at least it's compatible.

Of course, good idea. I like passing the filter as an option.

Not sure I follow what you mean by "then the parameter passing of Filter can be passed from Download to New". The other way around? New can pass the filter via an option to Download, where it is used if the filter argument is an empty string?

> Otherwise, it may be necessary to find the format parsing part from the yt-dlp code and rewrite it to golang, which can also be satisfied if parameters such as -f best are specifically linked to elements of the formats. But the feeling is complicated.

Yes, agreed. I would like to keep goutubedl as thin as possible and let yt-dlp do the work.

Would you like to try doing a PR for the filter option changes? Some kind of test for that would be good too, I think.

wader, Jun 13 '23 10:06

> Not sure i follow what you mean by "then the parameter passing of Filter can be passed from Download to New"? the other way around? New can pass filter via option to Download and be used if the filter argument is empty string?

The tentative idea is: if a Filter is passed in the Options of New, then when Download is called with an empty filter, the Filter from New is used. If the filter passed to Download has a value, that filter is used instead.
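
A minimal sketch of that fallback rule, for illustration only: the `options` type and `resolveFilter` function below are hypothetical stand-ins and are not copied from the linked commit or from goutubedl's API.

```go
package main

import "fmt"

// options is a hypothetical stand-in for an options struct that carries a
// default filter chosen when the result is created.
type options struct {
	Filter string
}

// resolveFilter applies the rule from the comment above: a non-empty filter
// given at download time wins, otherwise the one from the options is used.
func resolveFilter(opts options, downloadFilter string) string {
	if downloadFilter != "" {
		return downloadFilter
	}
	return opts.Filter
}

func main() {
	opts := options{Filter: "best"}
	fmt.Println(resolveFilter(opts, ""))            // falls back to the option: best
	fmt.Println(resolveFilter(opts, "18"))          // explicit filter wins: 18
	fmt.Println(resolveFilter(options{}, "") == "") // nothing set anywhere: true
}
```

Existing callers that always pass a filter to Download would see no change under this rule, which is the compatibility property discussed in the thread.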

Here is the implementation of this feature. https://github.com/wader/goutubedl/commit/26068dec987bc64ce5bccdb5e359979f66307ff2

As for the test cases you mentioned, I have thought about it for a while. To make this testable, I would extract the code that builds the yt-dlp arguments, shown in the following screenshot, into a separate function that unit tests can cover.

This would change quite a lot of code, so it may be better to add the unit tests in the way you prefer.
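
As a rough illustration of that refactoring idea, argument building can live in a pure function that a unit test can call without running yt-dlp. The function name `downloadArgs` is hypothetical, and the argument list is deliberately reduced to `-o -` (stream to stdout) and `-f`; the real goutubedl code passes more than this.

```go
package main

import "fmt"

// downloadArgs builds a reduced yt-dlp argument list for streaming a URL to
// stdout. The -f flag is only added when a filter is given, so an empty
// filter leaves the format choice to yt-dlp's default.
func downloadArgs(url, filter string) []string {
	args := []string{"-o", "-"}
	if filter != "" {
		args = append(args, "-f", filter)
	}
	return append(args, url)
}

func main() {
	fmt.Println(downloadArgs("https://example.com/video", ""))
	fmt.Println(downloadArgs("https://example.com/video", "best"))
}
```

Because the function has no side effects, a table-driven test can compare the returned slice against the expected arguments for each combination of options.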

image

image

allanpk716, Jun 13 '23 11:06

👍 Yep, refactoring to deduplicate the argument code seems reasonable to me. Give it a try, maybe add some tests and send a PR, and we can continue from there.

wader, Jun 13 '23 13:06

@allanpk716 hey! Hope all is good. Did you make any progress on this?

wader, Jun 26 '23 11:06

Work has been busy recently, so I have not had much time to follow up on this. I estimate I can get it organized and submitted in 1 to 2 months 😂

allanpk716, Aug 15 '23 08:08

@allanpk716 no worries! maybe @nonoo has time and motivation to figure out the optional filter thing?

wader, Aug 15 '23 08:08

yt-dlp won't give you any filesize if the source needs to be merged (video and audio in separate streams, e.g. Facebook videos). The filter is set to "best" by default, so no -f is needed if you want to select the best stream. That's why I sent the PR https://github.com/wader/goutubedl/pull/154

nonoo, Aug 15 '23 08:08

Trying to remember and understand again what this issue was actually about. As I understand it now:

This issue: specify a filter to New to make yt-dlp select and set the "root" format in the info JSON to something, and then (maybe?) be able to get file sizes from that?

#154 is about skipping the filter when downloading, to get the default ("bestvideo*+bestaudio/best" seems to be the default): download the best separate streams with optional audio, then fall back to the best, possibly already muxed, format.

So I was confused: #154 and this issue are about different things.

(please correct me if i'm wrong)

wader, Aug 15 '23 09:08

https://github.com/wader/goutubedl/pull/154 is a different thing. It adds support for 2 new options and makes the -f argument optional (as it should be).

nonoo, Aug 15 '23 09:08

👍

wader, Aug 15 '23 09:08