goutubedl

Handling Multiple Videos in Links for Playlist Feature

Open glats opened this issue 2 years ago • 7 comments

I’ve been working on an app, but I hit a bit of a snag. Check out these links:

https://twitter.com/elpittttt/status/1763561738115367417
https://www.instagram.com/stories/fotorock/3325306999104437560/

They both have more than one video hiding in there. So, I’m scratching my head trying to figure out how we know when we’re dealing with a multi-video situation. I mean, is this where we pull out the playlist trick?

And speaking of, could you show me how to use those PlaylistStart and PlaylistEnd options?

glats avatar Mar 14 '24 18:03 glats

Hey, sorry for the slow reply, we are traveling.

Would it work to check Type for playlist or multi_video? Or maybe just see if Entries is not empty? Then I think you should be able to iterate over the entries, or find the ones you want, and use WebpageURL to try to download each one. Maybe the code in ydls that turns a yt-dlp playlist into an RSS feed with enclosure URLs that download each entry can help: https://github.com/wader/ydls/blob/master/internal/ydls/rss.go#L48
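
A minimal sketch of that check, written against the JSON that yt-dlp -J prints rather than goutubedl's own types so that it runs with the standard library only. The info struct and the isMultiVideo and parseInfo helpers are made up for illustration; only the _type, entries and webpage_url field names come from yt-dlp's output, and the sample data is invented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// info is a hypothetical, trimmed-down view of the JSON printed by `yt-dlp -J`.
type info struct {
	Type       string `json:"_type"`
	WebpageURL string `json:"webpage_url"`
	Entries    []info `json:"entries"`
}

// isMultiVideo reports whether the metadata describes more than one item:
// either the type says so, or there is at least one entry.
func isMultiVideo(i info) bool {
	return i.Type == "playlist" || i.Type == "multi_video" || len(i.Entries) > 0
}

// parseInfo decodes raw `yt-dlp -J` output into info.
func parseInfo(raw []byte) (info, error) {
	var i info
	err := json.Unmarshal(raw, &i)
	return i, err
}

func main() {
	// Sample data invented for this example, not real yt-dlp output.
	raw := []byte(`{"_type":"playlist","entries":[` +
		`{"webpage_url":"https://example.com/a"},` +
		`{"webpage_url":"https://example.com/b"}]}`)

	i, err := parseInfo(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("multi-video:", isMultiVideo(i), "entries:", len(i.Entries))
}
```

With goutubedl the same test would presumably read result.Info.Type and result.Info.Entries instead of the local struct.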

About playlist end/start: if I remember correctly, it's used to limit yt-dlp to only fetch metadata for some entries in a playlist. It could take forever if there are lots of entries in a playlist and you really only want the first few. Update: see https://github.com/wader/ydls/blob/58d25ba33adc1bd148ac4fb717990a06d5f0401a/internal/ydls/ydls.go#L428
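
To make the effect concrete, here is a rough sketch of the yt-dlp command line such a limit corresponds to. It assumes PlaylistStart and PlaylistEnd end up as yt-dlp's --playlist-start and --playlist-end flags; metadataArgs is a hypothetical helper, not part of goutubedl.

```go
package main

import (
	"fmt"
	"strconv"
)

// metadataArgs builds arguments for a metadata-only yt-dlp call (-J).
// start and end are 1-based playlist positions; zero leaves that limit unset.
// Hypothetical helper for illustration only.
func metadataArgs(url string, start, end uint) []string {
	args := []string{"-J", "--yes-playlist"}
	if start > 0 {
		args = append(args, "--playlist-start", strconv.FormatUint(uint64(start), 10))
	}
	if end > 0 {
		args = append(args, "--playlist-end", strconv.FormatUint(uint64(end), 10))
	}
	return append(args, url)
}

func main() {
	// Only ask for metadata of the first three entries.
	fmt.Println(metadataArgs("https://www.youtube.com/@DrBecky/videos", 0, 3))
}
```

With a limit in place yt-dlp only has to resolve the requested entries, which is what keeps large playlists from taking forever.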

Otherwise, could you give some more info about your use case?

wader avatar Mar 16 '24 18:03 wader

oh! now i can see things more clearly. i can work with the result before downloading the actual media. i'm going to work on that. thanks! i'll share my implementation when i'm finished

glats avatar Mar 17 '24 20:03 glats

After working a bit with the idea you gave me, I started to check Entries and read each entry's WebpageURL so I could re-invoke goutubedl.New() with it. The problem is that in the case of the Instagram story, WebpageURL is the same for all Entries. I don't know if there is another mechanism to download each video.

The use case: in the end I want to download all the videos from the linked media source.

glats avatar Mar 17 '24 20:03 glats

I came up with:

package main

import (
	"context"
	"fmt"
	"io"
	"log"
	"os"

	"github.com/wader/goutubedl"
)

// save downloads one item to path. Both the download and the file are
// closed when save returns, so nothing stays open across loop iterations.
func save(result goutubedl.Result, options goutubedl.DownloadOptions, path string) error {
	downloadResult, err := result.DownloadWithOptions(context.Background(), options)
	if err != nil {
		return err
	}
	defer downloadResult.Close()

	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	w, err := io.Copy(f, downloadResult)
	if err != nil {
		return err
	}
	fmt.Printf("written %d\n", w)
	return nil
}

func main() {
	result, err := goutubedl.New(
		context.Background(),
		"https://www.instagram.com/stories/fotorock/3325306999104437560/",
		goutubedl.Options{},
	)
	if err != nil {
		log.Fatal(err)
	}

	// No entries: a single video, downloaded with default options.
	if len(result.Info.Entries) == 0 {
		if err := save(result, goutubedl.DownloadOptions{}, "file.mp4"); err != nil {
			log.Fatal(err)
		}
		return
	}

	// Several entries: fetch each one by its 1-based playlist index.
	for i := range result.Info.Entries {
		index := i + 1
		options := goutubedl.DownloadOptions{
			Filter:        "best",
			PlaylistIndex: index,
		}
		if err := save(result, options, fmt.Sprintf("%d-file.mp4", index)); err != nil {
			log.Fatal(err)
		}
	}
}

I don't know if this is the best option.

glats avatar Mar 17 '24 21:03 glats

Aha ok, hmm, does it help if you set Type: goutubedl.TypePlaylist in the options to goutubedl.New? Btw, I can recommend playing around with yt-dlp -J and jq (it's the JSON goutubedl uses), e.g.:

$ yt-dlp -J --playlist-end 3 --yes-playlist https://www.youtube.com/@DrBecky/videos | jq '.entries[] | {title, webpage_url}'
{
  "title": "Can gravitational waves INTERFERE with each other?",
  "webpage_url": "https://www.youtube.com/watch?v=zFGnOqrq8j4"
}
{
  "title": "TWO Earth-like planets found in the HABITABLE ZONE (just 6% bigger than Earth!)",
  "webpage_url": "https://www.youtube.com/watch?v=wumwzh5m8nw"
}
{
  "title": "JWST discovers exoplanets orbiting DEAD STARS",
  "webpage_url": "https://www.youtube.com/watch?v=rur6Mu5sqmU"
}

Does it work for other sites, or is it just Instagram that gives the same URL for every entry? If it's just Instagram, it might be worth checking whether an issue has been filed about it, or maybe dig around in the extractor code to see what is going on.

wader avatar Mar 17 '24 23:03 wader

As far as I can see, it only happens with Instagram:

yt-dlp  --cookies-from-browser chrome -J --yes-playlist https://www.instagram.com/stories/fotorock/3325306999104437560/ | jq '.entries[] | {title, webpage_url}'
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}
{
  "title": "Video by None",
  "webpage_url": "https://www.instagram.com/stories/fotorock/3325306999104437560/"
}

However, what do you think about my implementation? Do you have another approach? I'd be glad to hear it.

glats avatar Mar 18 '24 02:03 glats

If you expect that the URL will be a playlist, I think I would probably set the type to playlist. I have some vague memory of that being needed in some cases, maybe when the URL can be both?

Haven't used PlaylistIndex myself, but it should be fine. Not sure whether it makes any difference, for example fewer or more metadata requests, compared to creating a new goutubedl instance for each entry using its webpage URL. Maybe try?
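
One way to choose between the two approaches per link, as a sketch only: use each entry's own URL when all of them are distinct, and fall back to the 1-based playlist index when they repeat, as in the Instagram story output earlier in the thread. The entry and target types and planDownloads are hypothetical names; a real version would read WebpageURL from goutubedl's Entries.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for one playlist entry.
type entry struct {
	WebpageURL string
}

// target says how to fetch one entry: by its own URL, or, when URL is
// empty, by its 1-based position in the parent playlist.
type target struct {
	URL   string
	Index int
}

// planDownloads uses per-entry URLs only when every entry has a distinct,
// non-empty URL. Otherwise every entry is addressed by playlist index.
func planDownloads(entries []entry) []target {
	seen := make(map[string]bool, len(entries))
	distinct := true
	for _, e := range entries {
		if e.WebpageURL == "" || seen[e.WebpageURL] {
			distinct = false
			break
		}
		seen[e.WebpageURL] = true
	}

	targets := make([]target, 0, len(entries))
	for i, e := range entries {
		if distinct {
			targets = append(targets, target{URL: e.WebpageURL})
		} else {
			targets = append(targets, target{Index: i + 1})
		}
	}
	return targets
}

func main() {
	story := "https://www.instagram.com/stories/fotorock/3325306999104437560/"
	for _, t := range planDownloads([]entry{{story}, {story}, {story}}) {
		fmt.Printf("%+v\n", t)
	}
}
```

A target with a URL would go through a fresh goutubedl.New call, while a target with an index would reuse the original result with PlaylistIndex set, which also makes it easy to compare how many metadata requests each path triggers.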

wader avatar Mar 18 '24 08:03 wader