Problems with filenames starting with hyphen

Open Th0masL opened this issue 4 years ago • 15 comments

Hi,

It seems like there are some problems when Podsync has to deal with files whose names start with a hyphen.

From Linux's point of view, filenames that start with a hyphen are not recommended, but because the downloaded files are named after the YouTube video ID, and YouTube video IDs can start with a hyphen, we can run into some weird problems.

time="2020-05-17T23:01:18Z" level=info msg="! downloading episode https://youtube.com/watch?v=-ABCDEF" episode_id=-ABCDEF index=0
time="2020-05-17T23:04:23Z" level=info msg="successfully downloaded file \"-ABCDEF\"" episode_id=-ABCDEF index=0
time="2020-05-18T22:47:25Z" level=error msg="failed to update feed: https://www.youtube.com/user/<USERNAME>" error="xml build failed: failed to obtain download URL for: -ABCDEF.mp3: failed to check whether file exists: stat /app/data/<USERNAME>/-ABCDEF.mp3: no such file or directory"

I guess it would make sense to add a prefix to the downloaded files, so that filenames never start with a hyphen.

I was about to try to implement a fix myself, but this is the first time I've tried to build Go code from source, and I can't seem to get the build to succeed (more documentation is needed on how to properly build the binaries from the source code).

If someone explains exactly how to build the binaries from the source code, I can try to implement some sort of fix by adding a fixed prefix to the audio files, for example:

podsync-ABCDEF.mp3 instead of simply -ABCDEF.mp3
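
As a rough illustration of this naming scheme (not Podsync's actual code; the function name `prefixed_filename` and its parameters are hypothetical), a fixed prefix could be prepended when the file name is derived from the video ID:

```python
def prefixed_filename(video_id: str, ext: str = "mp3", prefix: str = "podsync") -> str:
    """Build a file name from a YouTube video ID, with a fixed prefix in front.

    Hypothetical helper: according to this thread, Podsync names files
    after the bare video ID.
    """
    return f"{prefix}{video_id}.{ext}"


print(prefixed_filename("-ABCDEF"))  # podsync-ABCDEF.mp3
```

With a non-empty prefix the result can never start with a hyphen, whatever the video ID looks like.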

Thomas

Th0masL avatar May 23 '20 00:05 Th0masL

I'll just point out that this issue does not occur for me when running podsync in docker.

I have several videos whose filenames start with hyphens, and the respective feed xml files build without any errors (reported or otherwise).

This should be solvable without adding a prefix, but I'm curious to see what @mxpv thinks.

726a67 avatar May 23 '20 01:05 726a67

Hyphens at the start should work (and I've just checked and found one in my data folder). They can theoretically cause a problem when supplied as an argument to a command because they can be falsely interpreted as command switches, but as far as I can see this does not happen in podsync.
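
To illustrate the argument-parsing risk: a common defensive habit, when a file name is handed to an external command, is to turn a leading-hyphen name into an explicit relative path. This is a generic sketch, not something taken from Podsync, and `safe_operand` is a made-up name.

```python
def safe_operand(filename: str) -> str:
    """Return a form of `filename` that a command cannot mistake for an option.

    A name such as "-ABCDEF.mp3" looks like a bundle of short options to most
    command-line parsers; "./-ABCDEF.mp3" refers to the same file in the
    current directory but no longer starts with a hyphen.
    """
    if filename.startswith("-"):
        return "./" + filename
    return filename


print(safe_operand("-ABCDEF.mp3"))  # ./-ABCDEF.mp3
print(safe_operand("ABCDEF.mp3"))   # ABCDEF.mp3
```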

It could help if you provided the actual feed URL and episode id, @Th0masL

dop251 avatar May 23 '20 10:05 dop251

Thanks for your answers, guys!

I'm also using Podsync in Docker, using the latest image.

I'll try to reset everything (remove all the data/audio files), destroy the container, and recreate it, and if the error happens again I'll share what is written in the logs.

On a side note, can someone explain how to build the Podsync binaries directly from the source code?

I want to be able to build my own Podsync Docker image and edit the Dockerfile to add a few more things, but sadly I can't seem to find the correct way to build the Go code.

I can see that the provided Dockerfile expects to find a podsync binary in the main folder, so I need to find a way to generate this binary.

What commands are required to build from source?

Th0masL avatar May 23 '20 12:05 Th0masL

make build should do it. Or, if you prefer it without make, go install ./cmd/podsync

dop251 avatar May 23 '20 13:05 dop251

I've been able to generate the podsync binary using the commands you suggested, thanks.

On a side note, I had to copy the file bin/podsync to the main folder, as the Dockerfile is expecting to find the podsync file there.

But all good, I've been able to start my own container, so I'll be able to contribute to the code by submitting PRs if I see problems or possible improvements :)

Regarding the original problem of filenames starting with a hyphen: I deleted everything (both the local storage and the podsync container) and started a new container using the same config, and everything went fine this time.

That made me wonder why the issue happened in the first place, and I think I have an idea. Let me explain: originally I configured Podsync to download all the videos from a specific YouTube channel, and it started downloading everything. I then realized that this would be far too many videos and that I only needed some specific ones, so I:

  • stopped the existing Docker container
  • deleted all the files from the /data folder (the xml podcast file hadn't been generated yet, as some downloads were still pending)
  • implemented some filters in the config.toml file
  • restarted the container, to start downloading again, but using the new filters.

And that's when the hyphen problems happened. Basically, it looks like Podsync was able to re-download all the videos that matched the new filters and were gone from the /data folder, except the ones whose names start with a hyphen.

I assume the user is not really supposed to delete video files manually from the /data folder, but it still looks like Podsync has trouble re-downloading files that start with a hyphen once they have been downloaded and then deleted.

That leads me to another question: apart from the playlist's xml file (which is generated once all the videos have been downloaded) and the /data folder that contains the video files, is there any other file that keeps track of what has been downloaded?

To me, it looks like even though I deleted everything I could think of (namely the xml file and the /data folder), the Podsync container still remembered that it had downloaded some hyphen-starting files during the previous run, and was looking for them.

I saw that some documentation/issues refer to a local database. Is the database used by default? Where is it located?

I can see those lines when I start the container, so I assume there's indeed a database file somewhere.

time="2020-05-23T17:52:36Z" level=info msg="opening database \"db\""
time="2020-05-23T17:52:36Z" level=info msg="All 1 tables opened in 1ms\n"

Thanks for your help!

Th0masL avatar May 23 '20 18:05 Th0masL

There is, of course, a database, and the XML file is built from this database, not from the files on disk. What I think happened is that the file starting with a hyphen was downloaded and marked as such in the database, and then you deleted it. The hyphen is a red herring; it could have been any file.
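
A minimal sketch of that mismatch, assuming a simplified model in which the database is just a list of episode IDs marked as downloaded and each file is named `<id>.mp3` (Podsync's real database layout is not shown in this thread, and `missing_downloads` is a hypothetical helper):

```python
from pathlib import Path


def missing_downloads(downloaded_ids, data_dir, ext=".mp3"):
    """Return the IDs recorded as downloaded whose file is no longer on disk."""
    data_dir = Path(data_dir)
    return [vid for vid in downloaded_ids if not (data_dir / f"{vid}{ext}").is_file()]
```

Any ID this returns is in the state described here: recorded as downloaded, but with no file behind it, which matches the "no such file or directory" error in the original log.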

dop251 avatar May 23 '20 22:05 dop251

Well, I agree that the files were recorded in the database and that Podsync didn't like that I deleted all the files from the disk.

But I don't think it was just "luck" that the problem happened with the hyphen files, because all the other files (the ones not starting with a hyphen) got re-downloaded correctly. Only the files starting with a hyphen were "broken".

I even found a way to "fix" the problem, by manually running a youtube-dl command to re-download the missing hyphen file, and then restarting the container.

The next file that Podsync complained about was again a file starting with a hyphen. I repeated the operation about 5 times, manually downloading the hyphen files to the /data folder and restarting the container. Every time the container restarted, Podsync still complained only about not being able to find files that start with a hyphen. All the other files (the ones not starting with a hyphen) were correctly re-downloaded automatically and present in the /data folder, so it's kind of weird that this problem would happen only with the hyphen files.

Th0masL avatar May 23 '20 23:05 Th0masL

Have you also removed the db folder?

dop251 avatar May 23 '20 23:05 dop251

No, I kept the db folder in the container, as I was not aware at that time that Podsync was using a DB file.

Th0masL avatar May 26 '20 18:05 Th0masL

I'm closing this issue, since it's most likely due to the fact that I hadn't deleted the DB folder.

Th0masL avatar Jun 15 '20 21:06 Th0masL

I have the same problem. If I add a youtube channel and one of the videos has a "-", Podsync skips it and downloads the next video. It doesn't matter if the "-" is at the beginning or not. It only happens with channels. It does not happen with playlists. You can try any channel that has a video with the "Covid-19" word like: https://www.youtube.com/user/NutritionFactsOrg

ssergioss avatar Jul 08 '20 12:07 ssergioss

I'm seeing this same issue now too, with files with hyphens in them.

Rumik avatar Aug 09 '20 06:08 Rumik

Here is another example of why filenames starting with a hyphen are not a good idea.

I'm going to recreate my podsync setup, so I'm deleting everything, and here is the result:

root@server:/data/podsync/data/MyPlaylist# rm *.mp3
rm: invalid option -- 'A'
Try 'rm ./-AHeo_dA3BB.mp3' to remove the file '-AHeo_dA3BB.mp3'.
Try 'rm --help' for more information.

Even Linux can't find the files starting with hyphens :(

It would be nice to have an option that allows adding a prefix.

That way, only people who want the prefix will enable it, and it won't break existing setups.

Th0masL avatar Dec 15 '20 23:12 Th0masL

@Th0masL, Linux definitely can remove files whose names begin with a hyphen:

rm -- -AHeo_dA3BB.mp3
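
The same point can be made without a shell: the file system has no problem with such names, only option parsing does. A small sketch (the helper name `remove_mp3s` is hypothetical) that deletes every `.mp3` in a directory through Python's `pathlib`, where no argument parsing is involved:

```python
from pathlib import Path


def remove_mp3s(directory):
    """Delete every *.mp3 file directly inside `directory`; return the removed names."""
    removed = []
    for path in sorted(Path(directory).glob("*.mp3")):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

Called on a folder containing `-AHeo_dA3BB.mp3`, this removes it like any other file, since the name is never interpreted as a command-line switch.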

726a67 avatar Dec 16 '20 00:12 726a67

I know, but I stand by my statement that it would be better if filenames didn't start with hyphens, as it breaks things anyway :)

Th0masL avatar Dec 18 '20 12:12 Th0masL