File name length unchecked
The downloaded file name should never be longer than 255 bytes under Linux.
Take the terminal output below as an example:
$ gallery-dl https://www.reddit.com/r/anthro/comments/f646i7/𝙾𝙿𝙴𝙽𝙸𝙽𝙶_𝙰𝙴𝚂𝚃𝙷𝙴𝚃𝙷𝙸𝙲_𝙷𝙴𝙰𝙳𝚂𝙷𝙾𝚃_𝙲𝙾𝙼𝙼𝙸𝚂𝚂𝙸𝙾𝙽𝚂_get_a_v_a/
./gallery-dl/reddit/anthro/f646i7 𝙾𝙿𝙴𝙽𝙸𝙽𝙶 𝙰𝙴𝚂𝚃𝙷𝙴𝚃… [email protected]! ONLY 10 SLOTS AVAILABLE !.jpg
[download][warning] OSError: [Errno 36] File name too long: "./gallery-dl/reddit/anthro/f646i7 𝙾𝙿𝙴𝙽𝙸𝙽𝙶 𝙰𝙴𝚂𝚃𝙷𝙴𝚃𝙷𝙸𝙲 𝙷𝙴𝙰𝙳𝚂𝙷𝙾𝚃 𝙲𝙾𝙼𝙼𝙸𝚂𝚂𝙸𝙾𝙽𝚂! ⚡️ get a V A P O R W A V E version of your character for only $25! if you'd like to grab a spot, PM me here, on telegram at @dhazeartt or email me at [email protected]! ONLY 10 SLOTS AVAILABLE !.jpg.part"
[download][error] Failed to download f646i7 𝙾𝙿𝙴𝙽𝙸𝙽𝙶 𝙰𝙴𝚂𝚃𝙷𝙴𝚃𝙷𝙸𝙲 𝙷𝙴𝙰𝙳𝚂𝙷𝙾𝚃 𝙲𝙾𝙼𝙼𝙸𝚂𝚂𝙸𝙾𝙽𝚂! ⚡️ get a V A P O R W A V E version of your character for only $25! if you'd like to grab a spot, PM me here, on telegram at @dhazeartt or email me at [email protected]! ONLY 10 SLOTS AVAILABLE !.jpg
The file name is 247 characters long, which seems acceptable, but it encodes to 359 bytes, which the kernel rejects as too long (on Btrfs).
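The gap between the two counts comes from UTF-8 encoding. A minimal Python sketch of the effect, assuming the stylised letters in the title are Unicode mathematical monospace capitals (4 bytes each in UTF-8); `to_math_monospace` is a hypothetical helper written only for this illustration, and the sample name is a shortened stand-in, not the full title:

```python
def to_math_monospace(text: str) -> str:
    """Map ASCII capitals to Unicode mathematical monospace capitals (U+1D670...)."""
    return "".join(
        chr(0x1D670 + ord(c) - ord("A")) if "A" <= c <= "Z" else c
        for c in text
    )


# Shortened, illustrative file name (assumption: not the real title).
name = to_math_monospace("OPENING") + ".jpg"

char_count = len(name)                  # number of code points
byte_count = len(name.encode("utf-8"))  # what the file system limit applies to

print(char_count, byte_count)  # → 11 32
```

Seven stylised letters cost 28 bytes, while the four ASCII characters of the extension cost one byte each, so a name that looks short can still exceed a byte limit.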
As a quick workaround, I used a PostProcessor:
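import gallery_dl.postprocessor.common
import gallery_dl.util

class FixFileNamePostProcessor(gallery_dl.postprocessor.common.PostProcessor):
    def prepare(self, pathfmt: gallery_dl.util.PathFormat):
        """Updates file path"""
        # Wrap the path cleaner so every segment gets shortened, then rebuild the path.
        pathfmt.clean_path = FixFileNameFormatterWrapper(pathfmt.clean_path)
        pathfmt.build_path()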
That uses this wrapper:
from pathlib import Path

class FixFileNameFormatterWrapper:
    """Wraps file name formatter for ensuring a valid file name length"""

    def __init__(self, formatter: gallery_dl.util.Formatter):
        self.formatter = formatter

    def __call__(self, *args, **kwargs) -> str:
        path = self.formatter(*args, **kwargs)
        # Shorten each path segment independently, then rejoin them.
        parts = list(map(fix_filename_length, Path(path).parts))
        return str(Path(*parts))
That uses this function:
from pathlib import Path

def fix_filename_length(filename: str) -> str:
    """Ensures a segment has a valid file name length"""
    # 240 bytes leaves some headroom below the 255-byte limit.
    if len(filename.encode()) > 240:
        extension = Path(filename).suffix
        extension_bytes_length = len(extension.encode())
        stem_bytes = Path(filename).stem.encode()
        fixed_stem_bytes = stem_bytes[:240 - extension_bytes_length]
        # errors="ignore" drops a multi-byte character cut in half by the slice.
        fixed_stem = fixed_stem_bytes.decode(errors="ignore")
        return fixed_stem + extension
    return filename
It would be nice if the maximum path length were also observed (PATH_MAX is 4096 on Linux; MAX_PATH is 260 on Windows, which can only be lifted since Windows 10's 2016 update, and only by changing a registry entry), but that's not an issue for me right now.
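Rather than hard-coding these numbers, the limits can be queried per file system on Unix-like systems. The following is only a sketch: `path_limits` and `fits` are hypothetical helper names, nothing here is part of gallery-dl, and `os.pathconf` is not available on Windows.

```python
import os


def path_limits(directory="."):
    """Return (NAME_MAX, PATH_MAX) in bytes for the file system holding `directory`."""
    name_max = os.pathconf(directory, "PC_NAME_MAX")
    path_max = os.pathconf(directory, "PC_PATH_MAX")
    return name_max, path_max


def fits(path, directory="."):
    """Check a path against both limits, measured in encoded bytes."""
    name_max, path_max = path_limits(directory)
    raw = os.fsencode(path)
    # PATH_MAX counts the terminating NUL byte, hence the strict comparison.
    if len(raw) >= path_max:
        return False
    # Every single path segment must also respect the file name limit.
    return all(len(part) <= name_max for part in raw.split(os.fsencode(os.sep)))


print(path_limits())
print(fits("short.jpg"))
```

Measuring `os.fsencode(path)` instead of `len(path)` keeps the check in bytes, which is the unit the kernel actually enforces.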
I have the same issue. Would be great to see this implemented.
I got it working by adding this to the config (trimming the title):
"filename": {
    "": "{title[:40]}_{subreddit}_{id}.{extension}"
},
This should be a feature. The following works for youtube-dl, for example (though I couldn't find it in the docs):
%(title).150s: Truncates to 150 symbols.
%(title).150B: Truncates to 150 bytes.
{filename[:150]} works in gallery-dl so I can't imagine why something like {filename[:150B]} should not work.
There is this explanation in https://github.com/mikf/gallery-dl/issues/873#issuecomment-656366953
There is no good general solution for the "filename length problem", which is why I haven't really tried to implement something.
But we have the [:150] symbol limiter regardless, which is not a general solution.
Looks like Linux is far from getting support for longer filenames. So for now software itself should take care of it.
| | Example | Result |
|---|---|---|
| Slicing (Bytes) | {title_ja[b3:18]} | ロー・ワー |
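To see what a byte slice like `[b3:18]` amounts to, here is a rough Python emulation. It is a sketch of the effect, not gallery-dl's actual implementation; `slice_bytes` is a hypothetical helper, and the value of `title_ja` is an assumed sample that is not given in this thread.

```python
def slice_bytes(text: str, start: int, stop: int, encoding: str = "utf-8") -> str:
    """Slice the encoded bytes of `text`, dropping any character cut in half."""
    return text.encode(encoding)[start:stop].decode(encoding, errors="ignore")


title_ja = "ハロー・ワールド"  # assumed sample value, 3 bytes per character in UTF-8

print(slice_bytes(title_ja, 3, 18))  # → ロー・ワー
```

Because each of these characters takes 3 bytes, bytes 3 to 18 cover the second through sixth characters. A format like `{title[b:200]}` would therefore bound the title in bytes rather than characters, which is what the 255-byte limit needs.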
Oh, thanks. Also found it here https://github.com/mikf/gallery-dl/discussions/4087#discussioncomment-5977221
It's probably safe to close this issue.