RedditImageGrab
Re-Naming downloaded files
After successfully downloading a bunch of images, I noticed that the file names are altered by imgur and are just random values. Is there any possibility of renaming the files to something meaningful, such as the Reddit post's title? This would save time and make the files easier to organize. Please see if this feature can be added to this awesome piece of code.
The filename is actually the reddit thread ID, with an additional modification when an album is found instead of a single link. If you want the filename to be taken from the reddit thread title, that will be more difficult because of non-alphabetic characters. Another alternative is to let the user decide the filename format.
I second the original suggestion of having filenames pulled from reddit thread titles. I understand this is difficult, but is it possible?
It's very easy to add the Reddit post title to the file name. You just need to add IDENTIFIER (the post title) to the file's name.
For example, in my version of the script I have changed
FILENAME = '%s%s%s' % (ITEM['id'], FILENUM, FILEEXT)
to
FILENAME = '%s%s%s%s%s' % (ITEM['id'], ' - ', IDENTIFIER, FILENUM, FILEEXT)
Note that the script will crash if you pull anything with non-ASCII characters in the title. This can be avoided by using Unicode, which is not terribly difficult, but a bit more work to implement.
Also, in case you're unaware, the thread id it saves as the file name can be used to visit the actual post by adding the id after reddit.com/
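As a small illustration of the thread-ID tip above, here is a hedged sketch that recovers the post URL from a saved file name. It assumes the modified `'<id> - <title><num><ext>'` naming shown earlier in this comment; the function name `post_url_from_filename` is hypothetical and not part of the script.

```python
import os


def post_url_from_filename(filename):
    """Build a reddit.com link from a file named '<id> - <title><num><ext>'.

    Hypothetical helper: assumes the thread ID is everything before
    the first ' - ' separator in the file name.
    """
    # Drop any directory part and the file extension.
    stem = os.path.splitext(os.path.basename(filename))[0]
    # The thread ID is the part before the first ' - '.
    thread_id = stem.split(' - ', 1)[0]
    return 'https://reddit.com/' + thread_id
```

With the script's original naming (ID followed directly by the file number), the ID would have to be separated from the number some other way, which this sketch does not attempt.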
Whoops, forgot one other line. (I haven't looked at this in a while.)
I also changed the following to create the IDENTIFIER and remove some characters that aren't allowed in file names.
for ITEM in ITEMS:
    TOTAL += 1
to
for ITEM in ITEMS:
    TOTAL += 1
    IDENTIFIER = ITEM['title'].replace('/', '\'').replace('"', '\'').replace('*', '\'').replace(':', '-').replace('?', '\'').replace('|', '-').replace('\\', '\'').replace('>','\'').replace('<','\'').replace('\n','-').replace('\t','-')
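The chain of `.replace()` calls above could also be written with a single regular expression. The following is only a sketch, not what the script does: the helper name `title_to_identifier` and the `ascii_only` flag are assumptions, and for simplicity every unsafe character becomes `-` rather than the mix of `'` and `-` used above. The optional ASCII folding is one possible way around the non-ASCII crash mentioned earlier.

```python
import re
import unicodedata

# Characters not allowed in Windows file names, plus newline and tab.
_UNSAFE = re.compile(r'[/\\:*?"<>|\n\t]')


def title_to_identifier(title, ascii_only=False):
    """Turn a post title into a string that is safe to use in a file name."""
    if ascii_only:
        # Decompose accented letters, then drop anything outside ASCII.
        title = unicodedata.normalize('NFKD', title)
        title = title.encode('ascii', 'ignore').decode('ascii')
    # Replace every unsafe character with a dash and trim the ends.
    return _UNSAFE.sub('-', title).strip()
```

Note that `ascii_only=True` silently discards characters with no ASCII equivalent, so titles written entirely in a non-Latin script would come out empty.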
I have already added the option. It also limits the total filename length to less than 256 characters.
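For readers patching their own copy, a length limit like the one described could look roughly like this. This is a sketch under assumptions, not the code from the fork: the function name `build_filename` and its parameters are hypothetical, and it counts characters rather than bytes, so titles with multi-byte characters may still exceed a byte-based file system limit.

```python
def build_filename(thread_id, identifier, filenum, fileext, max_len=255):
    """Join the parts into '<id> - <title><num><ext>', trimming only the title."""
    # Length taken by everything except the title.
    fixed = len(thread_id) + len(' - ') + len(filenum) + len(fileext)
    # Whatever is left is the room available for the title.
    room = max(max_len - fixed, 0)
    return '%s - %s%s%s' % (thread_id, identifier[:room], filenum, fileext)
```

Trimming only the title keeps the thread ID, the album index, and the extension intact, so truncated names stay unique and still open correctly.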
Thanks everyone for the info, much appreciated. :)
(In reply by email to rachmadani haryono's comment of Wed, Aug 5, 2015 at 9:05 PM: https://github.com/HoverHell/RedditImageGrab/issues/27#issuecomment-128222986)
Just created an account to post this. This was driving me crazy.
This makes downloaded files retain their original filename.
#FILENUM (comment out FILENUM's stuff)
filename_with_no_extension_from_url = pathsplitext(URL)[0]
filename_with_no_extension_from_url = filename_with_no_extension_from_url.split('/')[-1].split('.')[0]
FILENAME = '%s%s' % (filename_with_no_extension_from_url, FILEEXT)
Please forgive my lack of coding etiquette. I'm not a coder. I know next to nothing about Python.
BE WARNED: if an album is present, this removes the incremental numbering, so you'll have to rely on the download timestamp to sort the files locally.
Do the developers see this stuff? I'm unsure. Thanks for the original script if you're watching.
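The original-filename change in this comment can also be sketched with the standard library's URL parsing, which additionally ignores query strings such as `?1`. The helper name `name_from_url` is hypothetical, and unlike the snippet above it cuts at the last dot rather than the first, so `a.b.jpg` yields `a.b` instead of `a`.

```python
import posixpath
from urllib.parse import urlsplit


def name_from_url(url):
    """Return the last path component of a URL without its extension."""
    # Keep only the path, dropping the scheme, host, query and fragment.
    path = urlsplit(url).path
    # Take the final component, e.g. 'abc123.jpg'.
    base = posixpath.basename(path)
    # Strip the extension after the last dot.
    return posixpath.splitext(base)[0]
```

The result would then be combined with `FILEEXT` as in the comment's last line of code.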
Actually, the behavior is as expected. When the user wants to use only the name from the URL, the images of an album will not have an index.
I'm not sure if @HoverHell is still working on this. You can use another fork for this.
I put up a fork of RedditImageGrab. Maybe you or @HoverHell or others might see something a little useful in what I've done. You said I can use the "other fork", is that considered the one @rachmadaniHaryono has?
Not necessarily mine. Others have their own forks, so you can choose whichever one meets your needs.