KToolBox
Bug: Certain images are not downloaded, and no error or warning is shown
Bug:
Certain images are not downloaded, and no error or warning is shown.
Example artist and the specific posts affected:
https://kemono.su/patreon/user/78028/post/38430890 /38525942 /43011916 /45239389 /45750902 /46390688 /46598011 /47742783 /47817522 /70093214 /77353068 /80032407 /86518011 /98257058
The download produced the following output:
2025-01-25 22:40:24 | INFO | ktoolbox.cli - Configuration(api=APIConfiguration(scheme='https', netloc='kemono.su', statics_netloc='img.kemono.su', files_netloc='kemono.su', path='/api/v1', timeout=5.0, retry_times=3, retry_interval=2.0), downloader=DownloaderConfiguration(scheme='https', timeout=30.0, encoding='utf-8', buffer_size=20480, chunk_size=1024, temp_suffix='tmp', retry_times=10, retry_stop_never=False, retry_interval=3.0, use_bucket=False, bucket_path=WindowsPath('.ktoolbox/bucket_storage')), job=JobConfiguration(count=2, post_dirname_format='{id} {title} {published} {user} {service}', post_structure=PostStructureConfiguration(attachments=WindowsPath('.'), content_filepath=WindowsPath('content.txt')), mix_posts=False, sequential_filename=True, filename_format='[{title}]_{}', allow_list=set(), block_list=set()), logger=LoggerConfiguration(path=None, level='DEBUG', rotation='1 week'), ssl_verify=True, json_dump_indent=4, use_uvloop=True)
Command:
ktoolbox download-post
ktoolbox sync-creator
Both give the same result.
Configuration:
KTOOLBOX_JOB__POST_DIRNAME_FORMAT={id} {title} {published} {user} {service}
KTOOLBOX_JOB__POST_STRUCTURE__ATTACHMENTS=./
KTOOLBOX_JOB__COUNT=2
KTOOLBOX_JOB__SEQUENTIAL_FILENAME=True
KTOOLBOX_JOB__FILENAME_FORMAT=[{title}]_{}
KTOOLBOX_DOWNLOADER__RETRY_TIMES=10
Platform:
- OS: Windows
- Python Version: 3.11.3
- KToolBox Version: 0.12.0
This situation is somewhat special. The images in these posts are embedded in the content HTML; they are not listed in the attachments and are not presented as a file. The content itself is saved directly as a file, named content.txt by default.
API Return:
{
    "id": "38430890",
    "user": "78028",
    "service": "patreon",
    "title": "Im going to make this twilight sparkle fanart Do i make her goth?",
    "content": "<p><img src=\"/e9/a1/e9a19cc0479e06634fa3f09a36be87ffe5acc190b031e7a681e3e19a579e61e0.png\"></p><p><br></p>",
    "embed": {},
    "shared_file": false,
    "added": "2021-08-17T02:39:38.199603",
    "published": "2020-06-20T09:57:29",
    "edited": "2020-06-20T09:57:29",
    "file": {
        "name": null,
        "path": null
    },
    "attachments": []
}
Content:
<p><img src=\"/e9/a1/e9a19cc0479e06634fa3f09a36be87ffe5acc190b031e7a681e3e19a579e61e0.png\"></p><p><br></p>
Currently, parsing the content and downloading the images within it is not supported.
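For anyone who wants to locate these images by hand in the meantime, here is a minimal sketch of pulling the `<img>` paths out of a post's content field using only the Python standard library. It is not part of KToolBox; `ImgSrcCollector` and `extract_content_images` are hypothetical names made up for this example. It returns the `src` values exactly as they appear in the content, since which host those relative paths should be requested from is not stated in this thread.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Hypothetical helper: records the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_content_images(content):
    """Return the src values of all <img> tags in a post's content HTML."""
    parser = ImgSrcCollector()
    parser.feed(content or "")
    parser.close()
    return parser.sources


if __name__ == "__main__":
    # Content string taken from the API return shown above.
    content = (
        '<p><img src="/e9/a1/'
        'e9a19cc0479e06634fa3f09a36be87ffe5acc190b031e7a681e3e19a579e61e0.png">'
        "</p><p><br></p>"
    )
    for src in extract_content_images(content):
        print(src)
```

The same function can be run over a saved content.txt by reading the file and passing its text in.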
It would be really helpful if a warning were shown whenever files like these are encountered and cannot be downloaded. Could this be done? I hope so.
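To illustrate the kind of warning being requested, here is a rough sketch of a check that could run on each post's API data. It is an assumption about how such a check might look, not KToolBox's actual behavior; `warn_on_inline_images` and the logger name are hypothetical. It counts `<img>` tags in the content field and logs a warning when any are found, because those images are not in attachments or file.

```python
import logging
from html.parser import HTMLParser

# Hypothetical logger name, chosen only for this example.
logger = logging.getLogger("inline_image_check")


class _ImgCounter(HTMLParser):
    """Counts <img> tags in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.count += 1


def warn_on_inline_images(post):
    """Log a warning if the post's content embeds images; return the count."""
    parser = _ImgCounter()
    # "content" may be missing or null in the API data, so fall back to "".
    parser.feed(post.get("content") or "")
    parser.close()
    if parser.count:
        logger.warning(
            "Post %s has %d inline image(s) in its content that will not be downloaded",
            post.get("id"),
            parser.count,
        )
    return parser.count


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # Trimmed version of the API return shown earlier in this thread.
    post = {
        "id": "38430890",
        "content": '<p><img src="/e9/a1/'
        'e9a19cc0479e06634fa3f09a36be87ffe5acc190b031e7a681e3e19a579e61e0.png">'
        "</p><p><br></p>",
        "attachments": [],
    }
    warn_on_inline_images(post)
```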
https://github.com/Ljzd-PRO/KToolBox/releases/tag/v0.17.0