Bug: Certain images are not downloaded, and no error or warning is shown

Open TheThirdComputer opened this issue 10 months ago • 3 comments

Bug:

Certain images are not downloaded, and no error or warning is shown.

Example artist and the specific affected posts:

https://kemono.su/patreon/user/78028/post/38430890 /38525942 /43011916 /45239389 /45750902 /46390688 /46598011 /47742783 /47817522 /70093214 /77353068 /80032407 /86518011 /98257058

The download produces the following output:

2025-01-25 22:40:24 | INFO | ktoolbox.cli - Configuration(api=APIConfiguration(scheme='https', netloc='kemono.su', statics_netloc='img.kemono.su', files_netloc='kemono.su', path='/api/v1', timeout=5.0, retry_times=3, retry_interval=2.0), downloader=DownloaderConfiguration(scheme='https', timeout=30.0, encoding='utf-8', buffer_size=20480, chunk_size=1024, temp_suffix='tmp', retry_times=10, retry_stop_never=False, retry_interval=3.0, use_bucket=False, bucket_path=WindowsPath('.ktoolbox/bucket_storage')), job=JobConfiguration(count=2, post_dirname_format='{id} {title} {published} {user} {service}', post_structure=PostStructureConfiguration(attachments=WindowsPath('.'), content_filepath=WindowsPath('content.txt')), mix_posts=False, sequential_filename=True, filename_format='[{title}]_{}', allow_list=set(), block_list=set()), logger=LoggerConfiguration(path=None, level='DEBUG', rotation='1 week'), ssl_verify=True, json_dump_indent=4, use_uvloop=True)

Command:

ktoolbox download-post
ktoolbox sync-creator

Both give the same result.

Configuration:

KTOOLBOX_JOB__POST_DIRNAME_FORMAT={id} {title} {published} {user} {service}
KTOOLBOX_JOB__POST_STRUCTURE__ATTACHMENTS=./
KTOOLBOX_JOB__COUNT=2
KTOOLBOX_JOB__SEQUENTIAL_FILENAME=True
KTOOLBOX_JOB__FILENAME_FORMAT=[{title}]_{}
KTOOLBOX_DOWNLOADER__RETRY_TIMES=10

Platform:
  • OS: Windows
  • Python Version 3.11.3
  • KToolBox Version 0.12.0

TheThirdComputer avatar Jan 26 '25 01:01 TheThirdComputer

This situation is somewhat special. The images in these posts are embedded in the content HTML; they are not listed in attachments and are not presented as the post's file. The content itself is saved directly to a file, named content.txt by default.

API Return:

{
    "id": "38430890",
    "user": "78028",
    "service": "patreon",
    "title": "Im going to make this twilight sparkle fanart Do i make her goth?",
    "content": "<p><img src=\"/e9/a1/e9a19cc0479e06634fa3f09a36be87ffe5acc190b031e7a681e3e19a579e61e0.png\"></p><p><br></p>",
    "embed": {},
    "shared_file": false,
    "added": "2021-08-17T02:39:38.199603",
    "published": "2020-06-20T09:57:29",
    "edited": "2020-06-20T09:57:29",
    "file": {
        "name": null,
        "path": null
    },
    "attachments": []
}

Content:

<p><img src=\"/e9/a1/e9a19cc0479e06634fa3f09a36be87ffe5acc190b031e7a681e3e19a579e61e0.png\"></p><p><br></p>
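To illustrate what "part of the content HTML" means in practice, here is a minimal sketch of pulling image paths out of a post's content field using only the Python standard library. This is not KToolBox code: `ImgSrcCollector` and `inline_image_urls` are hypothetical names, and the base URL `https://kemono.su` is an assumption taken from `files_netloc` in the logged configuration, so the resulting URL may not be where the file is actually served.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in the parsed HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names are already lowercased.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def inline_image_urls(content, base_url):
    """Return absolute URLs for all <img> tags found in a post's content HTML."""
    collector = ImgSrcCollector()
    collector.feed(content or "")  # content may be None or empty
    collector.close()
    return [urljoin(base_url, src) for src in collector.sources]


# The content value from the API return above, with JSON escaping removed.
content = (
    '<p><img src="/e9/a1/'
    'e9a19cc0479e06634fa3f09a36be87ffe5acc190b031e7a681e3e19a579e61e0.png">'
    "</p><p><br></p>"
)

# Assumption: the host is a guess based on files_netloc, not a confirmed endpoint.
print(inline_image_urls(content, "https://kemono.su"))
```

An HTML parser is used here rather than string matching so that attribute order and quoting style in the content do not matter.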

Ljzd-PRO avatar Feb 03 '25 07:02 Ljzd-PRO

Currently, parsing the content and downloading the images within it is not supported.

Ljzd-PRO avatar Feb 03 '25 07:02 Ljzd-PRO

> Currently, parsing the content and downloading the images within it is not supported.

It would be really helpful if a warning were shown when files like these are encountered and cannot be downloaded. Could this be done? I hope so.
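Roughly, the kind of check I have in mind could look like the sketch below. It is only an illustration, not KToolBox's actual code: `warn_about_inline_images` and the logger name are hypothetical, the post is assumed to be a plain dict shaped like the API return quoted earlier in this thread, and the `<img` regex is a deliberately crude stand-in for real HTML parsing.

```python
import logging
import re

# Hypothetical logger name, chosen only for this example.
logger = logging.getLogger("inline_image_check")

# Crude detection: counts opening <img tags without parsing the HTML.
IMG_TAG = re.compile(r"<img\b", re.IGNORECASE)


def warn_about_inline_images(post):
    """Log a warning if the post's content embeds images; return the count."""
    count = len(IMG_TAG.findall(post.get("content") or ""))
    if count:
        logger.warning(
            "Post %s has %d inline image(s) in its content that will not be downloaded",
            post.get("id"),
            count,
        )
    return count


# A post shaped like the API return shown earlier (fields trimmed for brevity).
post = {
    "id": "38430890",
    "content": '<p><img src="/e9/a1/e9a19cc0479e06634fa3f09a36be87ffe5acc190b031e7a681e3e19a579e61e0.png"></p><p><br></p>',
    "file": {"name": None, "path": None},
    "attachments": [],
}

logging.basicConfig(level=logging.WARNING)
warn_about_inline_images(post)
```

Even if downloading these images stays unsupported, a message like this would at least tell the user which posts need to be checked by hand.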

TheThirdComputer avatar Feb 11 '25 08:02 TheThirdComputer

https://github.com/Ljzd-PRO/KToolBox/releases/tag/v0.17.0

Ljzd-PRO avatar Jul 31 '25 15:07 Ljzd-PRO