download_session fails with ctx.execute_query_retry

Open beliaev-maksim opened this issue 3 years ago • 5 comments

To reproduce: set up a download session, execute it with ctx.execute_query_retry, and disconnect your internet cable.

Error: ('-1, Microsoft.SharePoint.Client.InvalidClientQueryException', 'The HTTP POST method cannot be used for media resource. The media resource could only be accessed by HTTP GET or updated by HTTP PUT.', "400 Client Error: Bad Request for url: https://company.sharepoint.com/sites/BetaTest/_api/Web/getFileByServerRelativeUrl('%2Fsites%2FBetaTest%2FShared%20Documents%2Fwinx64%2FLicenseManager%2Fv221%2F20210909_1613%2Fv221_LicenseManager.zip')/$value")

You will see that the HTTP method for the file download, which should be GET, is changed to POST, and that is why it fails.

snippet used to reproduce:

                progress_func = lambda offset: self.print_download_progress(offset, total_size=file_size)
                remote_file.download_session(zip_file, progress_func, chunk_size=1*1024*1024)
                self.ctx.execute_query_retry(
                    max_retry=100,
                    exceptions=(
                        ClientRequestException,
                        HTTPError,
                        RequestException,
                        ConnectionError,
                        ConnectionResetError,
                    ),
                    failure_callback=lambda att, exc: logging.error(f"Attempt {att}. Error: {exc}")
                )

beliaev-maksim avatar Sep 10 '21 12:09 beliaev-maksim

I'm getting a similar error, I think possibly due to threaded usage of the connection. Is there a good way to prevent this?

@login_again_if_fails
def download_file(filepath, ctx=None):
    ctx = check_login(ctx)
    def print_download_progress(offset):
        print("Downloaded '{}' bytes... of: {}".format(offset, filepath))
    source_file = ctx.web.get_file_by_server_relative_url(filepath)
    file_metadata = source_file.expand(["versions", "listItemAllFields", "Author"]).get().execute_query()
    # file_item_fields = source_file.listItemAllFields
    # file_item = file_item_fields.select(["EffectiveBasePermissions"]).get().execute_query()  # type: ListItem
    compound_version_string = f'{file_metadata.major_version}.{file_metadata.minor_version}'
    modification_author_info = source_file.modified_by.get().execute_query()
    modification_author_name = modification_author_info.properties['Title']
    local_file = io.BytesIO()
    source_file.download_session(local_file, print_download_progress).execute_query()
    return local_file, compound_version_string

and the error:

  File "/home/vcap/app/data_loaders/o365.py", line 133, in download_file
    file_metadata = source_file.expand(["versions", "listItemAllFields", "Author"]).get().execute_query()
  File "/home/vcap/deps/0/python/lib/python3.10/site-packages/office365/runtime/client_object.py", line 52, in execute_query
    self.context.execute_query()
  File "/home/vcap/deps/0/python/lib/python3.10/site-packages/office365/runtime/client_runtime_context.py", line 142, in execute_query
    self.pending_request().execute_query()
  File "/home/vcap/deps/0/python/lib/python3.10/site-packages/office365/runtime/client_request.py", line 86, in execute_query
    raise ClientRequestException(*e.args, response=e.response)
office365.runtime.client_request_exception.ClientRequestException: ('-1, Microsoft.SharePoint.Client.InvalidClientQueryException', 'The HTTP POST method cannot be used for media resource. The media resource could only be accessed by HTTP GET or updated by HTTP PUT.', "400 Client Error: Bad Request for url: https://mycompany.sharepoint.com/sites/my_area/_api/Web/GetFileById('myFileID_obscured')/$value")

nmz787-intel avatar Aug 15 '22 18:08 nmz787-intel

I think it's due to https://github.com/vgrem/Office365-REST-Python-Client/blob/1bebf978ae8920823e8b3948ff20d2ad6448c69b/office365/onedrive/driveitems/driveItem.py#L208 only being called once, as the beforeExecute callback that changes the request method. Once the request is retried, that callback has already been removed, and https://github.com/vgrem/Office365-REST-Python-Client/blob/1bebf978ae8920823e8b3948ff20d2ad6448c69b/office365/runtime/odata/request.py#L43 defaults the method to POST for ServiceOperationQuery.
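The diagnosis above can be illustrated with a toy model. This is not the library's code: `ToyRequest`, `ToyContext`, `before_execute_once`, and `force_get` are hypothetical names, and the only behavior modeled is "a one-shot hook switches the method to GET, and every rebuilt request starts out as POST".

```python
class ToyRequest:
    """Stand-in for a queued request (hypothetical, not the library's class)."""

    def __init__(self, url):
        self.url = url
        self.method = "POST"  # assumed default, as described for ServiceOperationQuery


def force_get(request):
    """Hook that switches a download request to GET."""
    request.method = "GET"


class ToyContext:
    def __init__(self):
        self._one_shot_hooks = []
        self.sent_methods = []  # HTTP method used on each attempt

    def before_execute_once(self, hook):
        self._one_shot_hooks.append(hook)

    def _build_and_send(self, url):
        request = ToyRequest(url)
        # Hooks are consumed here: they run for this attempt only.
        hooks, self._one_shot_hooks = self._one_shot_hooks, []
        for hook in hooks:
            hook(request)
        self.sent_methods.append(request.method)

    def execute_with_retry(self, url, failures=1, max_retry=3):
        for attempt in range(max_retry):
            self._build_and_send(url)
            if attempt >= failures:
                return  # this attempt is treated as successful
            # otherwise: simulated network drop, loop and retry


ctx = ToyContext()
ctx.before_execute_once(force_get)
ctx.execute_with_retry("/file/$value", failures=1)
print(ctx.sent_methods)  # ['GET', 'POST']
```

In this model the first attempt goes out as GET, and the retry, built after the hook was consumed, falls back to POST, which matches the error reported in this thread.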

AllexVeldman avatar Mar 15 '23 14:03 AllexVeldman

Is there any way to get the remote file size, so that I can keep retrying the download until 100% of it is obtained locally? I seem to get corrupted file downloads every week or three.
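One possible shape for such a loop, as a minimal sketch: `download_until_complete` and its `fetch` parameter are hypothetical, and `expected_size` is assumed to have been read beforehand (a later comment in this thread reads `myfile.length` followed by `ctx.execute_query()`). A matching byte count does not prove the content is intact; comparing a checksum would be a stronger check where one is available.

```python
import io


def download_until_complete(fetch, expected_size, max_attempts=5,
                            retry_on=(Exception,)):
    """Call fetch(buffer) until the buffer holds exactly expected_size bytes.

    fetch is a hypothetical callable that writes the remote content into
    the file-like object it receives.
    Returns (buffer, attempt_number) on success.
    """
    for attempt in range(1, max_attempts + 1):
        buffer = io.BytesIO()  # fresh buffer, so partial data never accumulates
        try:
            fetch(buffer)
        except retry_on:
            continue  # failed transfer: try again with a new buffer
        size = buffer.seek(0, io.SEEK_END)  # seek returns the new position
        if size == expected_size:
            buffer.seek(0)
            return buffer, attempt
    raise IOError(
        f"no complete download of {expected_size} bytes "
        f"after {max_attempts} attempts"
    )
```

With the client from this thread, `fetch` could be something like `lambda buf: remote_file.download_session(buf, progress_func).execute_query()`, so that a new download session is queued on every attempt.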

nmz787-intel avatar Jun 23 '23 17:06 nmz787-intel

@nmz787-intel https://github.com/ansys-internal/automatic-installer/blob/071a993e548776c58093a2da702b5b6ea8527304/downloader_backend.py#L693

beliaev-maksim avatar Jun 24 '23 13:06 beliaev-maksim

This issue still happens.

        file_size = myfile.length
        ctx.execute_query()
        def progress_func(offset):
            print(f"Downloaded '{offset}' bytes... of '{file_size}' bytes.")

        myfile.download_session(local_file, progress_func, chunk_size=1 * 1024 * 1024)
        ctx.execute_query()

results in office365.runtime.client_request_exception.ClientRequestException: ('-1, Microsoft.SharePoint.Client.InvalidClientQueryException', 'The HTTP POST method cannot be used for media resource. The media resource could only be accessed by HTTP GET or updated by HTTP PUT.', "400 Client Error: Bad Request for url: https://google0.sharepoint.com/sites/MySite/_api/Web/GetFileById('3dsc21D2-4986-907s-8460-8dsasce2145f4')/$value")

a filthy fix:

        try:
            myfile.download_session(local_file, progress_func, chunk_size=1 * 1024 * 1024).execute_query()
        except Exception:
            # Queue a brand-new download session and try once more.
            myfile.download_session(local_file, progress_func, chunk_size=1 * 1024 * 1024).execute_query()

only solution I've found so far
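The workaround above can be generalized into a small helper. This is a sketch under assumptions: `download_with_requeue` and its parameters are hypothetical, and only the `download_session(...).execute_query()` call shape is taken from the snippets in this thread. It calls `download_session` again on every attempt instead of retrying an already queued request, and it resets the local file first so that a partial write from a failed attempt is not kept.

```python
import time


def download_with_requeue(remote_file, local_file, attempts=3, delay=0.0,
                          chunk_size=1024 * 1024, progress=lambda offset: None):
    """Download remote_file into local_file, queuing a new session per attempt.

    Returns the number of the attempt that succeeded and re-raises the
    last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        # Drop anything a previous failed attempt may have written.
        local_file.seek(0)
        local_file.truncate()
        try:
            remote_file.download_session(
                local_file, progress, chunk_size=chunk_size
            ).execute_query()
            return attempt
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

A call might look like `download_with_requeue(myfile, local_file, attempts=5, delay=2.0)`. Catching a narrower exception type than `Exception` is preferable once the failure modes are known.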

C-monC avatar Aug 11 '23 11:08 C-monC