xls2xlsx icon indicating copy to clipboard operation
xls2xlsx copied to clipboard

Doesn't seem to work with bigger files in an Azure runbook

Open DutchHarry opened this issue 2 years ago • 0 comments

  • xls2xlsx version: 0.2.0
  • Python version: 3.10
  • Operating System: Azure Runbook

Description

converting .xls to xlsx. script (below) works with for say 700k .xls files running in An Azure runbook and running in a local python window. And for a 10M size .xls it works on local python, but fails in an Azure Runbook.

What I Did

#!/usr/bin/env python3

from azure.storage.blob import BlobServiceClient import xlrd import openpyxl import io from xls2xlsx import XLS2XLSX

connection_string = "a" container_name = "b"

blob_service_client = BlobServiceClient.from_connection_string(connection_string) container_client = blob_service_client.get_container_client(container_name)

for blob in container_client.list_blobs(name_starts_with = "20"): if ".xlsx" not in blob.name: if ".xls" in blob.name: blob_client = container_client.get_blob_client(blob.name) excel_bytes = blob_client.download_blob().readall() excel_file = io.BytesIO(excel_bytes) x2x = XLS2XLSX(excel_file) wb = x2x.to_xlsx() xlsx_bytes = io.BytesIO() wb.save(xlsx_bytes) xlsx_bytes.seek(0) new_blob_name = blob.name+'x' new_blob_client = container_client.get_blob_client(new_blob_name) new_blob_client.upload_blob(xlsx_bytes, overwrite=True)

delete the original .xls

blob_client.delete_blob()

DutchHarry avatar Nov 29 '23 17:11 DutchHarry