python-dependency-injector

Degraded performance on large data set manipulation application

Open titouanfreville opened this issue 1 year ago • 0 comments

Hello, I have been using dependency injector as the basis for my Python projects for some time now, and I recently ran into an unexpected issue.

I am currently building data analysis software to analyse a large data set (~3 GB, about 20 million rows), and the process takes an unexpectedly long time to run and uses more resources than expected.

As a baseline, just loading the data takes ~3 minutes without dependency injection, while it is still not finished after 20 minutes with it.
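
A minimal sketch of how the two code paths could be timed side by side, using only the standard library. The names `time_call`, `load_direct`, and `load_injected` are hypothetical stand-ins and not the project's real code; the real loaders would be the same query called directly and called through the container.

```python
import time
from typing import Any, Callable, Tuple


def time_call(func: Callable[[], Any]) -> Tuple[Any, float]:
    """Run func once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    return result, elapsed


# Hypothetical stand-ins (assumption): replace these with the real loader
# called directly and the same loader resolved through dependency injection.
def load_direct() -> list:
    return list(range(1000))


def load_injected() -> list:
    return list(range(1000))


if __name__ == "__main__":
    for name, loader in [("direct", load_direct), ("injected", load_injected)]:
        rows, seconds = time_call(loader)
        print(f"{name}: {len(rows)} rows in {seconds:.3f}s")
```

Running both loaders against the same small subset of the data should show whether the gap comes from the injected call path itself or from something else that differs between the two setups.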

I mainly use singleton containers and set up the base project with the wiring system.

The tests ran on Python 3.12 in a Microsoft dev container (mcr.microsoft.com/vscode/devcontainers/python:1-3.12) and on a Windows server running Python 3.12 (I don't have the exact version, but I can ask for it if needed).

I load the data using SQLAlchemy with the pyodbc driver and the pandas `read_sql` method.

I cannot provide the dataset I'm using, as it is private to the company I work for.

The application is wrapped in a Typer client application using async methods (though parallelization is not done correctly yet, as I'm new to it :innocent:).

Any feedback or ideas are welcome, as I don't really see why using DI would impact the code so much in this case.

Thanks for your work and time. <3

titouanfreville, May 25 '24 20:05