xdem

Add profiling module

Open adebardo opened this issue 6 months ago • 9 comments

Context

The goal of this ticket is to integrate a profiling module into xDEM in order to better monitor functions that are costly in terms of memory and execution time.

To achieve this, we will draw inspiration from the profiling tool implemented in Pandora (source here). However, several adaptations are needed to tailor the profiler to the xDEM context.

Tasks

  • [ ] Replace enable_from_config with a simplified method that does not require json_checker. The new API should look like: enable(save_graphs=True, save_raw_data=False)
  • [ ] Add unit tests to ensure the profiler behaves as expected
  • [ ] Add documentation for the new profiler module and usage guidelines
  • [ ] Add profiling to a first set of functions, such as reprojection, subsample, and interpolation

/estimate 5d

adebardo avatar Jun 18 '25 07:06 adebardo

Thanks @adebardo, this would be an essential addition indeed! 🙂

My main feedback at this stage:

If I understand correctly after looking at Pandora's code, we would rely primarily on psutil? It would be nice to explain this in the issue description directly: what new tools would be used, how (what behaviour/output we should expect in GeoUtils/xDEM) and why pick this solution.

I'm not an expert in profiling, but I have used or heard of several other widely used tools (each with 10,000+ stars on GitHub and a massive user base), such as Python's built-in cProfile, memray for memory profiling, scalene for memory+CPU profiling, py-spy for low-overhead CPU profiling, etc.

Given this landscape of tools, this raises the questions: Is there a reason to use psutil specifically? What are the advantages/drawbacks? I think we need much more detail to fully understand the change ahead and learn from your experience on Pandora here :wink:

rhugonnet avatar Jun 18 '25 18:06 rhugonnet

Our idea here would be to implement profiling in production, with a global and easy-to-use system in case we need more detailed insights into functions. The tools you mentioned are more external and complex (we've already tried Memray, and it wasn't very conclusive for our needs). The metrics returned by psutil are more than sufficient for our requirements.

adebardo avatar Jun 19 '25 08:06 adebardo

We tried using Scalene to profile Blockwise, but the results weren't relevant, and the process took longer (possibly due to a lack of knowledge on my part).

adebardo avatar Jun 19 '25 08:06 adebardo

I'll try to put together a comparison table sometime next week :)

adebardo avatar Jun 19 '25 15:06 adebardo

We want to implement a performance monitoring system for time and memory in production.

Based on our experience, we recommend using the psutil library, which is useful for tracking performance. It is a relatively lightweight and easy-to-use library.

For visualization, we suggest combining it with the plotly tool. These dependencies could be enabled in a full xdem mode and left unloaded in a light mode, for example.

With a few modifications by the user or developer, this method can also support line-by-line profiling.
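
As a rough illustration of the psutil-based monitoring described above, the sketch below samples the process's resident memory in a background thread while a block of code runs. `MemorySampler`, `_rss_bytes`, and the `read_memory` parameter are hypothetical names chosen for this example; the only psutil call used is `psutil.Process().memory_info().rss`. psutil is imported lazily inside the reader so that a "light" installation would not need it unless sampling is actually started.

```python
import threading
import time


def _rss_bytes():
    """Resident set size of the current process, read through psutil."""
    import psutil  # lazy import: only needed when sampling is actually used

    return psutil.Process().memory_info().rss


class MemorySampler:
    """Context manager collecting (elapsed_seconds, memory_bytes) samples."""

    def __init__(self, interval=0.05, read_memory=_rss_bytes):
        self.interval = interval
        self._read = read_memory
        self.samples = []

    def __enter__(self):
        self.samples = []
        self._stop = threading.Event()
        self._t0 = time.perf_counter()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def _run(self):
        # Take one sample per interval until asked to stop.
        while not self._stop.is_set():
            self.samples.append((time.perf_counter() - self._t0, self._read()))
            self._stop.wait(self.interval)

    def __exit__(self, exc_type, exc, tb):
        self._stop.set()
        self._thread.join()
        # One last sample so the curve extends to the end of the block.
        self.samples.append((time.perf_counter() - self._t0, self._read()))
        return False
```

The collected `samples` list is the kind of raw data that could then be handed to plotly to draw a memory-over-time curve. Passing a custom `read_memory` callable makes it possible to swap in another metric, or to test the sampler without psutil installed.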

| Tool | Type | CPU Tracking | Memory Tracking | Detail Level (Function/Line) | Built-in Visualization | Recommended Use Case |
|---|---|---|---|---|---|---|
| psutil | System monitoring | Yes | Yes | No | No (combine with Plotly/Dash) | Continuous monitoring in production |
| Scalene | CPU + memory profiler | Yes | Yes (line by line) | Yes (line by line) | Yes (HTML report with charts) | In-depth CPU/memory diagnostics during code optimization |
| Memray | Deep memory profiler | No | Yes (native + Python allocations) | Yes (full stack trace) | Yes (HTML flamegraph) | Leak detection and memory spikes, detailed native allocation analysis |
| Py-spy | External sampling profiler | Yes (sampling) | No | Yes (flamegraph) | Yes (Speedscope, SVG) | Lightweight profiling without modifying code, suitable for production |
| cProfile | Standard Python profiler | Yes | No | Yes (per function) | No (use with SnakeViz, gprof2dot) | Integrated baseline profiler, good for identifying bottlenecks early |
| line_profiler | Line-by-line time profiler | Yes | No | Yes (line by line) | No | Precise timing for critical function sections |
| memory_profiler | Line-by-line memory profiler | No | Yes (line by line) | Yes (line by line) | No | Detailed memory tracking to pinpoint memory-heavy lines |

adebardo avatar Jun 30 '25 09:06 adebardo

Types of graphs we can produce:

Image Image

adebardo avatar Jul 01 '25 07:07 adebardo

In the second graph, is it possible to show when the application starts and ends? If I understand correctly, we only have information about when it starts, not when it finishes.

belletva avatar Jul 01 '25 07:07 belletva

> In the second graph, is it possible to show when the application starts and ends? If I understand correctly, we only have information about when it starts, not when it finishes.

Yes, it's possible to add this functionality by measuring the return time, thanks to "return_time" from psutil.
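
One way to capture both ends of each call, sketched here with only the standard library `time` module rather than any particular psutil attribute, is to store a start and an end timestamp per profiled block. The names `track` and `timeline` are hypothetical and only serve this illustration.

```python
import time
from contextlib import contextmanager

# Reference instant, so that stored times are seconds since the module was loaded.
_T0 = time.perf_counter()

# Hypothetical in-memory store of {"name", "start_s", "end_s"} entries.
timeline = []


@contextmanager
def track(name):
    """Record when a named block starts and when it finishes."""
    start = time.perf_counter() - _T0
    try:
        yield
    finally:
        # The end time is stored even if the block raises.
        timeline.append({"name": name, "start_s": start, "end_s": time.perf_counter() - _T0})
```

Each entry then gives a horizontal span (start to end) that a timeline-style graph could draw instead of a single start marker.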

adebardo avatar Jul 01 '25 08:07 adebardo

I agree this would be a nice tool to have! I don't have much experience with profiling, so I trust your experience on this. Of course, it would be useful to be able to know specifically which lines have a long running time or high memory usage, but if other Python packages are not well suited, let's go with psutil.

adehecq avatar Jul 30 '25 08:07 adehecq