Simplifying MultiPlot Interface

Open jcharkow opened this issue 9 months ago • 5 comments

Currently, plotting across several axes in pyopenms_viz can be cumbersome. Some examples include:

https://github.com/OpenMS/pyopenms_viz/blob/main/docs/gallery_scripts_template/plot_spyogenes_subplots_ms_matplotlib.py

https://github.com/OpenMS/pyopenms_viz/blob/main/docs/gallery_scripts_template/plot_investigate_spectrum_binning_ms_matplotlib.py

In both of these scripts, the .plot function must be called multiple times in a loop, and with the matplotlib backend the figure must be created beforehand.

One way to address this would be to add a tile_by parameter (or something along those lines) that divides the dataframe into separate subplots based on the values of a specific column.

Current functionality:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

##### Set Plotting Variables #####
pd.options.plotting.backend = "ms_matplotlib"
RUN_NAMES = [
    "Run #0 Spyogenes 0% human plasma",
    "Run #1 Spyogenes 0% human plasma",
    "Run #2 Spyogenes 0% human plasma",
    "Run #3 Spyogenes 10% human plasma",
    "Run #4 Spyogenes 10% human plasma",
    "Run #5 Spyogenes 10% human plasma",
]

fig, axs = plt.subplots(len(np.unique(chrom_df["run"])), 1, figsize=(10, 15))

# plt.close ### required for running in jupyter notebook setting

# For each run fill in the axs object with the corresponding chromatogram
plot_list = []
for i, run in enumerate(RUN_NAMES):
    run_df = chrom_df[chrom_df["run_name"] == run]
    current_bounds = annotation_bounds[annotation_bounds["run"] == run]

    run_df.plot(
        kind="chromatogram",
        x="rt",
        y="int",
        grid=False,
        by="ion_annotation",
        title=run_df.iloc[0]["run_name"],
        title_font_size=16,
        xaxis_label_font_size=14,
        yaxis_label_font_size=14,
        xaxis_tick_font_size=12,
        yaxis_tick_font_size=12,
        canvas=axs[i],
        relative_intensity=True,
        annotation_data=current_bounds,
        xlabel="Retention Time (sec)",
        ylabel="Relative\nIntensity",
        annotation_legend_config=dict(show=False),
        legend_config={"show": False},
    )

fig.tight_layout()
fig

Proposed implementation

# Add a tile_by argument and remove the need to specify canvas or loop over runs
# (title is omitted here: each subplot would presumably be titled by its tile_by value)
chrom_df.plot(
    kind="chromatogram",
    x="rt",
    y="int",
    grid=False,
    tile_by="run_name",
    by="ion_annotation",
    title_font_size=16,
    xaxis_label_font_size=14,
    yaxis_label_font_size=14,
    xaxis_tick_font_size=12,
    yaxis_tick_font_size=12,
    relative_intensity=True,
    annotation_data=annotation_bounds,
    xlabel="Retention Time (sec)",
    ylabel="Relative\nIntensity",
)

Implementation Ideas:

  1. Add the new parameters to _config.py.
  2. The best place to implement this would be _core.py, so the changes are abstracted across the different backends.
  3. For now, I would focus on the ChromatogramPlot and SpectrumPlot classes, as PeakMapPlot might be difficult with the marginal plots.
  4. For now, just focus on a single column (1 by N figure); however, being able to specify the number of columns in the plot would be a nice addition.
  5. I would recommend either adding a new plot_tile method to the classes that need it (e.g. ChromatogramPlot, SpectrumPlot) or extending the current plot function to support tiling. Ideally, we do not want to repeat too much code. Another idea is to add a _plot_helper() that is called by both plot_tile and plot. Play around with different approaches and find a balance between minimizing code repetition and readability.
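
As a rough illustration of ideas 4 and 5, here is a minimal sketch of the shared-helper approach. It uses plain matplotlib and pandas rather than the pyopenms_viz backends, and the names plot_tile, _plot_helper and n_cols are hypothetical, chosen only for this example. In the real package the equivalent logic would presumably live in _core.py and dispatch to each backend.

```python
import math

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def _plot_helper(df, ax, x, y, by, title):
    """Draw one panel with one line per `by` group (hypothetical shared helper)."""
    for label, group in df.groupby(by, sort=True):
        ax.plot(group[x], group[y], label=str(label))
    ax.set_title(title)
    return ax


def plot_tile(df, x, y, by, tile_by, n_cols=1):
    """Hypothetical tiling entry point: one subplot per unique `tile_by` value."""
    if tile_by not in df.columns:
        raise ValueError(f"tile_by column {tile_by!r} not found in dataframe")
    # sort=False keeps the tiles in order of first appearance
    tiles = list(df.groupby(tile_by, sort=False))
    n_rows = math.ceil(len(tiles) / n_cols)
    # squeeze=False keeps `axs` two-dimensional even for a single row or column
    fig, axs = plt.subplots(
        n_rows, n_cols, squeeze=False, figsize=(6 * n_cols, 3 * n_rows)
    )
    flat_axes = axs.ravel()
    for ax, (tile_value, tile_df) in zip(flat_axes, tiles):
        _plot_helper(tile_df, ax, x, y, by, title=str(tile_value))
    # Hide any unused cells when the tiles do not fill the grid
    for ax in flat_axes[len(tiles):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig, axs


# Toy data standing in for chrom_df (values are made up for illustration)
chrom_df = pd.DataFrame(
    {
        "rt": [1.0, 2.0, 3.0] * 4,
        "int": [10, 50, 20, 5, 30, 10, 12, 40, 18, 6, 25, 9],
        "ion_annotation": (["y4"] * 3 + ["b3"] * 3) * 2,
        "run_name": ["Run #0"] * 6 + ["Run #1"] * 6,
    }
)
fig, axs = plot_tile(
    chrom_df, x="rt", y="int", by="ion_annotation", tile_by="run_name"
)
```

With the per-panel drawing kept in one helper, a non-tiled plot could call the same helper with a single Axes, which is one way to avoid duplicating code between plot and plot_tile.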

Considerations to keep in mind:

  1. Must add a check that the tile_by column is valid.
  2. For chromatograms, annotation dataframes must also have the tile_by column so we can separate the annotations.
  3. Tests should also be added to verify this new functionality.
  4. Specifically for chromatograms, only one legend should be needed across all plots, and fragment ions should have the same colours across all plots (might need to ensure fragment plots are sorted).
  5. Ensure the documentation and function docstrings are up to date with the new parameters.
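
Considerations 1, 2 and 4 could be approached along these lines. This is only a sketch under assumptions: check_tile_column and build_color_map are hypothetical helper names, not existing pyopenms_viz functions, and the palette is just two example hex colours.

```python
import pandas as pd


def check_tile_column(df, annotation_df, tile_by):
    """Raise if `tile_by` is missing from the data or, when given, the annotations."""
    if tile_by not in df.columns:
        raise ValueError(f"tile_by column {tile_by!r} not found in the plotting data")
    if annotation_df is not None and tile_by not in annotation_df.columns:
        raise ValueError(f"tile_by column {tile_by!r} not found in the annotation data")


def build_color_map(df, by, palette):
    """Map each sorted unique `by` value to a colour, cycling the palette if needed.

    Building the map once from the full dataframe (not per tile) keeps each
    fragment ion the same colour in every subplot.
    """
    labels = sorted(df[by].unique())
    return {label: palette[i % len(palette)] for i, label in enumerate(labels)}


# Toy data (made up for illustration)
chrom_df = pd.DataFrame(
    {
        "run_name": ["Run #0", "Run #0", "Run #1", "Run #1"],
        "ion_annotation": ["y4", "b3", "b3", "y4"],
    }
)
# Annotations keyed by "run" only, so they lack the "run_name" tile column
annotation_bounds = pd.DataFrame({"run": ["Run #0", "Run #1"]})

check_tile_column(chrom_df, None, "run_name")  # passes: no annotations supplied
color_map = build_color_map(
    chrom_df, by="ion_annotation", palette=["#1f77b4", "#ff7f0e"]
)
```

Passing annotation_bounds to check_tile_column here would raise a ValueError, because that dataframe has no "run_name" column. A single shared legend could then be built from color_map rather than from any one subplot.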

jcharkow, Mar 23 '25 17:03

@Nishantrde let me know if you are interested in working on this

jcharkow, Mar 23 '25 17:03

Hi @jcharkow, I'm definitely interested in working on this! The proposed tile_by parameter sounds like a great improvement for simplifying multi-axis plots in pyopenms_viz. You can assign this issue to me.

Nishantrde, Mar 25 '25 02:03

I would suggest maybe adopting facet_col/facet_row similar to how plotly does it.

https://plotly.com/python/facet-plots/

singjc, Mar 25 '25 03:03

Yes this would be preferred to tile_by

jcharkow, Mar 25 '25 12:03

Hi @jcharkow, I have implemented the tile_by function in ChromatogramPlot.

Nishantrde, Mar 26 '25 05:03