
Facilitate analysis of a sequence of temporal networks

Open j-i-l opened this issue 10 months ago • 3 comments

The user interface should extend the one implemented in #53

We want to be able to process large temporal networks, in particular networks that extend over a considerable amount of time (relative to the average interaction duration). One way to deal with such networks is to split them up into a sequence of shorter networks and perform a flow stability analysis on each temporal network in the sequence.

Approach

The interface should be structured similarly to the FlowStability class (see #53). We can define a FlowStabilitySequence class that inherits from FlowStability. FlowStability, in turn, can inherit from a base class that only requires the arguments specific to each element in the sequence (i.e. for a single, simple analysis), and the sequence in a FlowStabilitySequence instance can then contain a list of base-class instances.
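As a hypothetical sketch of that hierarchy (the base-class name, constructor arguments, and `run` method below are illustrative assumptions; only FlowStability and FlowStabilitySequence come from the discussion, and the actual arguments would follow #53):

```python
class FlowStabilityBase:
    """Holds only the arguments needed for a single, simple analysis."""

    def __init__(self, t_start, t_stop):
        self.t_start = t_start
        self.t_stop = t_stop

    def run(self):
        # Placeholder for a single flow stability analysis.
        return (self.t_start, self.t_stop)


class FlowStability(FlowStabilityBase):
    """Full single-network interface (see #53)."""


class FlowStabilitySequence(FlowStability):
    """Splits a long temporal network into a sequence of sub-analyses."""

    def __init__(self, slice_bounds):
        # The whole sequence spans from the first to the last boundary.
        super().__init__(slice_bounds[0], slice_bounds[-1])
        # One base-class instance per consecutive pair of boundaries.
        self.sequence = [
            FlowStabilityBase(t0, t1)
            for t0, t1 in zip(slice_bounds[:-1], slice_bounds[1:])
        ]
```

This keeps the per-element arguments in the base class, so a FlowStabilitySequence only needs to add the sequence-level parameters on top.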

Clarify

  • What additional parameters are required to specify a FlowStabilitySequence?
  • How should we address the processing of each element in the sequence? Do we want to include a parallelization step with multiprocessing, and/or adopt an architecture that is well suited for HPC clusters?

j-i-l avatar Jan 26 '25 23:01 j-i-l

Looking at the current scripts, I would say that these are the "meta-parameters":

from run_laplacians_transmats.py:

optional.add_argument("--ncpu", default=4, type=int,
                      help="Size of the multiprocessing pool.")

optional.add_argument("--num_slices", default=50, type=int,
                help="Number of slices that will be used to parallelize and save the results.")

optional.add_argument("--slice_length", default=None, type=float,
                help="Length of a single slice. Used to set the number of slices for parallelization instead of num_slices. If provided, takes priority over num_slices.")

optional.add_argument("--t0", default=None, type=float,
                help="Time at which to start the analysis. Default is the starting time of the first event.")

optional.add_argument("--tend", default=None, type=float,
                help="Time at which to stop the analysis. Default is the ending time of the last event.")

optional.add_argument("--verbose", action="store_true")

optional.add_argument("--batch_num", default=0, type=int,
                help="Batch number, if the work is split into several batches (e.g. over several computers).")

optional.add_argument("--total_num_batches", default=1, type=int)

optional.add_argument("--time_slices_from_net_file", action="store_true",
                help="Uses the time slices saved with the TemporalNetwork file, in `net.time_slices_bounds`.")

optional.add_argument("--intervals_to_skip", default=[], type=int, nargs="+",
                help="List of intervals to skip, given as 'int1 int2 ...'.")
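For reference, here is a guess at how the slice boundaries could be derived from these options (an illustrative helper, not the actual implementation; the priority of --slice_length over --num_slices follows the help texts above):

```python
import math


def compute_slice_bounds(t0, tend, num_slices=50, slice_length=None):
    """Illustrative: derive slice boundaries from the CLI options.

    If slice_length is given it takes priority over num_slices, and the
    last slice may then be shorter than the others.
    """
    if slice_length is not None:
        num_slices = math.ceil((tend - t0) / slice_length)
    else:
        slice_length = (tend - t0) / num_slices
    # num_slices + 1 boundaries, clipped so the last one is exactly tend.
    return [min(t0 + k * slice_length, tend) for k in range(num_slices + 1)]
```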

from run_cov_integrals.py:

optional.add_argument("--num_points", default=50, type=int,
                      help="Number of steps of the grid over which the integral results will be saved.")

optional.add_argument("--int_length", default=None, type=int,
                help="Length of a single grid interval. Used to set the number of intervals instead of num_points.")

optional.add_argument("--int_list", default=[], type=int, nargs="+",
                help="List of intervals used for the integral. Used instead of num_points or int_length.")

optional.add_argument("--time_direction", default="both", type=str,
                help="Can be 'forward', 'backward' or 'both'. Default is 'both'.")

optional.add_argument("--only_from_start_and_finish", action="store_true",
                help="Instead of computing every combination of start and finish, compute all integrals forward from the start and backward from the finish.")

optional.add_argument("--only_from_start", action="store_true",
                help="Instead of computing every combination of start and finish, compute all integrals forward from the start.")

optional.add_argument("--only_from_finish", action="store_true",
                help="Instead of computing every combination of start and finish, compute all integrals backward from the finish.")

optional.add_argument("--only_one_interval", action="store_true",
                help="Instead of computing every combination of start and finish, compute from every start but only for one interval.")

optional.add_argument("--print_mem_usage", action="store_true",
                help="Print memory usage.")

optional.add_argument("--print_interval", default=100, type=int,
                help="Controls how often memory usage is printed.")
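To make the trade-off behind --only_from_start_and_finish concrete, here is my reading of the pair enumeration as an illustrative sketch (the semantics here are an assumption, not the actual implementation in run_cov_integrals.py):

```python
from itertools import combinations


def integral_pairs(num_points, only_from_start_and_finish=False):
    """Illustrative: which (start, finish) grid-index pairs get an integral.

    Assumption: by default every combination with start < finish is
    computed; with the flag, only pairs anchored at the first grid point
    (forward from start) or the last one (backward from finish).
    """
    if only_from_start_and_finish:
        forward = [(0, k) for k in range(1, num_points)]
        backward = [(k, num_points - 1) for k in range(num_points - 1)]
        return sorted(set(forward + backward))
    return list(combinations(range(num_points), 2))
```

The restricted variant grows linearly in num_points instead of quadratically, which is presumably why forcing it is attractive for long sequences.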

from run_clusterings.py:

optional.add_argument("--nproc_files", default=4, type=int,
                help="Number of processes over which to split files to work on.")

optional.add_argument("--nproc_clustering", default=1, type=int,
                help="Number of processes over which to split clustering iterations.")

optional.add_argument("--init_p1", action="store_true",
                help="For non-homogeneous initial distribution, must be used with --direction.")

optional.add_argument("--direction", default="forward",
                help="'forward' or 'backward', used with --init_p1.")

However, the scripts offer a lot of flexibility, probably too much, so we may remove some functionality; for example, we could force --only_from_start_and_finish.

alexbovet avatar Feb 17 '25 09:02 alexbovet

I think the use case will be more multiprocessing on a machine with many CPUs rather than using HPC clusters, so let's focus on this for the moment.

alexbovet avatar Feb 17 '25 09:02 alexbovet

> I think the use case will be more multiprocessing on a machine with many CPUs rather than using HPC clusters, so let's focus on this for the moment.

Makes sense. This covers a broader range of use cases in my opinion. Also, with multi-CPU-enabled code we can still benefit from an HPC cluster for embarrassingly parallel split-ups, configuring single jobs to run on multiple CPUs.
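A minimal sketch of that pool-based split, with a hypothetical per-slice worker (the real entry point would come from the FlowStability interface, and --batch_num / --total_num_batches could still partition the slices across machines):

```python
from multiprocessing import Pool


def process_slice(bounds):
    """Hypothetical worker: analyze the sub-network covering one slice."""
    t_start, t_stop = bounds
    # Placeholder result; a real worker would run a flow stability analysis.
    return {"t_start": t_start, "t_stop": t_stop}


def run_sequence(slice_bounds, ncpu=4):
    """Process each element of the sequence in a pool of ncpu workers."""
    # One job per consecutive pair of slice boundaries.
    jobs = list(zip(slice_bounds[:-1], slice_bounds[1:]))
    with Pool(processes=ncpu) as pool:
        return pool.map(process_slice, jobs)
```

Since the slices are independent, an HPC job script could simply launch several such pools on disjoint batches of slices.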

j-i-l avatar Feb 17 '25 13:02 j-i-l