sage icon indicating copy to clipboard operation
sage copied to clipboard

random cograph generator

Open dcoudert opened this issue 1 year ago • 14 comments
trafficstars

Add a generator of random cographs.

:memo: Checklist

  • [x] The title is concise and informative.
  • [x] The description explains in detail what this PR is about.
  • [ ] I have linked a relevant issue or discussion.
  • [x] I have created tests covering the changes.
  • [x] I have updated the documentation and checked the documentation preview.

:hourglass: Dependencies

dcoudert avatar May 08 '24 18:05 dcoudert

Documentation preview for this PR (built with commit 2bc741ff9241badf46842137490e43236ca6b0c3; changes) is ready! :tada: This preview will update shortly after each push to this PR.

github-actions[bot] avatar May 08 '24 18:05 github-actions[bot]

some years ago there was a GSoC project which resulted in a cograph generator. https://github.com/MariannaSpyrakou/Cograph_generator perhaps we should add it to Sage?

dimpase avatar May 08 '24 22:05 dimpase

some years ago there was a GSoC project which resulted in a cograph generator. https://github.com/MariannaSpyrakou/Cograph_generator perhaps we should add it to Sage?

We can certainly add such a generator (in a separate PR), but the code you pointed requires substantial amount of work to get a suitable version. I did a first trial and it took me some time to make it work. I'll see if I can get a better and properly documented version. It will take some time.

dcoudert avatar May 09 '24 07:05 dcoudert

for this PR, can you say what kind of distribution you get for these random cographs?

dimpase avatar May 09 '24 10:05 dimpase

for this PR, can you say what kind of distribution you get for these random cographs?

I have no idea. I have not been able to find papers or code about the generation of random cographs of any order. The generator of networkx only produces cographs of order $2^h$, where $h$ is the depth of recursive calls, and is based on join (or disjoint union) of 2 cographs of equal size.

dcoudert avatar May 09 '24 10:05 dcoudert

the usual bias to come from would be from.repititions (your recursion could lead to an isomorphic graph from different random choices).

It would be good to have a cograph generator to test your code for non-uniformity of the distribution)

dimpase avatar May 09 '24 12:05 dimpase

see https://doc.sagemath.org/html/en/reference/combinat/sage/combinat/partition.html#sage.combinat.partition.Partitions_n.random_element_uniform

for random partitions of n (uniform, and another natural measure, there, too)

dimpase avatar May 09 '24 19:05 dimpase

I'm working on the cographs generator. I will propose a PR asap.

I'm not sure of the algorithm you propose using random partitions. Can you develop the idea ?

dcoudert avatar May 10 '24 08:05 dcoudert

The idea is that all we need is to be able to generate a random partition $p(n)=p_1+p_2+\dots +p_k$ of $n,$ distributed according to $c_{(p(n))}:=\prod_j c_{p_j}$, where $c_t$ denotes the number of (unlabelled) cographs on $t$ nodes. Indeed, $p(n)$ contributes $2c_{(p(n))}$ cographs to the total $c_n$, so we need to weight $p(n)$ with $2\frac{c_{(p(n))}}{c_n}$, right? I hope the technique used for uniform generation of partitions from Chapter 10 of https://www2.math.upenn.edu/~wilf/website/CombinatorialAlgorithms.pdf can be adapted to this more general case. Formulae for $c_t$ are known.

dimpase avatar May 10 '24 11:05 dimpase

I had a look at the formulae for computing $c_t$ from https://oeis.org/A000084 and https://oeis.org/A000669 (we have $c_t = A000084(t) = 2 * A000669(t)$). I don't see how to code efficiently the exact methods. However, we can use an approximation:

def A000084_approx(n):
    c = 0.4127628892015780637002715741440553
    d = 3.56083930953894332952612917270966777
    return Integer(round(c * (d**n) / (n**(3/2)), 0))

and we get

n	exact	approx
1	1	1
2	2	2
3	4	4
4	10	8
5	24	21
6	66	57
7	180	162
8	522	472
9	1532	1407
10	4624	4278
11	14136	13203
12	43930	41263
13	137908	130307
14	437502	415185
15	1399068	1333059
16	4507352	4308823
17	14611576	14009339
18	47633486	45786205
19	156047204	150336827
20	513477502	495682017
21	1696305728	1640482456
22	5623993944	5447771544
23	18706733128	18147363135
24	62408176762	60623470791
25	208769240140	203048610564
26	700129713630	681714275880
27	2353386723912	2293871787921

The next step will be to adapt method random_element_uniform...

dcoudert avatar May 11 '24 09:05 dcoudert

One thing I don't see is how to deal with isomorphic (co)connected components. With the approach I proposed there would be a bias towards getting isomorphic (co)-components more often than we want. This bias can be basically ignored for large components, though, as hitting an isomorphic one would get improbable.

dimpase avatar May 11 '24 09:05 dimpase

OTOH, instead of using $c_t$ one can use a (probably) simpler count, which does count swapped isomorphic (co)components as different graphs. And such a count captures the algorithm exactly, i.e. using it for adjusting the probability is exactly what's needed.

dimpase avatar May 11 '24 12:05 dimpase

Well, I'm lost. I did several trials to modify the random permutation method but I'm far from getting something uniform. Actually, I combine several calls to the random permutation method to get cotrees, but I'm getting most of the time $(1, 1, \dots, 1)$ and the other cotrees are relatively rare events. I don't understand how to properly modify the random permutation method to take care of the $c_t$ numbers (or any other suitable value).

dcoudert avatar May 11 '24 15:05 dcoudert

Lemma 5.1 in https://arxiv.org/pdf/1906.10355 presents a method for generating random labelled cographs uniformly at random. It uses Galton–Watson trees conditioned on producing $n$ leaves. Section 6 of the same paper explains how to generate random unlabelled cographs uniformly at random from random Polya trees with $n$ leaves. Help for understanding how to implement such generators is more than welcome.

dcoudert avatar May 18 '24 17:05 dcoudert