Digraphs Dreadnaut support

Broadly speaking, a dreadnaut file starts with "configuration" information about the graph, such as the number of vertices (denoted by 'n'), the start index for vertex numbering (denoted by '$') and whether or not a graph is a digraph (denoted by the presence of 'd'). The configuration section always ends with a 'g'. The rest of the file gives information concerning individual vertices in the form of adjacency lists. For example:

n=2
$=1
d
g
1: 1 2;
2: 2;

would represent a 1-indexed digraph with 2 vertices and directed edges 1 -> 1, 1 -> 2 and 2 -> 2.
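To make the example concrete, here is a minimal Python sketch (not the GAP code in this PR) that reads only the subset of dreadnaut used above: 'n', '$', 'd', 'g' and "v: w w;" adjacency lists. The function name parse_simple_dreadnaut is made up for illustration, and the sketch assumes the configuration section contains no other commands.

```python
import re


def parse_simple_dreadnaut(text):
    """Parse a tiny dreadnaut subset; return (n, is_digraph, arcs).

    Arcs are reindexed so that vertex numbering starts at 1.
    Hypothetical helper for illustration only.
    """
    # Everything before the first 'g' is treated as configuration.
    head, sep, body = text.partition("g")
    if not sep:
        raise ValueError("no 'g' command found")

    n_match = re.search(r"n\s*=?\s*(\d+)", head)
    if n_match is None:
        raise ValueError("number of vertices 'n' not given")
    n = int(n_match.group(1))

    # '$' defaults to 0 when it is not given.
    base_match = re.search(r"\$\s*=?\s*(\d+)", head)
    base = int(base_match.group(1)) if base_match else 0

    # Assumes 'd' appears as its own whitespace-separated token.
    is_digraph = "d" in head.split()

    arcs = []
    for part in body.split(";"):
        part = part.strip()
        if not part:
            continue
        source, _, targets = part.partition(":")
        u = int(source) - base + 1
        for t in targets.split():
            arcs.append((u, int(t) - base + 1))
    return n, is_digraph, arcs
```

Calling it on the file above gives n = 2, a digraph flag of True, and the three arcs listed.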

General overview:

Decoder:

DIGRAPHS_ParseDreadnautConfig aims to get values for either '$' (which indicates the start index for vertex numbering) or 'n' (which indicates the number of vertices). Note that '$' defaults to 0 and that I chose to reindex all graphs such that vertex numbering starts at one (which I think is convention for the Digraphs package?)
DIGRAPHS_LegalDreadnautEdge aims to filter out illegal edges and throws an error if an edge is illegal. An example of an illegal edge might be a loop for an undirected graph or an edge containing a vertex that is not allowed within the constraints of the values of '$' and 'n'. (In the case of illegal edges, nauty throws a warning message and then ignores the edge so I was trying to replicate this behaviour).
DIGRAPHS_SplitDreadnautLines takes a line of dreadnaut (e.g. "1: 2 3 5; 4: 2 1 3; 2: 3;") and splits it into parts that are handled individually (in this case the parts would be ["1: 2 3 5;", "4: 2 1 3;", "2: 3;"]). Although these parts would usually each be on their own line, it's technically fine for some or all of them to share a line (with or without a semicolon), so I thought it made more sense to condense everything onto one line and then split into parts. There are various auxiliary commands that can be used within the dreadnaut format alongside the definition of the graph (more info here), which I mostly chose to neglect, with the exception of 'f', which defines a partition of vertices. Note that '$$' at the end of a file means reindex the graph to start counting at 0 (which I ignored).
DIGRAPHS_ParseDreadnautGraph parses the non-configuration part of the file, after it has been split into parts by DIGRAPHS_SplitDreadnautLines.

These are all combined in ReadDreadnautGraph.
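As an illustration of the splitting step described for DIGRAPHS_SplitDreadnautLines, here is a rough Python sketch (again, not the GAP implementation). It assumes each part starts with a "vertex:" token and that semicolons between parts are optional; split_adjacency_parts is a hypothetical name.

```python
import re


def split_adjacency_parts(line):
    """Split one condensed line into "v: w w;" parts.

    Hypothetical helper: a part is assumed to start wherever a
    number is followed by a colon. Anything before the first such
    token is ignored.
    """
    starts = [m.start() for m in re.finditer(r"\b\d+\s*:", line)]
    ends = starts[1:] + [len(line)]
    parts = []
    for begin, end in zip(starts, ends):
        # Normalise so every part ends with exactly one semicolon.
        part = line[begin:end].strip().rstrip(";").strip()
        parts.append(part + ";")
    return parts
```

Normalising every part to end in a semicolon means the later parsing step does not need to care whether the input used one.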

Encoder: WriteDreadnautGraph takes a digraph and encodes into dreadnaut format.
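For a sense of what encoding involves, here is a small Python sketch that writes a 1-indexed digraph in the same layout as the example file earlier in this comment. It is not WriteDreadnautGraph, and the exact output layout of the real encoder is not asserted here; encode_dreadnaut and its arguments are assumptions for illustration.

```python
def encode_dreadnaut(n, arcs):
    """Return a dreadnaut string for a digraph on vertices 1..n.

    Hypothetical helper: `arcs` is a list of (source, target)
    pairs. Vertices with no out-neighbours are not listed.
    """
    out_neighbours = {v: [] for v in range(1, n + 1)}
    for source, target in arcs:
        out_neighbours[source].append(target)

    # Configuration section, always ending with 'g'.
    lines = ["n={}".format(n), "$=1", "d", "g"]
    for v in range(1, n + 1):
        if out_neighbours[v]:
            targets = " ".join(str(w) for w in sorted(out_neighbours[v]))
            lines.append("{}: {};".format(v, targets))
    return "\n".join(lines) + "\n"
```

Encoding the arcs (1, 1), (1, 2), (2, 2) on 2 vertices reproduces the example file line for line.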

I'm in the process of writing documentation!

May 26 '24 18:05 pramothragavan

@pramothragavan please let me know when you think this is ready again, and thanks!

May 30 '24 08:05 james-d-mitchell

Will do!

May 30 '24 09:05 pramothragavan

Hi @pramothragavan! Looking forward to hopefully seeing you soon for the new VIP.

What state did this Dreadnaut project get to? Would it be a good thing for you to get back into this semester if there's work still to do on it?

Jan 28 '25 15:01 mtorpey

Hi, there are definitely some kinks to be dealt with, but I think it would be a good place to start!

Jan 28 '25 16:01 pramothragavan

This is a significant overhaul of previous versions -- WriteDreadnautGraph is untouched, but the decoder has been completely rewritten.

As @james-d-mitchell suggested, I've taken the parser used in the dreadnaut program and effectively rewritten it in GAP. The original C code uses a stream to parse character by character. GAP has a Stream object, but this lacks some of the functionality needed, so I created a record called Stream that aligns GAP's streams with how they're used in C. Other helper functions I've added:

DIGRAPHS_GETNWC finds the next character in the stream that is not in " ,\t"
DIGRAPHS_GETNWL finds the next character in the stream that is not in " \n\t\r"
DIGRAPHS_readinteger reads integers from the stream (i.e. avoiding issues with reading "10" as opposed to "1" and "0" that might arise when parsing character by character)
DIGRAPHS_GetInt also reads the next integer from the stream. There are some instances where dreadnaut allows for an optional '=' character (e.g. n=2 is the same as n2). This function ignores any '=' characters and then calls DIGRAPHS_readinteger.
DIGRAPHS_readgraph parses the graph's adjacency data
DIGRAPHS_ParsePartition is used to parse a partition, if given. The partition is stored using vertex labels.
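To show how these helpers fit together, here is a Python sketch of a character stream with one-character push-back plus analogues of the GETNWC, GETNWL, readinteger and GetInt helpers. This is not the GAP code in the PR: CharStream and the lower-case function names are hypothetical, and signs on integers are ignored for brevity.

```python
DIGITS = "0123456789"


class CharStream:
    """Character-by-character reader with one-step push-back."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def getc(self):
        # Return the next character, or None at end of input.
        if self.pos >= len(self.text):
            return None
        ch = self.text[self.pos]
        self.pos += 1
        return ch

    def ungetc(self):
        # Step back one character, in the spirit of C's ungetc.
        if self.pos > 0:
            self.pos -= 1


def next_not_in(stream, skipped):
    ch = stream.getc()
    while ch is not None and ch in skipped:
        ch = stream.getc()
    return ch


def getnwc(stream):
    # Next character not in " ,\t".
    return next_not_in(stream, " ,\t")


def getnwl(stream):
    # Next character not in " \n\t\r".
    return next_not_in(stream, " \n\t\r")


def read_integer(stream):
    # Collect consecutive digits so "10" is read as 10, not 1 then 0.
    ch = getnwc(stream)
    digits = ""
    while ch is not None and ch in DIGITS:
        digits += ch
        ch = stream.getc()
    if ch is not None:
        stream.ungetc()  # leave the non-digit for the next reader
    return int(digits) if digits else None


def get_int(stream):
    # Skip an optional '=' so that "n=2" and "n2" behave the same.
    ch = getnwc(stream)
    if ch is not None and ch != "=":
        stream.ungetc()
    return read_integer(stream)
```

With this shape, a command loop would read the command character with getnwl and then call get_int for commands such as 'n' or '$' that take a value.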

Documentation for various commands is given here (pages 6-12). Many of these are used to manipulate the graph and I have focused on supporting commands more closely tied to directly defining the graph.

For now, I need to write (many) tests but I'm also interested if there are any commands that you'd like to see support for. I'm happy to implement anything really, but didn't want to waste time on things you didn't want. The commands that I am currently supporting are:

All of those mentioned in section (A) of the above link. In dreadnaut, these would just define the mode which dreadnaut is using. This is important for subsequent use of nauty/traces, but is irrelevant for actually reading in the graph so this is just ignored.
From section (B): n=#, g (and all subcommands), _, __
From section (C): f
From section (D): $=#, $$, +, d, -d
From section (F): "...", !, q

I think a couple of the unsupported commands from (B) might be worth looking into. Anything unsupported currently should raise an InfoWarning, with the exception of <, >, e (these three relate to reading in, outputting and editing graphs) which raise ErrorNoReturn.

Feb 08 '25 18:02 pramothragavan

#485 I think this is basically complete, but I don't really like some of the behaviour with ReadDigraphs. As @james-d-mitchell requested verbally, I've implemented the hashsets WholeFileEncoders and WholeFileDecoders, which currently contain only WriteDreadnautGraph and ReadDreadnautGraph respectively. This is a bit more future-proof than the current implementation (hopefully DIMACS can also feature in ReadDigraphs?). There are also corresponding functions IsWholeFileEncoder and IsWholeFileDecoder.

Initially, the idea was that we need to handle whole file decoders separately from single line decoders in ReadDigraphs. However, it transpires that if multiple graphs are given in a .dre file, dreadnaut just reads in the last one (effectively overwriting the previous graph each time). This behaviour is mirrored by ReadDreadnautGraph (with an InfoWarning issued) and by extension ReadDigraphs, but this outcome feels unexpected.

I also think there is ambiguity as to whether you should specify DigraphFromDreadnautString or ReadDreadnautGraph as the optional decoder argument in ReadDigraphs. I think having two encoders/decoders might be overcomplicating things, especially given that ReadDigraphs could always be used instead of ReadDreadnautGraph. Same for writes.

Feb 26 '25 02:02 pramothragavan

No worries @pramothragavan it happens! I could also have checked before merging. Looks like you have a fix in the works, which is great!

Apr 09 '25 18:04 james-d-mitchell