ngraph.graph icon indicating copy to clipboard operation
ngraph.graph copied to clipboard

Inconsistencies with links, multigraphs and non-oriented graphs

Open regorxxx opened this issue 7 months ago • 0 comments

I have been using and old version since years, but the newer versions are even worse in this aspect (due to map usage and ids).

When 2 nodes have multiple links between them (i.e. multigraph), the current getLink() method always retrieve the same link with no way to know if there are more or which one to choose. In newer versions getLink just retrieves the first one created which happens to match the non unique ID.

While this is irrelevant most times, when using a graph for path calculations, if you want to retrieve a link between 2 nodes, you usually want an specific link (i.e. the shortest one), not the first one. In previous versions you would iterate over the node links, so back to that (instead of relying on the map), you would need to add some kind of function which must be minimized.

  function getLink(aNodeId, bNodeId, minimizer) {
    // TODO: Use sorted links to speed this up
    var node = getNode(aNodeId),
      i;
    if (!node || !node.links) {
      return null;
    }
    let link = null;
    let iLink;
    let min = Infinity;
    for (i = 0; i < node.links.length; ++i) {
      iLink = node.links[i];
      if (iLink.fromId === aNodeId && iLink.toId === bNodeId) {
        if (minimizer) {
            const newMin = minimizer(iLink);
            if (newMin < min) {
               min = newMin;
               link = iLink;
            } else if (!link && newMin === min ) {
				link = iLink;
			}
        } else {link = iLink; break;}
      }
    }
    return link;
  }

Note iterating over current node links is pretty much as fast as the map, since is just a subset of links. You are not really gaining a lot, since every node should have at least a link in most use cases. So getNode iterates all nodes which should have the same size than the links map (or even smaller in most cases).

Then it's as simple as

mygraph.getLink(nodeA, nodeB, (link) => link.data.weight);

Even if you don't add something like this, the current behavior for multigraphs should be warned at least (or a new method given to retrieve all possible links between A and B).

Then... continuing with this, non oriented graphs! (which to me is equal to multigraphs handling). getLink(a,b) is different to geLink(b,a). Since there is no specific option for oriented/non oriented graphs, a new method should be created to to retrieve any of those links.

  function getNonOrientedLink(aNodeId, bNodeId, minimizer) {
    // TODO: Use sorted links to speed this up
    var node = getNode(aNodeId),
      i;
    if (!node || !node.links) {
      return null;
    }
    let link = null;
    let iLink;
    let min = Infinity;
    for (i = 0; i < node.links.length; ++i) {
      iLink = node.links[i];
      if (iLink.fromId === aNodeId && iLink.toId === bNodeId || iLink.fromId === bNodeId && iLink.toId === aNodeId) {
        if (minimizer) {
            const newMin = minimizer(iLink);
            if (newMin < min) {
               min = newMin;
               link = iLink;
            } else if (!link && newMin === min ) {
				link = iLink;
			}
        } else {link = iLink; break;}
      }
    }
    return link;
  }

Finally, if maps are to be used. A better way to handle multigraphs could be just creating a map of arrays.

function addLink(fromId, toId, data) {
    enterModification();

    var fromNode = getNode(fromId) || addNode(fromId);
    var toNode = getNode(toId) || addNode(toId);

    var link = createLink(fromId, toId, data);
    var isUpdate = links.has(link.id);

    const idLinks = links.get(link.id);
    (if !idLinks || !options.multigraph || !options.oriented) {links.set(link.id, [link]);}
    else {idLinks.push(link);}

    // TODO: this is not cool. On large graphs potentially would consume more memory.
    addLinkToNode(fromNode, link);
    if (fromId !== toId) {
      // make sure we are not duplicating links for self-loops
      addLinkToNode(toNode, link);
    }

    recordLinkChange(link, isUpdate ? 'update' : 'add');

    exitModification();

    return link;
  }

And then you can have infinite links between nodes that way, no need to differentiate between multigraphs or not (there is also no need for uniqueIds). In single-edge-graphs, getLink would just retrieve the first link of the array. On multi-graphs, either the first one or one according to the minimizer.

Note I added a check for multigraphs in case you explicitly want to force single links.

The good thing about this approach is that you can directly enforce a oriented/non oriented option. For non oriented graphs, you can check whether there is already a link a-b or b-a and treat as a non-multigraph. Oriented graphs would simply be multigraphs with arrays of length <=2.

function makeLinkId(fromId, toId) { // With this there is no need to check both cases for non oriented graphs
  return [fromId, toId].sort().join( '👉 ');
}

Checking oriented graphs would be a matter of changing getLink, to match the from and to in the proper order for the array of 2 links.

Not doing a PR since these are breaking changes and don't want to spend time on something may not be implemented at all.

regorxxx avatar Nov 22 '23 18:11 regorxxx