Dictionary for unified naming
Currently we don't have naming conventions for "routing terms", which can easily lead to confusion.
For example, we sometimes call a vertex a node, and source can be the start point of an edge but also the start point of a route (see #394), etc.
@cvvergara proposes to use start_v and end_v instead of source and target, while I think we can make the naming easier to understand, for example with from_vertex and to_vertex, which makes clear that the function requires a vertex to be the start and end of the route.
Again, should we name it vertex or node or maybe even point? Or would it be even more correct to append _id, as we actually require the vertex ID as the argument?
So I would propose a dictionary to unify the naming of routing terms, arguments, etc., so that we don't use them interchangeably. I don't know how feasible this is, or whether it would break backwards compatibility and need to wait for v3.0.
I used _v because I also made a problem definition, I can handle math in Sphinx now, and LaTeX renders the _v as a subscript.
In general, the functions other than pgr_dijkstra, pgr_ksp, and pgr_drivingDistance will change to have the new characteristics (accept bigint, return meaningful names). As you can see, I didn't modify their documentation, but they are going to disappear. Once a new version of, say, pgr_apspWarshall is coded with the new characteristics (accepts bigint, returns meaningful names, etc.), the documentation for that new version needs to comply with whatever this dictionary of terms/columns defines. The version 2.0 documentation remains intact. When 3.0 comes out, all of the 2.0 documentation disappears completely, as all the 2.0 signatures will also disappear.
For the inner SQL query that the functions use, let's keep the meaning. Up to this point, pgr_dijkstra, pgr_ksp, and pgr_drivingDistance use a query that looks like:
SELECT id, source, target, cost, reverse_cost FROM edge_table
That query is interpreted internally in the C/C++ code to get the information it is going to process, so the meaning of its columns must remain.
This comment is to think about the description of the SQL query.
NOTE: I learned that you cannot define a term using the term you are defining, so I will try to be careful about that.
How you store a weighted graph differs from how it is actually used.
For storing purposes this kind of structure is used:
id - identifier of the edge
source - first end point of the edge
target - second end point of the edge
cost - weight of the edge (source,target), negative if the edge (source,target) does not exist.
reverse_cost - weight of the edge (target, source), negative if the edge (target,source) does not exist
"other parameters" - I'll ignore these in the rest of this comment
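To make the sign convention above concrete, here is a minimal sketch in Python. The names StoredEdge and weight are only illustrative (they are not pgRouting API), and it assumes that a weight of zero counts as an existing edge, since only negative values are described as "does not exist".

```python
from typing import NamedTuple, Optional


class StoredEdge(NamedTuple):
    """One row of the edge table, as stored (hypothetical helper type)."""
    id: int
    source: int
    target: int
    cost: float
    reverse_cost: float


def weight(row: StoredEdge, u: int, v: int) -> Optional[float]:
    """Return the weight of the directed edge (u, v), or None if the row
    does not define that edge (wrong end points or negative value)."""
    if (u, v) == (row.source, row.target) and row.cost >= 0:
        return row.cost
    if (u, v) == (row.target, row.source) and row.reverse_cost >= 0:
        return row.reverse_cost
    return None


# A one-way edge from 1 to 3: the reverse direction is marked as missing.
row = StoredEdge(id=5, source=1, target=3, cost=4, reverse_cost=-1)
print(weight(row, 1, 3))  # 4
print(weight(row, 3, 1))  # None
```

The point of the sketch is only that one stored row answers two questions, "what is the weight of (source, target)?" and "what is the weight of (target, source)?", and that a negative value answers "there is no such edge".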
Notice that I am using the word weight.
So row i of an edge_table is:
(id_i, source_i, target_i, cost_i, reverse_cost_i)
For processing, there are actually 2 edges defined with the same id_i, but locally each gets a different id number, say local_id_k and local_id_k+1:
(local_id_k, id_i, source_i, target_i, cost_i )
(local_id_k+1, id_i, target_i, source_i, reverse_cost_i )
You can see that for processing purposes, the structure looks like
( local_id, u, v, weight(u,v) )
local_id - local descriptor (identifier) given to the edge
u - first end point of the edge
v - second end point of the edge
weight(u,v) - cost when u = source and v = target, or reverse_cost when u = target and v = source
So for processing, the terms "cost" and "reverse_cost" don't exist.
Example:
(5, 1, 3, 4, 6) <- edge id 5 from 1 to 3, cost = 4, reverse_cost = 6
becomes
(1, 5, 1, 3, 4) <- local id 1, edge id 5 from 1 to 3, weight 4
(2, 5, 3, 1, 6) <- local id 2, edge id 5 from 3 to 1, weight 6
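The example above can be reproduced with a short sketch in Python. This is not how the C/C++ code does it, only an illustration of the idea. The function name expand_rows and the choice to start local ids at 1 are assumptions, and skipping negative weights follows from the storage definition, where a negative value means the edge does not exist.

```python
from typing import Iterable, List, Tuple

# (id, source, target, cost, reverse_cost), as stored
StoredRow = Tuple[int, int, int, float, float]
# (local_id, id, u, v, weight), as processed
ProcessingEdge = Tuple[int, int, int, int, float]


def expand_rows(rows: Iterable[StoredRow],
                first_local_id: int = 1) -> List[ProcessingEdge]:
    """Turn each stored row into up to two directed processing edges."""
    edges: List[ProcessingEdge] = []
    local_id = first_local_id
    for edge_id, source, target, cost, reverse_cost in rows:
        # Forward direction uses cost, backward direction uses reverse_cost.
        for u, v, w in ((source, target, cost), (target, source, reverse_cost)):
            if w < 0:
                continue  # negative weight: this direction does not exist
            edges.append((local_id, edge_id, u, v, w))
            local_id += 1
    return edges


for edge in expand_rows([(5, 1, 3, 4, 6)]):
    print(edge)
# (1, 5, 1, 3, 4)
# (2, 5, 3, 1, 6)
```

After the expansion only one weight column is left, which is why "cost" and "reverse_cost" stop being meaningful terms on the processing side.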
A path: given two vertices A, B, a path is a sequence of vertices/edges that takes you from A to B:
[A->B] = A, v2, v3, v4, ... B
A route: given several vertices A, B, C, D, a route is formed by a sequence of paths:
[A->B] [B->C] [C->D]
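As a rough illustration of the difference between the two terms, a path can be modelled as a list of vertices and a route as several paths chained together. The sketch below is in Python. The name join_paths is hypothetical, and writing the shared vertex (B, C) only once in the result is my own choice, not something the definition above requires.

```python
from typing import Hashable, List, Sequence


def join_paths(paths: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Chain paths [A->B], [B->C], ... into the vertex sequence of one route.

    Each path must start at the vertex where the previous one ended.
    """
    route: List[Hashable] = []
    for path in paths:
        if not path:
            raise ValueError("a path needs at least one vertex")
        if not route:
            route.extend(path)
        elif route[-1] != path[0]:
            raise ValueError(f"path starts at {path[0]!r}, "
                             f"but the route ends at {route[-1]!r}")
        else:
            route.extend(path[1:])  # do not repeat the shared vertex
    return route


a_to_b = ["A", "v2", "v3", "B"]
b_to_c = ["B", "C"]
c_to_d = ["C", "v7", "D"]
print(join_paths([a_to_b, b_to_c, c_to_d]))
# ['A', 'v2', 'v3', 'B', 'C', 'v7', 'D']
```

With this reading, a function that takes from_vertex and to_vertex returns a path, while a function that takes a list of vertices to visit returns a route.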