r5 icon indicating copy to clipboard operation
r5 copied to clipboard

Fix naming: `bundleId` -> `graphId` -> `networkId` ... choose 1?

Open trevorgerhardt opened this issue 3 years ago • 2 comments

An incoming single point analysis request contains the bundleId and set's it on the AnalysisWorkerTask as the graphId here: https://github.com/conveyal/r5/blob/f71de5235a8ec773e91f3ddbec134bb5f5fecdb2/src/main/java/com/conveyal/analysis/models/AnalysisRequest.java#L202 Later, the worker task is used in the NetworkPreloader and set's the graphId to be a networkId here: https://github.com/conveyal/r5/blob/f71de5235a8ec773e91f3ddbec134bb5f5fecdb2/src/main/java/com/conveyal/r5/analyst/NetworkPreloader.java#L182

If possible, we should the same name when we are referring to the same thing. We can allow for more descriptive versions, but we should choose one of bundle or graph or network if we can.

Next step would be to find all of the different areas we are using each of the names.

graphId

  • AnalysisWorkerTask
  • WorkerCategory

bundleId

  • AnalysisRequest

networkId

  • NetworkPreloader
  • TransportNetworkCache

scenarioId

  • TransportNetwork

trevorgerhardt avatar Jan 28 '22 06:01 trevorgerhardt

Ah, the story gets deeper. In TransportNetwork we also use scenarioId to refer to the same thing:

https://github.com/conveyal/r5/blob/f71de5235a8ec773e91f3ddbec134bb5f5fecdb2/src/main/java/com/conveyal/r5/transit/TransportNetworkCache.java#L171

trevorgerhardt avatar Jan 31 '22 03:01 trevorgerhardt

The story behind this is that the one big OTP data structure containing both transit and streets has always been called a "graph". When we made R5 the transit was no longer explicitly represented as a graph, and we thought it better to use a name from the domain (transportataion) instead of what was used to model it (a graph). These terms got baked into various components that used to be separate projects used at different times in both OTP and R5 contexts.

Bundle, network, and scenario really are three different things. I never liked the term "bundle" or the fact that multiple feeds and OSM files were permanently fused together into a bundle at the moment they were uploaded - I prefer having free-floating "data sources" that are assembled at the moment we manually trigger a network build, but we haven't fully realized that yet.

So we really do have Bundles, Graphs, Networks, and Scenarios, we've just tried to use the same ID for all of them by convention (or suffixed the same ID when there's a one-to-many relationship like scenarios) so you don't have to look up which bundle produced which network etc.

So I think the best approach is to actually remove graphId (call everything networkId) and eliminate bundleId by manually triggering network builds. A scenario ID really is distinct from a network ID though it is tied to one specific network ID.

abyrd avatar Jan 31 '22 04:01 abyrd