gtfspy
Example code creating an error in 'stop_I' for stops
The stop IDs in the example are taken from the index of the pandas DataFrame instead of from the 'stop_I' column. This is not a problem in the Finnish dataset, but the IDs are not always the same as the index number (for example, in the Dutch public transport dataset from www.openOV.nl the IDs start at 1 instead of 0).
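A minimal sketch of the mismatch, using a made-up DataFrame as a stand-in for the result of G.stops() (the stop names and IDs below are invented for illustration, not taken from any real dataset):

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by G.stops(); values are made up.
# stop_I deliberately starts at 1, while the default index starts at 0.
stops = pd.DataFrame({
    "stop_I": [1, 2, 3],
    "name": ["Centraal", "Museum", "Haven"],
})

# to_dict("index") keys each row by its index label, not by stop_I.
stop_dict = stops.to_dict("index")

# Same lookup pattern as in the example: the dict key is treated as the ID.
index_key = next(k for k, row in stop_dict.items() if row["name"] == "Museum")
actual_stop_I = stop_dict[index_key]["stop_I"]

print(index_key, actual_stop_I)  # 1 2
```

The key found by the loop (1) is not the stop's real ID (2), so any later lookup by that value would refer to a different stop.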
Proposed code changes:
FROM:
stop_dict = G.stops().to_dict("index")
for stop_I, data in stop_dict.items():
    if data['name'] == from_stop_name:
        from_stop_I = stop_I
    if data['name'] == to_stop_name:
        to_stop_I = stop_I
assert (from_stop_I is not None)
assert (to_stop_I is not None)
TO:
stop_data = OV_data.stops()
from_stop_I = stop_data[stop_data['name'] == from_stop_name].stop_I.values[0]
to_stop_I = stop_data[stop_data['name'] == to_stop_name].stop_I.values[0]
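One caveat with `.values[0]` is that it raises a bare IndexError when no stop has the requested name. A possible variant, sketched here with a hypothetical helper `find_stop_I` (not part of gtfspy) and a made-up DataFrame standing in for OV_data.stops(), reports the missing name instead:

```python
import pandas as pd


def find_stop_I(stop_data, stop_name):
    """Return the stop_I of the first stop named stop_name (hypothetical helper)."""
    matches = stop_data.loc[stop_data["name"] == stop_name, "stop_I"]
    if matches.empty:
        raise ValueError("No stop named {!r}".format(stop_name))
    return int(matches.iloc[0])


# Made-up stand-in for OV_data.stops(); stop_I deliberately starts at 1.
stop_data = pd.DataFrame({
    "stop_I": [1, 2, 3],
    "name": ["Centraal", "Museum", "Haven"],
})

from_stop_I = find_stop_I(stop_data, "Centraal")
to_stop_I = find_stop_I(stop_data, "Haven")
print(from_stop_I, to_stop_I)  # 1 3
```

If several stops share a name, this sketch returns the first match, which is the same behaviour as `.values[0]`.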
AND FROM:
stop_dict = G.stops().to_dict("index")
print("Origin: ", stop_dict[from_stop_I])
print("Destination: ", stop_dict[to_stop_I])
TO:
stop_data = OV_data.stops()
print("Origin: ", stop_data[stop_data['stop_I'] == from_stop_I])
print("Destination: ", stop_data[stop_data['stop_I'] == to_stop_I])
Which file are you referring to here?
Please also provide the code needed to reproduce your problem and the error message that occurred.
Thanks!
The code is taken directly from the example in "gtfspy/examples/example_temporal_distance_profile.py". The code snippets are from lines 15-22 and 69-71.
The problem can be seen using the first snippet of code (I called my GTFS object OV_data instead of G, which I did not mention in my previous comment, my apologies). The bug does not produce an error message; however, adding a check for mismatches makes it visible:
stop_dict = OV_data.stops().to_dict("index")
for stop_I, data in stop_dict.items():
    if stop_I != data['stop_I']:
        print('The index number is different from the actual ID')
        print('Index: ' + str(stop_I))
        print('Stop_I: ' + str(data['stop_I']))
        break
    if data['name'] == from_stop_name:
        from_stop_I = stop_I
    if data['name'] == to_stop_name:
        to_stop_I = stop_I
assert (from_stop_I is not None)
assert (to_stop_I is not None)
The check prints the following, which shows that the index is off by one from the actual ID:
The index number is different from the actual ID
Index: 0
Stop_I: 1
Therefore, the example uses the wrong stops as origin and destination.
The GTFS object itself refers to stops by the "stop_I" key, so reading that column directly from the pandas DataFrame resolves the issue.
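If the dictionary-based lookup from the example is preferred, another option might be to key the dictionary by "stop_I" rather than by the default index. This is only a sketch on a made-up DataFrame standing in for the result of stops(); it has not been tested against gtfspy itself:

```python
import pandas as pd

# Made-up stand-in for the DataFrame returned by stops(); stop_I starts at 1.
stops = pd.DataFrame({
    "stop_I": [1, 2, 3],
    "name": ["Centraal", "Museum", "Haven"],
})

# Use stop_I as the index so the dict keys are the real stop IDs.
# drop=False keeps stop_I available as a regular column as well.
stop_dict = stops.set_index("stop_I", drop=False).to_dict("index")

print(stop_dict[2]["name"])  # Museum
```

With this change the keys of stop_dict are the actual stop IDs, so the original loop and the later `stop_dict[from_stop_I]` lookups would refer to the intended stops.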
I hope this makes it clearer; if not, I am happy to answer any further questions.