Example code creating an error in 'stop_I' for stops

Open BeishuizenTimKPMG opened this issue 5 years ago • 2 comments

The stop ids in the example are taken from the index of the pandas DataFrame instead of the 'stop_I' column. This is not a problem for the Finnish dataset, but not every dataset has ids that equal the index number (for example, in the Dutch public transport dataset from www.openOV.nl the ids start at 1 instead of 0).

Code change proposals, FROM:

stop_dict = G.stops().to_dict("index")
for stop_I, data in stop_dict.items():
    if data['name'] == from_stop_name:
        from_stop_I = stop_I
    if data['name'] == to_stop_name:
        to_stop_I = stop_I
assert (from_stop_I is not None)
assert (to_stop_I is not None)

TO:

stop_data = OV_data.stops()

from_stop_I = stop_data[stop_data['name'] == from_stop_name].stop_I.values[0]
to_stop_I = stop_data[stop_data['name'] == to_stop_name].stop_I.values[0]

AND FROM:

stop_dict = G.stops().to_dict("index")
print("Origin: ", stop_dict[from_stop_I])
print("Destination: ", stop_dict[to_stop_I])

TO:

stop_data = OV_data.stops()
print("Origin: ", stop_data[stop_data['stop_I'] == from_stop_I])
print("Destination: ", stop_data[stop_data['stop_I'] == to_stop_I])
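
One caveat with `.values[0]`: it raises a bare IndexError when no stop has the given name, and it silently takes the first match when several stops share a name. Below is a minimal sketch of a stricter lookup, assuming the stops table is available as a list of row dicts with 'stop_I' and 'name' keys. `find_stop_I` and `example_rows` are hypothetical names made up for illustration; they are not part of gtfspy.

```python
def find_stop_I(records, stop_name):
    """Return the stop_I of the single stop called stop_name.

    records: iterable of dicts with at least 'stop_I' and 'name' keys.
    Hypothetical helper for illustration; not part of gtfspy.
    """
    matches = [row["stop_I"] for row in records if row["name"] == stop_name]
    if not matches:
        raise KeyError(f"no stop named {stop_name!r}")
    if len(matches) > 1:
        raise ValueError(f"{len(matches)} stops named {stop_name!r}: {matches}")
    return matches[0]


# Made-up rows whose stop_I values start at 1 rather than 0.
example_rows = [
    {"stop_I": 1, "name": "Alpha"},
    {"stop_I": 2, "name": "Beta"},
]
from_stop_I = find_stop_I(example_rows, "Beta")
```

With a pandas DataFrame such as `stop_data` above, the rows could be obtained with `stop_data.to_dict("records")`, so the call would look like `find_stop_I(stop_data.to_dict("records"), from_stop_name)`.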

BeishuizenTimKPMG, Mar 04 '19 15:03

Which file are you referring to here?

Please also provide the code needed to reproduce the problem and the error message that occurred.

Thanks!

rmkujala, Mar 05 '19 08:03

The code is taken directly from the example in "gtfspy/examples/example_temporal_distance_profile.py". The snippets are from lines 15-22 and 69-71.

The problem can be seen using the first snippet of code. (I named my GTFS object OV_data instead of G, which I did not mention in my previous comment; my apologies.) The bug does not raise an error message, but adding a check for mismatches makes it visible:

stop_dict = OV_data.stops().to_dict("index")
for stop_I, data in stop_dict.items():
    if stop_I != data['stop_I']:
        print('The index number is different from the actual ID')
        print('Index: ' + str(stop_I))
        print('Stop_I: ' + str(data['stop_I']))
        break
    if data['name'] == from_stop_name:
        from_stop_I = stop_I
    if data['name'] == to_stop_name:
        to_stop_I = stop_I
assert (from_stop_I is not None)
assert (to_stop_I is not None)

This prints the following, which shows that the index and the id are off by one:

The index number is different from the actual ID
Index: 0
Stop_I: 1

Therefore the wrong stops are taken as start and end stop in the example.
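
For anyone without the Dutch feed at hand, the mismatch can be reproduced with a small stand-in table. This is only a sketch: the DataFrame below is made up and merely assumes that the stops table has a default integer index plus a 'stop_I' column, as described above.

```python
import pandas as pd

# Made-up stand-in for the stops table; stop_I starts at 1, as in the Dutch feed.
stops = pd.DataFrame({
    "stop_I": [1, 2, 3],
    "name": ["Alpha", "Beta", "Gamma"],
})

# Keys of to_dict("index") are the DataFrame index labels (0, 1, 2), not stop_I.
stop_dict = stops.to_dict("index")
by_index = next(key for key, row in stop_dict.items() if row["name"] == "Beta")

# Selecting on the column returns the real identifier.
by_column = int(stops.loc[stops["name"] == "Beta", "stop_I"].iloc[0])

print(by_index, by_column)
```

Here the index-based lookup gives 1 for "Beta" while its 'stop_I' is 2, so any later query that uses the index value as a stop id would refer to "Alpha" instead.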

The GTFS object itself refers to stops by the "stop_I" key, so reading this column directly from the pandas DataFrame resolves the issue.

I hope this makes it clearer; if not, I am happy to answer additional questions.

BeishuizenTimKPMG, Mar 05 '19 08:03