urbanaccess

Specific Time Point Aggregations

Open d-wasserman opened this issue 5 years ago • 2 comments

Hi @sablanchard,

We are evaluating Pandana for a project, but getting deeper into the API, it seems you use the average headway to identify the appropriate wait times. I have used Pandana before for its speed and hope to again, but I need to investigate access variability as well as the average (specific time points are helpful here).

  1. If you use a very constrained time range to define a network, does it not fill out all the headway values as a result?
  2. I saw that the headway statistics also include the standard deviation, min, and max. One approach could be to manually adjust headways based on some assumed distribution of arrival events to show a range of accessibility values. Is it possible to adjust network headways up or down manually?
  3. Is it on the roadmap to test at specific time points, or is that incompatible with pandana for now?

d-wasserman • May 17 '19 18:05

Hi @d-wasserman, thank you for your questions. To answer:

  1. The time range that defines the network and the time range used for the headway calculation are separate and do not have to match. For example, you could create a network representing 8:00-11:00 am but use headways from a different period, such as 8:00-9:00 am. See the time range parameter here: https://udst.github.io/urbanaccess/data_loaders.html#urbanaccess.gtfs.headways.headways. This may produce null headways when the headway table is joined back to the network graph during the network integration step, for any stop routes that are not active during the 8-9 am period. Those can be filtered out either by matching the data in the rest of the gtfs feed object against the headway dataframe in the object: https://github.com/UDST/urbanaccess/blob/dev/urbanaccess/gtfs/gtfsfeeds_dataframe.py, or by filtering out null headways in the final integrated network before use.

  2. If you calculate headways using this: https://udst.github.io/urbanaccess/data_loaders.html#urbanaccess.gtfs.headways.headways, you get a table back that has all the stop routes and their corresponding headway statistics, such as mean, min, max, and standard deviation. You can modify the values in that table as you please and can add new columns with the modified values. Then, when you create your integrated network https://udst.github.io/urbanaccess/data_loaders.html#urbanaccess.gtfs.headways.headways, specify the column you wish to use from that table in the headway_statistic parameter. It should work fine.

    Generating headways from the raw data using a custom distribution is not currently supported in urbanaccess, although it is on our roadmap and has been requested before. That said, it can be done by changing the code inside headways.py: you can apply your own calculation over each pandas Series that holds the time difference between arriving vehicles on each route stop here: https://github.com/UDST/urbanaccess/blob/dev/urbanaccess/gtfs/headways.py#L54-L55. As you can see, the default is a pandas describe() on each stop route group, which builds the min, max, mean, and standard deviation headway calculation table. You can store your new results in the same dataframe format that is generated there by default, which is used when you run this function: https://udst.github.io/urbanaccess/data_loaders.html#urbanaccess.gtfs.headways.headways. In that function, set the headway_statistic parameter to the name of the column in that dataframe you want to use, and it should work fine.

    It should be straightforward to modify the code in that section to do so. We welcome PRs for any additions like this that you think may be generalizable. Even if you try this but don’t want to make a PR, please feel free to share code snippets, a full notebook example, or a fork that attempts what you want to do here, and we can see whether we can get that feature added in a shorter timeframe.

  3. The specific time points functionality for pandana has also been widely requested. You are correct that this comes down to how pandana is constructed and operates, so the feature would have to be added to pandana before urbanaccess graphs could be used in that manner. I have reviewed this in more detail with @federicofernandez, and we think that pandana was not built for this use case and that adding it would require a significant investment of time. So it is not currently on our short-term priority list for pandana. Of course, we always welcome PRs, and if this interests you, we would welcome help in adding this type of feature to pandana and can advise you in this regard.
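As a rough illustration of point 2, here is a minimal pandas sketch of the general idea: compute the time differences between arriving vehicles per stop route, summarize each group with describe(), add a custom statistic and a manually adjusted column, and drop stop routes whose headway is null. The column names (`stop_route_id`, `arrival_min`), the helper `headway_table`, and the sample data are hypothetical; this is not the urbanaccess schema or its headways.py code.

```python
import pandas as pd

# Hypothetical arrival times (minutes after midnight) per stop route.
# Column names are illustrative, not the urbanaccess schema.
arrivals = pd.DataFrame({
    "stop_route_id": ["A_1", "A_1", "A_1", "A_1", "B_2", "B_2", "B_2", "C_3"],
    "arrival_min": [480, 490, 505, 510, 480, 500, 540, 495],
})


def headway_table(df, custom_quantile=0.9):
    """Build per-stop-route headway statistics plus one custom column."""
    df = df.sort_values(["stop_route_id", "arrival_min"])
    # Time difference between consecutive vehicles at each stop route.
    gaps = df.groupby("stop_route_id")["arrival_min"].diff()
    grouped = gaps.groupby(df["stop_route_id"])
    # Default-style summary: describe() on each stop route group.
    stats = grouped.describe()[["mean", "std", "min", "max"]]
    # Custom statistic stored alongside the default columns.
    stats["p90"] = grouped.quantile(custom_quantile)
    return stats


stats = headway_table(arrivals)

# A manually adjusted column: mean headway shifted up by one standard deviation.
stats["mean_plus_std"] = stats["mean"] + stats["std"].fillna(0)

# C_3 has a single arrival in the window, so its headway is null; drop it.
usable = stats.dropna(subset=["mean"])
print(usable)
```

Any of the resulting columns could then play the role of the column named in the headway_statistic parameter, assuming the integration step accepts it as described above.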

sablanchard • May 21 '19 18:05

Hi @sablanchard,

This echoes what I thought I could do based on how headways are computed. The fact that they are accessible in a table is very convenient, and thanks for outlining where to check.

A potential PR I think might be helpful is an "access_range" calculation, where someone could specify a percent range by which to modify the headway tables for "low end" vs. "high end" access values. So 0 and 100 as parameters would yield the min (a wait of 0, arrived on time) and the max (a wait equal to the headway), whereas any other percent passed would yield that percentage of the headway.
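A minimal sketch of what such a calculation could look like, assuming the wait time scales linearly with the percent passed. The function name `access_range_wait` and the sample headways are hypothetical, and nothing here exists in urbanaccess today.

```python
def access_range_wait(headway_min, percent):
    """Wait time as a percent of the headway: 0 -> no wait, 100 -> full headway."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if headway_min < 0:
        raise ValueError("headway_min must be non-negative")
    return headway_min * percent / 100.0


# Hypothetical mean headways in minutes, keyed by stop route.
headways = {"A_1": 10.0, "B_2": 30.0}

# "Low end" and "high end" percentages chosen for illustration only.
low, high = 25, 75
wait_range = {
    key: (access_range_wait(h, low), access_range_wait(h, high))
    for key, h in headways.items()
}
print(wait_range)
```

Each stop route ends up with a (low, high) pair of wait times that could be written back to the headway table as two extra columns, one per accessibility scenario.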

We are weighing different solutions here, and if we go in this direction I will let you know. A PR is possible, and we have contributed tools as part of transit projects in the past.

Thanks for the in-depth rundown and roadmap status. I can understand how specific time points would be difficult given Pandana's architecture. This all makes sense. I will come back with clarifying questions.

d-wasserman • May 21 '19 19:05