Discussion about a web app for open-bus

Open OrBin opened this issue 5 years ago • 12 comments

Starting a discussion here, based on notes from the discussion on 11/02/19.

Features & specification

  • Exporting selected data

"Customers"

  • Private people
  • Organizations and non-profits

Queries

  • A single trip
  • Multiple trips
  • Lines comparison / alternatives display
  • Daily/monthly/custom filter for a line
  • Geographic filtering (selecting an area on map, selecting a city)
  • Stops selection for a line vs expected time
  • Count of problems (?) for a line
  • Average travel time between stops
  • Line frequency

Data for each query

  • Accuracy level
  • Delays count
  • Missing trips count
  • Trip + skipping/not arriving to stops
  • Trip + delay for a stop

Technology

Front End

  • Leaflet (#151)
  • React/Angular?
  • Developing for desktop first. PWA for mobile?

Back End

  • Python (Django/Flask) vs NodeJS vs Java
  • logs?

DB

  • SQL/MongoDB/elasticsearch?
  • elasticsearch as a main DB?
  • If using SQL - maybe PostgreSQL with PostGIS?

OrBin avatar Feb 18 '19 18:02 OrBin

@cjer

OrBin avatar Feb 18 '19 18:02 OrBin

Is there any GIS solution in Elasticsearch or Mongo? It's really easy and efficient to perform geographic queries with PostGIS, and there is also the MULTIPOINT data type, which could be the solution for the shape of a route.

AvivSela avatar Feb 18 '19 19:02 AvivSela

@AvivSela I'm not a database expert but I'll try to answer from the little I do know. Elasticsearch has geographic queries and geographic datatypes (point and shape). AFAIK it's known to be one of the advantages of ES. I recently found out that mongo also has geospatial queries and types, but I know nothing about it.

OrBin avatar Feb 18 '19 21:02 OrBin

Updates about decisions from today's discussion:

  • React for FE
  • NodeJS for BE
  • No decision yet about DB. Considering elasticsearch and Postgres.

Edit: after some brief research into ORMs for Postgres in NodeJS, Sequelize and Bookshelf should be considered, with PostGIS support in mind.

OrBin avatar Feb 25 '19 21:02 OrBin

Some more descriptive use cases, we need mockups for these.

  1. Investigation of a specific case:

    1. Goal - Investigate and drill down on a single transport case: find out whether the bus was a no-show/late/early, whether it skipped a stop or took a wrong route, or whether there was weird real-time data in apps or on an electronic board.
    2. Inputs -
      1. route and origin_aimed_departure_time (if the user knows it). Users don't know about route_ids, so we need to consider all alternatives (using route_mkt). Help the user with the agency name, a calendar to choose from, and the city of origin and destination.
      2. a stop location and a time range, and an optional route - this is in case a user doesn't know the trip departure time. This is mostly the case if we take into account "dirty" data, with wrong origin_aimed_departure_time. Also in cases when the user wants to be super sure, to see whatever happened around the stop.
    3. Outputs -
      1. Show all related siri trip shapes on map.
      2. Show GTFS stops and shapes for the relevant routes.
      3. Group and color by bus_id or trip_id or origin_aimed_departure_time
      4. Show departure and arrival time to stops
      5. Numbers: how late on departure, how late to last stop, how many coordinates, average speed, top speed
  2. Investigation of a route or a few routes in specified time-ranges (hour(s), day(s), week(s), month(s)):

    1. Goal - investigate performance trends for a single route, or compare performance of a few routes.
    2. Inputs - route/s and time range
    3. Outputs -
      1. statistics about bus trip no-show/late/early (binned based on time range)
      2. anomalous route shapes (with examples on map)
      3. average speeds (by date and colored on map)
      4. statistics about number of siri responses we have and their average and worst frequency
  3. Investigation of a road segment

    1. Goal - investigate performance trends of bus trips on a road segment
    2. Inputs - road segment (choose on map), route filter, time range
    3. Outputs - Trends of speed, congestion, number of trips, number of stops
  4. Other use cases:

    1. Data Export
    2. Actual schedule / stop_times
    3. Connectivity - bus-bus, bus-train, train-bus
    4. Make Merhav tools based on real-time http://miu.org.il/miu/MIU_v4/docs_activities/transportation/transitanalyst/indexh.html
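
The "Numbers" output in 1.iii.e could be computed roughly as follows. This is only a sketch under assumptions: the `Point` shape, the function names, and treating the first recorded point as the actual departure are all hypothetical and not taken from the real SIRI data model.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass
class Point:
    """One recorded bus position (hypothetical shape, not the real SIRI schema)."""
    time: datetime
    lat: float
    lon: float


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(h))


def trip_summary(points: list[Point], aimed_departure: datetime) -> dict:
    """Summarise one trip: coordinate count, departure delay, average and top speed."""
    points = sorted(points, key=lambda p: p.time)
    if len(points) < 2:
        raise ValueError("need at least two points")
    total_km = 0.0
    top_kmh = 0.0
    for prev, cur in zip(points, points[1:]):
        hours = (cur.time - prev.time).total_seconds() / 3600
        if hours <= 0:
            continue  # skip points that share a timestamp
        km = haversine_km(prev, cur)
        total_km += km
        top_kmh = max(top_kmh, km / hours)
    total_hours = (points[-1].time - points[0].time).total_seconds() / 3600
    return {
        "coordinates": len(points),
        # Assumption: the first recorded point marks the actual departure.
        "departure_delay_min": (points[0].time - aimed_departure).total_seconds() / 60,
        "average_kmh": total_km / total_hours if total_hours > 0 else 0.0,
        "top_kmh": top_kmh,
    }
```

Top speed computed between consecutive points is sensitive to GPS noise, so a real implementation would probably need some smoothing or outlier filtering.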

cjer avatar Mar 04 '19 15:03 cjer

@cjer, a few questions/thoughts about the use cases:

1.ii.b: "and an optional route" - what do you mean? How do you think the user can give a route as an input? Maybe a destination city/stop, or a stop in (middle of) the route? Same question for 2.ii and 3.ii ("route filter").

1.iii.e: what do you mean by "how many coordinates?" Why would someone be interested in it?

2.ii: IMO "time range" should be something like a choice between Today/Yesterday/Last week/Last month/Custom (also for 3.ii).
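
To make the preset idea concrete, here is a minimal sketch of how the BE might turn such a choice into a start/end pair. The preset names and the definitions of "last week" (previous 7 full days) and "last month" (previous 30 full days) are assumptions for illustration, not something decided in this thread.

```python
from datetime import date, datetime, time, timedelta


def resolve_time_range(preset, today, custom=None):
    """Map a preset name to a half-open [start, end) pair of datetimes.

    `today` is a date; `custom` is an optional (start, end) tuple of datetimes.
    """
    midnight = datetime.combine(today, time.min)
    if preset == "today":
        return midnight, midnight + timedelta(days=1)
    if preset == "yesterday":
        return midnight - timedelta(days=1), midnight
    if preset == "last_week":
        # Assumption: the 7 full days before today.
        return midnight - timedelta(days=7), midnight
    if preset == "last_month":
        # Assumption: the 30 full days before today, not a calendar month.
        return midnight - timedelta(days=30), midnight
    if preset == "custom":
        if custom is None or custom[0] >= custom[1]:
            raise ValueError("custom range needs start < end")
        return custom
    raise ValueError(f"unknown preset: {preset}")
```

The resulting pair could be passed as the startTime and endTime parameters of the queries.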

2.iii.c: I think they should also be by weekday.

2.iii.d: How do you plan to use it?

3.ii: How can the user choose a road segment? I can think of selecting two points and taking the path between them, but it will get more complex if they're not on the same road...

3.iii: Where do we get congestion data from? Did you mean density of buses?

OrBin avatar Mar 04 '19 17:03 OrBin

Today I've talked to Harel Mazor from Israel Hiking Map about databases.

They decided to use elasticsearch, mainly because of its advanced textual search capabilities, but later discovered that it has good geospatial capabilities. I don't think the textual search capabilities are relevant for us, since any textual search we'll want is address search, which can and should be performed by a geocoder (in OSM or Google Maps).

Harel also noted that elasticsearch is easy and comfortable to work with since it's fast and NoSQL, and works easily on both Linux and Windows. On the other hand, he generally dislikes relational databases, and that may be part of the reason he did not want to use PostGIS.

I think PostGIS has an advantage in our case, since our data is relational (at least, that's what I've understood so far). Harel noted that he found PostGIS hard to work with on Windows (he started the project about 5 years ago, so this may be outdated, but it's worth a double-check anyway).

OrBin avatar Mar 04 '19 18:03 OrBin

Tasks separation from today's discussion:

  1. Define the API between the FE and the BE, including response examples - Should be done before the hackathon.
  2. Create the BE as a REST API server returning mock data.
  3. Use the mock to create a FE and display the data on a map.
    1. Insert the GTFS and SIRI data into the database efficiently. This task may include modifying gtfs_stats.
    2. Make the BE actually query the DB.

OrBin avatar Mar 04 '19 20:03 OrBin

API definition (draft)

Case 1

GET /transport?routeId=<routeId>&originAimedDepartureTime=<time> Returns a single transport

GET /transport?stopLocation=<location>&startTime=<startTime>&endTime=<endTime> Returns multiple transports (all transports that fit the condition)

GET /transport?stopLocation=<location>&startTime=<startTime>&endTime=<endTime>&routeId=<routeId> Returns multiple transports (all transports that fit the condition)

Case 2

GET /routeStats?routeIds=<routeIds>&startTime=<startTime>&endTime=<endTime>
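
As a rough illustration of what a /routeStats handler might compute for Case 2, the sketch below counts no-show/late/early trips per route and per day. The thresholds, label names, and the input tuple shape are hypothetical; the discussion does not define what counts as "late" or "early".

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Hypothetical thresholds; not defined anywhere in this discussion.
LATE_AFTER = timedelta(minutes=5)
EARLY_BEFORE = timedelta(minutes=-1)


def classify(aimed, actual):
    """Label one trip by comparing its actual departure with the aimed departure."""
    if actual is None:
        return "no_show"
    delay = actual - aimed
    if delay > LATE_AFTER:
        return "late"
    if delay < EARLY_BEFORE:
        return "early"
    return "on_time"


def route_stats(trips, start, end):
    """Count trip labels per (route, day) for trips with start <= aimed < end.

    `trips` is an iterable of (route_id, aimed_departure, actual_departure_or_None).
    """
    stats = defaultdict(Counter)
    for route_id, aimed, actual in trips:
        if start <= aimed < end:
            key = (route_id, aimed.date().isoformat())
            stats[key][classify(aimed, actual)] += 1
    return {key: dict(counts) for key, counts in stats.items()}
```

Binning by day is just one choice; the same counting works for hourly or weekly bins by changing how the key is built.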

OrBin avatar Mar 11 '19 18:03 OrBin

Changed transport to trip and added a path for getting a trip by identifier.

Trips API definition (draft)

GET v1/trips/{trip identifier} Returns single trip

GET v1/trips?routeId=<routeId>&originAimedDepartureTime=<time> Returns a single trip

GET v1/trips?stopLocation=<location>&startTime=<startTime>&endTime=<endTime> Returns multiple trips (all trips that fit the condition)

GET v1/trips?stopLocation=<location>&startTime=<startTime>&endTime=<endTime>&routeId=<routeId> Returns multiple trips (all trips that fit the condition)
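
One possible way for a server to tell these request forms apart is sketched below, using only the Python standard library. The function name and the returned labels ("by_id", "single", "by_stop") are hypothetical; the parameter names are the ones from the draft above.

```python
from urllib.parse import parse_qs, urlsplit


def parse_trips_request(url):
    """Return (kind, params) for a GET v1/trips URL from the draft."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if segments[:2] != ["v1", "trips"] or len(segments) > 3:
        raise ValueError("not a v1/trips URL")
    if len(segments) == 3:
        # v1/trips/{trip identifier}
        return "by_id", {"tripId": segments[2]}
    # Keep the first value of each query parameter.
    query = {k: v[0] for k, v in parse_qs(parts.query).items()}
    if {"routeId", "originAimedDepartureTime"} <= query.keys():
        return "single", query
    if {"stopLocation", "startTime", "endTime"} <= query.keys():
        # routeId, if present, stays in the params as an optional filter.
        return "by_stop", query
    raise ValueError("unsupported parameter combination")
```

For example, `parse_trips_request("/v1/trips/abc")` yields the "by_id" form, while a URL carrying stopLocation, startTime and endTime yields "by_stop" whether or not routeId is given.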

AvivSela avatar Mar 11 '19 20:03 AvivSela

Created mock for the backend - see #160

OrBin avatar Mar 18 '19 19:03 OrBin

Here is a transcription of a lecture at the Open Source GIS 2019 conference. Pages 4-6 have some useful information about DBs for GIS.

OrBin avatar Jul 13 '19 07:07 OrBin