bahn.guru

Use median instead of minimum price

Open benjaminweb opened this issue 7 years ago • 20 comments

The minimum price can mislead. Consider a month in which most days show 19 EUR. On some days, most fares really are 19 EUR. On other days, most fares cost a multiple of 19 EUR, yet a single odd connection at 19 EUR sets the displayed minimum.
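To illustrate the point, here is a minimal sketch comparing minimum and median for two days. The fares are made up for illustration and `summarize` is a hypothetical helper, not part of bahn.guru.

```python
from statistics import median

# Hypothetical fares in EUR; the numbers are invented for illustration only.
day_a = [19.0, 19.0, 19.0, 29.0, 19.0]  # most connections really cost 19 EUR
day_b = [19.0, 57.0, 76.0, 57.0, 95.0]  # one 19 EUR outlier among expensive fares


def summarize(fares):
    """Return (minimum, median) of one day's fares."""
    return min(fares), median(fares)


print(summarize(day_a))
print(summarize(day_b))
```

Both days share the same minimum, but only the median reveals that the second day is mostly expensive.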

Just my bits…

benjaminweb avatar Jan 31 '18 22:01 benjaminweb

I'm sorry for not answering this in 4 months 🙈 In theory, I agree with your point, but since people are often looking for the lowest possible price, displaying the median price probably wouldn't help them. But it might make sense to have some other form of sorting, e.g. considering not only the price but the price per travel time (price maybe squared). I will think about this again.

juliuste avatar May 31 '18 09:05 juliuste

Why not show a graph per day, with a very low opacity, behind the lowest price?

derhuerst avatar May 31 '18 09:05 derhuerst

Which information should this graph display?

juliuste avatar Jun 10 '18 09:06 juliuste

A distribution of the prices of all tickets available for that day.

Also the development of the minimum price in the past days would be interesting.

derhuerst avatar Jun 10 '18 12:06 derhuerst

"A distribution of the prices of all tickets available for that day."

Sparklines? A simpler variant (a kind of histogram with only 3 bars): Q1 (25 % quantile), Q2 (= median) and Q3 (75 % quantile) would serve that purpose, cf. [1].

"Also the development of the minimum price in the past days would be interesting."

  • X% (superscript to Q1 through Q3) would denote the markup in the last 7 days (or choose a sensible period). Will this be the main decision criterion for the user? If yes, then there is nothing more important than this.

Prices surge in the last days before departure; for weeks before that, the price remains stable. Why not vary the time granularity of the price display accordingly?

Just an idea: "Book no later than … days before travel to escape a price increase of x%."

[1]: https://en.wikipedia.org/wiki/Interquartile_range
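A rough sketch of the three-bar idea, assuming the fares of one day are available as a plain list. `quartiles` and `markup_percent` are hypothetical helper names; `statistics.quantiles` is from the Python standard library (3.8+), and the "inclusive" method is just one possible choice.

```python
from statistics import quantiles


def quartiles(fares):
    """Return (Q1, Q2, Q3) of a day's fares; Q2 is the median."""
    q1, q2, q3 = quantiles(fares, n=4, method="inclusive")
    return q1, q2, q3


def markup_percent(old_price, new_price):
    """Percentage change from old_price to new_price, e.g. over the last 7 days."""
    return (new_price - old_price) / old_price * 100


# Invented sample fares in EUR.
print(quartiles([59, 19, 39, 29, 49]))
print(markup_percent(20.0, 25.0))
```

The markup would then be computed per quartile, comparing today's value with the value from the chosen period before.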

benjaminweb avatar Jun 26 '18 23:06 benjaminweb

I cobbled something together for myself, which looks like this: https://github.com/thigg/bahn.guru/commit/9f2a7766bdfe08fd48eeca9d0d8685226d95633d#commitcomment-30855700

I would submit a pull request, if you are interested. For each day, the cheapest price at each hour is shown as a graph.

image
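The "cheapest price at each hour" aggregation could look roughly like the following. This is only a sketch, not the code from the linked commit; it assumes connections come as (departure datetime, price) pairs, and `cheapest_per_hour` is a made-up name.

```python
from datetime import datetime


def cheapest_per_hour(connections):
    """Map each departure hour (0-23) to the lowest price departing in that hour."""
    best = {}
    for departure, price in connections:
        hour = departure.hour
        if hour not in best or price < best[hour]:
            best[hour] = price
    return dict(sorted(best.items()))


# Invented sample data for a single day.
sample = [
    (datetime(2018, 10, 12, 6, 5), 39.9),
    (datetime(2018, 10, 12, 6, 40), 19.9),
    (datetime(2018, 10, 12, 9, 15), 29.9),
]
print(cheapest_per_hour(sample))
```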

thigg avatar Oct 10 '18 22:10 thigg

First trial balloon on bahn.jetzt.

@juliuste Is it okay to trigger concurrent requests on bahn.guru as implemented now? Requesting permission ;-).

screen shot 2018-11-11 at 18 06 09

screen shot 2018-11-11 at 19 52 52

benjaminweb avatar Nov 11 '18 17:11 benjaminweb

@thigg Thank you very much for your work, I'm sorry that I didn't see your answer before. @derhuerst is this what you had in mind?

juliuste avatar Nov 12 '18 16:11 juliuste

@benjaminweb Looking great 😮 Feel free to PR

juliuste avatar Nov 12 '18 16:11 juliuste

Oh, sorry. I forgot to link to the repo: https://bitbucket.org/hyllos/bahn_preis_vtl. Any idea how to integrate this Python script into JS or vice versa?

benjaminweb avatar Nov 14 '18 22:11 benjaminweb

@benjaminweb can you explain what the first graphic is depicting? I understand it as the percentage of prices in each price range.

We have quite different approaches now; maybe we should think about how they can be added to the overall UI.

@juliuste

thigg avatar Nov 16 '18 12:11 thigg

@thigg It shows the share of fares falling into each price bracket, for the specific date of your train journey.
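As a rough sketch of what "share of fares per price bracket" means, assuming fixed-width brackets (the 20 EUR width and the name `bracket_shares` are assumptions, not taken from the prototype):

```python
from collections import Counter


def bracket_shares(fares, width=20.0):
    """Return {lower bound of bracket: share of fares} for fixed-width price brackets."""
    counts = Counter(int(fare // width) * width for fare in fares)
    total = len(fares)
    return {lower: count / total for lower, count in sorted(counts.items())}


# Invented fares in EUR for one travel date.
print(bracket_shares([19.9, 25.9, 29.9, 45.9]))
```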

I've just prototyped. It turned out to be a different animal than expected.

benjaminweb avatar Nov 16 '18 21:11 benjaminweb

…peeking 180 days in advance:

screen shot 2018-11-18 at 07 51 56

benjaminweb avatar Nov 18 '18 06:11 benjaminweb

Current State

https://bahn.jetzt/$variant/$startStationId/$stopStationId/$daysAhead and https://bahn.jetzt/$variant/$startStationName/$stopStationName/$daysAhead return embeddable SVG

Examples (100 days in advance Hamburg -> Munich)

Price (Preis): http://bahn.jetzt/preis/Hamburg/München/14

screen shot 2018-11-26 at 02 04 02

Duration (Dauer): http://bahn.jetzt/dauer/Hamburg/München/14

screen shot 2018-11-26 at 02 03 07

EUR/min: http://bahn.jetzt/gewichtet/Hamburg/München/14

screen shot 2018-11-26 at 02 01 23

code lives at https://bitbucket.org/hyllos/bahn_preis_vtl

Possible Integration

=> bahn.guru calls bahn.jetzt with stationIds or stationNames and embeds svg
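A minimal sketch of that integration, assuming the URL scheme from "Current State" above. `chart_url` and `embed_tag` are hypothetical helpers, and embedding via an `<img>` tag is only one possible way to include the SVG.

```python
from html import escape
from urllib.parse import quote

BASE_URL = "https://bahn.jetzt"


def chart_url(variant, start, stop, days_ahead):
    """Build a chart URL from variant, station id or name, and days ahead."""
    parts = [quote(str(p), safe="") for p in (variant, start, stop)]
    return f"{BASE_URL}/{'/'.join(parts)}/{int(days_ahead)}"


def embed_tag(url):
    """Return an HTML snippet that embeds the chart (assumption: plain <img>)."""
    return f'<img src="{escape(url, quote=True)}" alt="price chart">'


url = chart_url("preis", "Hamburg", "München", 14)
print(url)
print(embed_tag(url))
```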


TODO/IDEAS

o subclass group to view hours in description of hover
o include booking links
o extend API by further options
o tests: currently only partly
o flixbus?
o overview of top 10 relations?
o shorten loading time (it's already asynchronous): how?
o group durations per price

  {105.9: {'5:40', '5:41', '6:04', '6:36', '6:14', '5:51', '5:59', '6:47', '6:30', '5:42', '5:48', '5:45', '5:39', '5:37', '6:22', '5:44', '6:19', '6:25'},
   125.9: {'9:38', '6:30', '11:29', '5:48', '6:22', '6:24', '6:09', '5:41', '5:59', '9:21', '5:44', '7:05', '5:40', '5:38', '6:19', '6:25', '6:04', '6:14', '5:42', '10:01', '5:37', '5:39', '5:45', '11:30'},
   133.9: {'6:04', '5:41', '6:14', '5:59', '5:42', '9:21', '5:45', '5:40', '6:19'},
   139.9: {'6:04', '5:41', '6:14', '5:42', '5:37', '5:45', '5:44', '7:05', '5:40', '11:30'},
   150.0: {'6:04', '5:41', '6:14', '9:21', '5:42', '11:19', '5:48', '5:45', '5:39', '5:37', '6:22', '10:01', '5:44', '7:05', '5:52', '11:16', '5:40', '11:30'},
   75.9: {'11:08', '5:40', '5:41', '9:38', '10:18', '10:21', '9:21', '5:42', '5:48', '5:45', '10:01', '11:41', '6:22', '5:37', '5:44', '5:39', '8:19', '6:19'},
   89.9: {'11:08', '6:30', '5:48', '6:09', '5:41', '5:53', '9:21', '5:44', '7:05', '10:30', '5:40', '9:23', '10:21', '5:43', '6:38', '5:52', '6:19', '6:25', '5:42', '6:18', '10:01', '5:45', '5:39', '5:37'},
   157.5: {'5:53', '5:42', '7:05', '11:30'},
   115.9: {'5:42'},
   67.9: {'5:41', '6:30', '5:42', '5:44'},
   29.9: {'5:41'},
   45.9: {'5:41', '9:21'},
   59.9: {'5:41', '5:42'},
   25.9: {'5:41', '9:21'},
   47.9: {'5:41', '9:21'},
   95.9: {'5:41'},
   19.9: {'6:14'},
   49.9: {'5:40'}}
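A mapping like the one under "group durations per price" could be built roughly as follows. This is a sketch, assuming connections arrive as (price, duration string) pairs; `durations_by_price` is a hypothetical name, not the function in the repo.

```python
from collections import defaultdict


def durations_by_price(connections):
    """Group travel durations (as 'h:mm' strings) by price."""
    grouped = defaultdict(set)
    for price, duration in connections:
        grouped[price].add(duration)
    return dict(grouped)


# A few (price, duration) pairs consistent with the dump above.
pairs = [(29.9, "5:41"), (45.9, "5:41"), (45.9, "9:21")]
print(durations_by_price(pairs))
```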

Recent Changes

o FIX: now display actual count of relations.
o show price, duration and duration weighted price, see links above (addressing concern by @juliuste)
o https://bahn.jetzt allows stationIds AND stationNames (powered by bahn-station-api)
o y-axis: matches now absolute relation count (instead of %)
o error message if unknown station specified
o bahn.guru dependency: released -> directly accessing bahn.de
~~o stationIds: (temporarily) instead of names~~
o colours: switch to CleanStyle
o bars: make absolute (relations per day) instead of (percent of relations per day)
o relations too early to book: skip them
o legend labels: shortened

Discussion: False positives mixed with positives

Problem statement:

scenario 1: 10 hour connection, price: EUR 100
scenario 2: 5 hour connection, price: EUR 50

weighted approach: produces same number for both: 10 EUR/hr

False positives:
a. high price, high hours => user not interested
b. normal price, high hours => user not interested

Positives:
c. low price, normal hours => user interested

I. determine (A) low & (B) normal hours corridors
II. determine low price corridor
III. intersect: pick those relations within (A), (B) and II.
IV. highlight matches from III.


simplify: => toss high hours.
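A small sketch of the problem and of the simplified filter, assuming connections are (price in EUR, duration in hours) pairs. The thresholds `max_hours` and `max_price` are placeholders for the corridors, which would still have to be determined.

```python
def price_per_hour(price, hours):
    """Duration-weighted price in EUR per hour."""
    return price / hours


def highlight(connections, max_hours, max_price):
    """Toss connections with high hours, then keep those within the low price corridor."""
    return [(price, hours) for price, hours in connections
            if hours <= max_hours and price <= max_price]


# Both scenarios yield the same weighted price, so it cannot tell them apart.
print(price_per_hour(100, 10), price_per_hour(50, 5))

# Filtering on hours and price separately keeps only the interesting connection.
print(highlight([(100, 10), (50, 5), (120, 5)], max_hours=6, max_price=60))
```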

benjaminweb avatar Nov 25 '18 00:11 benjaminweb

@juliuste => How can we take this forward?

benjaminweb avatar Nov 25 '18 21:11 benjaminweb

Great work 👍

It would be really cool to have this in the /calendar view of bahn.guru (and maybe also in the day view, showing hours instead of days). However - at least for the calendar - we should discuss if people should be able to switch between the "normal" view and this one or if we should just display both (e.g. the new view above the calendar, like on flight websites).

We also need to check how this looks on mobile.

I already have one request, though 😄 Could we move the diagram key from the left to the bottom of the chart (or the top) and maybe reduce the height of the y-axis a little, so that we have something with an aspect ratio closer to 3:1 rather than 3:2? That would make it easier to add the diagram above/below the current calendar.

juliuste avatar Dec 01 '18 23:12 juliuste

Thanks for your feedback.

Let's rethink the architecture before creating chaos ;-):

o get_prices: factor out into dedicated prices API
o plot_chart
o draw_calendar

That prices API would enable others to create things we do not even dream of. I can't say when I will be able to devote time to the prices API.

btw: would it be a prices API or a Sparpreis API only?

benjaminweb avatar Dec 02 '18 10:12 benjaminweb

Update

ø get_prices: factor out into dedicated prices API
ø plot_chart: renewed
ø draw_calendar (segment): all routes (except /stationId) return HTML by default, json if Accept: application/json is part of headers

  • first blueprint of prices API finished; code & minimal documentation lives at https://bitbucket.org/hyllos/sparpreis-api
  • https://bahn.jetzt/chart?tag=2019-01-10&von=Hamburg&nach=München&vorschau=50
    screen shot 2019-01-04 at 16 38 09
  • connections behind a bar segment: https://bahn.jetzt/preisSpanne?tag=2019-02-03&vorschau=0&von=Hamburg&nach=München&untererPreis=68.0&obererPreis=160
  • integrated route /günstige into /fahrten in favour of QueryFlag &günstige

=> what's next?

Conceptualise entry page with search boxes, similar to bahn.guru's root page.

What's the coverage of the connections? All fares or only Sparpreise (saver fares)?

benjaminweb avatar Jan 04 '19 03:01 benjaminweb

Update: Version 0.1.8.1 of sparpreis-api

https://bahn.jetzt

  • [x] MVP entry page with search box added: https://bahn.jetzt, repo: https://bitbucket.org/hyllos/sparpreis-api
  • [x] implement error handling
  • [x] make time frame configurable
  • [x] select cheapest connections based on specific price per travel duration (and differentiate this with increasing travel duration)
  • [x] implement BahnCard config
  • [x] streamline landing page (not perfect but done)
  • [x] fix "buy ticket" links (in particular for Berlin Hbf (tief))
  • [x] fix fahrten route: now returns an aggregate of days if a single day does not return a connection

TODO:

  • [ ] document REST-API with servant-swagger
  • [ ] autocomplete for von and nach
  • [ ] fancy calendar
  • [ ] error handling for days input

benjaminweb avatar Apr 15 '19 12:04 benjaminweb

@juliuste "since people are often looking for the lowest possible price, displaying the median price probably wouldn't help them. But it might make sense to have some other form of sorting, e.g. considering not only the price but the price per travel time (price maybe squared). I will think about this again."

@derhuerst That graph thing seems to go off course…

What about showing the cheapest price with its (shortest) travel time and the shortest travel time with its (cheapest) price?
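In code, that suggestion might look like this. It is only a sketch, assuming connections are (price in EUR, travel time in minutes) pairs; both function names are made up.

```python
def cheapest_with_shortest_time(connections):
    """Lowest price; ties broken by the shortest travel time."""
    return min(connections, key=lambda c: (c[0], c[1]))


def fastest_with_cheapest_price(connections):
    """Shortest travel time; ties broken by the lowest price."""
    return min(connections, key=lambda c: (c[1], c[0]))


# Invented sample: (price in EUR, travel time in minutes).
sample = [(19.9, 374), (19.9, 400), (75.9, 337), (105.9, 337)]
print(cheapest_with_shortest_time(sample))
print(fastest_with_cheapest_price(sample))
```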

nepumuk-fs avatar Sep 24 '21 08:09 nepumuk-fs