osu-api
Unable to retrieve all ranked/loved maps using "standard" download flow
The cause is that all loved maps in one "loved batch" share the exact same approved_date.
Steps to reproduce: call https://osu.ppy.sh/api/get_beatmaps?since=2018-06-11%2019:40:06&k=key. Several loved maps are missing from that response (notice how the maps at the end all have the exact same approved_date), e.g. mapId:958038.
The standard flow is to take the last approved_date and re-fire the request with the updated since param until fewer than 500 maps are returned.
I've already created an iffy workaround for osustats, but I thought I would report this anyway.
Retrieving that beatmap's information directly with https://osu.ppy.sh/api/get_beatmaps?k=KEY&b=958038 gives
...
"approved_date": "2018-06-25 02:05:26",
...
which is off from what the website says (loved on 24 June 2018). This is possibly an osu-web issue rather than an osu-api issue.
Whoops, I see my mistake. The website is in UTC-4 (for me) and API responses are in UTC. Disregard my response.
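As a quick check of the timezone explanation above, here is a small sketch that shifts the approved_date from the API response (UTC) to a fixed UTC-4 offset. The fixed offset is an assumption taken from the comment; the website's actual timezone handling is not shown in this thread.

```python
from datetime import datetime, timedelta, timezone

# Value returned by the API for beatmap 958038 (UTC, per the comment above).
api_value = "2018-06-25 02:05:26"

# Attach the UTC timezone, then convert to a fixed UTC-4 offset (assumed).
utc_time = datetime.strptime(api_value, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
local_time = utc_time.astimezone(timezone(timedelta(hours=-4)))

print(local_time.strftime("%Y-%m-%d %H:%M:%S"))  # 2018-06-24 22:05:26
```

The converted value falls on 24 June, which matches the date shown on the website.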
Are you saying that over 500 maps were "loved" in one go?
No, not 500. But if you fetch maps starting from the first ranked map ever, you eventually end up with a URL whose response contains only a fraction of a loved batch. When you make the next get_beatmaps request after the one I mentioned (I guess that's missing from my steps), you end up skipping some of these maps because they share the same date.
If there aren't 500 beatmaps with the same date, isn't calling with 500 pagination going to work okay?
Not necessarily. Let's use the link from my first post as an example:
- You fetch that link's contents -> 500 maps
- Extract the last (or max) approved_date. In this case 2018-06-25 02:05:26
- Create a new request with the date above and get the results -> 500 maps since that date, even though there are more maps with that exact date that weren't included in the previous request (because of the 500 limit)
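The steps above can be reproduced offline with a small simulation. This is only a sketch: `make_fake_api` and `naive_download` are hypothetical names, the page limit is shrunk from 500 to 3, and the fake API assumes that `since` is exclusive and that results are ordered by approved_date, which is what the behaviour described in this thread suggests.

```python
def make_fake_api(maps, page_size):
    """Hypothetical stand-in for get_beatmaps: returns at most page_size maps
    approved strictly after `since` (assumed behaviour), oldest first."""
    def fetch_page(since):
        later = [m for m in maps if m["approved_date"] > since]
        later.sort(key=lambda m: (m["approved_date"], m["beatmap_id"]))
        return later[:page_size]
    return fetch_page


def naive_download(fetch_page, since, page_size):
    """The 'standard flow': reuse the max approved_date as the next `since`."""
    collected = []
    while True:
        page = fetch_page(since)
        collected.extend(page)
        if len(page) < page_size:
            return collected
        # Fixed-width "YYYY-MM-DD HH:MM:SS" strings sort chronologically.
        since = max(m["approved_date"] for m in page)


# Maps 3 and 4 belong to the same batch and share one approved_date.
maps = [
    {"beatmap_id": 1, "approved_date": "2018-06-25 02:00:00"},
    {"beatmap_id": 2, "approved_date": "2018-06-25 02:00:01"},
    {"beatmap_id": 3, "approved_date": "2018-06-25 02:00:02"},
    {"beatmap_id": 4, "approved_date": "2018-06-25 02:00:02"},
    {"beatmap_id": 5, "approved_date": "2018-06-25 02:00:03"},
]

fetch_page = make_fake_api(maps, page_size=3)
downloaded = naive_download(fetch_page, "2018-06-25 01:59:59", page_size=3)

got = {m["beatmap_id"] for m in downloaded}
missing = sorted(m["beatmap_id"] for m in maps if m["beatmap_id"] not in got)
print(missing)  # [4]
```

The first page is cut in the middle of the shared timestamp, so map 4 never shows up in any later page.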
Thus, yes, the API is working as intended, but I would highly suggest retroactively updating the approved_date of loved maps so that it is not identical within the same loved batch.
And now that I think of it, the same could happen for ranked maps, since IIRC the maps in a ranked set also share the same approved_date.
I ran into this problem as well when trying to get ranked+loved maps, so I just subtracted 1 second from last approved_date, then got rid of duplicates. Not sure if there's any better way to do this.
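A minimal sketch of that workaround follows. It assumes `fetch_page(since)` is a caller-supplied function (hypothetical, not part of any library) that wraps the get_beatmaps call and returns a list of dicts with `beatmap_id` and `approved_date` keys, and that `since` is exclusive. Duplicates from the overlapping pages are dropped by keying on `beatmap_id`.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def download_with_overlap(fetch_page, since, page_size=500):
    """Page through results, stepping `since` back one second each time so
    maps that share the last approved_date are fetched again, then deduplicate."""
    since_dt = datetime.strptime(since, DATE_FORMAT)
    by_id = {}
    while True:
        page = fetch_page(since_dt.strftime(DATE_FORMAT))
        for beatmap in page:
            by_id[beatmap["beatmap_id"]] = beatmap  # later copies overwrite earlier ones
        if len(page) < page_size:
            return list(by_id.values())
        last = max(datetime.strptime(b["approved_date"], DATE_FORMAT) for b in page)
        next_since = last - timedelta(seconds=1)
        if next_since <= since_dt:
            # A full page sharing a single timestamp: this workaround cannot advance.
            raise RuntimeError("page_size or more maps share one approved_date")
        since_dt = next_since
```

The guard matters because the trick only works while fewer than `page_size` maps share a single approved_date; otherwise the loop would request the same page forever.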
I just subtracted 1 second from last approved_date
That's exactly the same workaround I added to osustats before reporting this..
Either this should get fixed one way or another, or the wiki should mention this "gotcha".
This will be fixed with proper pagination in api v2. Not sure if it will be addressed on v1.
Have you tried using v2 for your purpose?
No, I have not.
I remember mentions of a live feed of scores being added later on. Is that planned at some point, or am I just misinformed? (like https://osu.ppy.sh/p/events in API form)
I also did the subtracting 1 second thing about a year ago. A better fix is, instead of using the since parameter, whose values may be identical, to paginate by beatmap id, which is guaranteed to be unique.
If you don't need the maps since the beginning of the month you can use data.ppy.sh to access the whole table at once. #193