jikan-rest
Duplicated data in request
I am trying to fetch anime data with Postman using this request: https://api.jikan.moe/v4/anime?q=shingeki+no+Kyojin&type=tv&min_score=8
which successfully returns 6 different anime:
- Shingeki no Kyojin
- Shingeki no Kyojin Season 3
- Shingeki no Kyojin Season 2
- Shingeki no Kyojin: The Final Season
- Shingeki no Kyojin: The Final Season Part 2
- Shingeki no Kyojin Season 3 Part 2
Now, if I change the request to fetch the anime one at a time using pagination (with the page and limit arguments), I get:
https://api.jikan.moe/v4/anime?type=tv&min_score=8&q=shingeki+no+Kyojin&limit=1&page=1
- Shingeki no Kyojin
https://api.jikan.moe/v4/anime?type=tv&min_score=8&q=shingeki+no+Kyojin&limit=1&page=2
- Shingeki no Kyojin: The Final Season
https://api.jikan.moe/v4/anime?type=tv&min_score=8&q=shingeki+no+Kyojin&limit=1&page=3
- Shingeki no Kyojin Season 2
https://api.jikan.moe/v4/anime?type=tv&min_score=8&q=shingeki+no+Kyojin&limit=1&page=4
- Shingeki no Kyojin: The Final Season
https://api.jikan.moe/v4/anime?type=tv&min_score=8&q=shingeki+no+Kyojin&limit=1&page=5
- Shingeki no Kyojin: The Final Season Part 2
https://api.jikan.moe/v4/anime?type=tv&min_score=8&q=shingeki+no+Kyojin&limit=1&page=6
- Shingeki no Kyojin: The Final Season Part 2
(The order and the duplicated items can differ between attempts.)
As we can see, there is duplicate data, which I don't think is normal, unless I've misunderstood something.
Can you help me?
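To make the symptom easier to check, here is a minimal sketch that compares the paginated titles listed above against the unpaginated list. The helper name `find_repeats` is hypothetical, and the input is just the titles copied from this report rather than a live API call.

```python
from collections import defaultdict

def find_repeats(pages, key=lambda item: item):
    """Map each key seen more than once to the 1-based pages it appeared on."""
    seen = defaultdict(list)
    for page_number, items in enumerate(pages, start=1):
        for item in items:
            seen[key(item)].append(page_number)
    return {k: v for k, v in seen.items() if len(v) > 1}

# Titles returned by the single unpaginated request.
full_list = [
    "Shingeki no Kyojin",
    "Shingeki no Kyojin Season 3",
    "Shingeki no Kyojin Season 2",
    "Shingeki no Kyojin: The Final Season",
    "Shingeki no Kyojin: The Final Season Part 2",
    "Shingeki no Kyojin Season 3 Part 2",
]

# Titles returned by the six limit=1 requests, pages 1 to 6.
pages = [
    ["Shingeki no Kyojin"],
    ["Shingeki no Kyojin: The Final Season"],
    ["Shingeki no Kyojin Season 2"],
    ["Shingeki no Kyojin: The Final Season"],
    ["Shingeki no Kyojin: The Final Season Part 2"],
    ["Shingeki no Kyojin: The Final Season Part 2"],
]

repeats = find_repeats(pages)
paged_titles = {title for page in pages for title in page}
missing = [title for title in full_list if title not in paged_titles]

print("repeated:", repeats)
print("missing:", missing)
```

With this input, two titles come back twice and two titles from the full list never appear at all, so the pagination is both repeating and skipping entries.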
Sorry for the late response. I think this is a caching error on the load balancer, given that the data is coming from multiple instances of Jikan. But I don't understand why the ordering would be different. 🤔
I will take a detailed look into this soon. Good find!
Currently, https://api.jikan.moe/v4/anime?q=sao&sfw returns id 55994 twice. Do we need to filter duplicates manually? A year and a half later, this issue is still present.
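Until this is fixed server-side, a client-side workaround could look like the sketch below. It assumes each result is a dict carrying a `mal_id` field; the function name `dedupe_by_id`, the second id, and the placeholder titles are made up for illustration, and only the id 55994 comes from this report.

```python
def dedupe_by_id(items, id_field="mal_id"):
    """Return items with repeated ids dropped, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for item in items:
        item_id = item[id_field]
        if item_id not in seen:
            seen.add(item_id)
            unique.append(item)
    return unique

# Hypothetical sample: id 55994 appears twice, as in the report above.
results = [
    {"mal_id": 55994, "title": "placeholder A"},
    {"mal_id": 1, "title": "placeholder B"},
    {"mal_id": 55994, "title": "placeholder A"},
]

unique_results = dedupe_by_id(results)
print([item["mal_id"] for item in unique_results])
```

Note that this only hides repeated entries. It cannot recover entries that pagination skipped.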
I've triaged this as best I could: I couldn't reproduce this behaviour locally or on my own self-hosted instance. I believe this is a cache/infrastructure issue. In the case of the public API there are multiple workers, and there can be race conditions where two workers try to scrape the same anime/manga and both insert it into MongoDB. There is no locking mechanism to synchronise the workers. I'm going to create an issue for this and reference all the similar ones under it.
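As a rough illustration of the suspected race, here is a toy in-memory model. It is not Jikan's actual code: `ListStore`, `KeyedStore`, and the fixed interleaving are assumptions made for the sketch. It shows how two workers that both check before either inserts end up with two copies, and how keying writes by id (the role a unique index or an upsert would play in a real database) avoids that.

```python
class ListStore:
    """Toy collection with no uniqueness constraint."""
    def __init__(self):
        self.docs = []

    def exists(self, mal_id):
        return any(doc["mal_id"] == mal_id for doc in self.docs)

    def insert(self, doc):
        self.docs.append(doc)


class KeyedStore:
    """Toy collection keyed by mal_id, so a second write replaces the first."""
    def __init__(self):
        self.docs = {}

    def upsert(self, doc):
        self.docs[doc["mal_id"]] = doc


def check_then_insert_race(store, doc):
    """Simulate one unlucky interleaving: both workers check before either inserts."""
    worker_a_sees_it = store.exists(doc["mal_id"])
    worker_b_sees_it = store.exists(doc["mal_id"])
    if not worker_a_sees_it:
        store.insert(dict(doc))
    if not worker_b_sees_it:
        store.insert(dict(doc))


scraped = {"mal_id": 55994, "title": "placeholder"}

racy = ListStore()
check_then_insert_race(racy, scraped)

keyed = KeyedStore()
keyed.upsert(dict(scraped))  # worker A
keyed.upsert(dict(scraped))  # worker B

print(len(racy.docs), len(keyed.docs))
```

The keyed variant needs no lock between workers for this particular case, because writing the same id twice is idempotent.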