UserWarning: Not all PushShift shards are active. Query results may be incomplete
Hi,
This morning I got the following warning:
UserWarning: Not all PushShift shards are active. Query results may be incomplete
  warnings.warn(shards_down_message)
I made my query extra small, so the size of the query shouldn't be a problem. I've also noticed that api.pushshift.io doesn't return any data for the past 5 hours (setting after=5h returns nothing, but after=6h does).
Just curious what the reason for this warning is and what it means for me so I can try to circumvent it.
Last note, awesome work!
Regards,
Bart
I am using PSAW as a wrapper and noticed that you can fetch the shard counts like this (see https://pypi.org/project/psaw/):
from psaw import PushshiftAPI
api = PushshiftAPI()
api.metadata_.get('shards')
That returned the following result:
{'failed': 0, 'skipped': 0, 'successful': 2, 'total': 4}
So I assume I am getting this warning because only 2 of the 4 shards are successful.
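For anyone who wants to test for this condition in their own script, here is a minimal sketch of the comparison. It assumes the shards metadata always carries the `successful` and `total` keys shown above. The helper name `shards_incomplete` is my own and is not part of PSAW.

```python
def shards_incomplete(shards):
    """Return True when fewer shards answered than exist in total.

    `shards` is assumed to be a dict shaped like the metadata shown above.
    This helper is hypothetical and not part of PSAW.
    """
    return shards["successful"] < shards["total"]


# The metadata reported above: only 2 of the 4 shards answered.
reported = {'failed': 0, 'skipped': 0, 'successful': 2, 'total': 4}
print(shards_incomplete(reported))
```

With PSAW, the dict passed in would presumably be whatever api.metadata_.get('shards') returns after a query.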
Is this something specific to me or is everyone facing this issue right now? Also, is this something that happens regularly?
After some more investigation I found the following comment by @pushshift:
If the successful shard count is less than the total shard count, what probably happened is that a node fell out of the cluster. This is usually always a temporary thing.
source: https://www.reddit.com/r/pushshift/comments/cqyq8t/update_all_indices_have_been_recoved/
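Since the comment says the condition is usually temporary, one possible workaround is to wait and re-check before running a real query. The following is only a sketch under my own assumptions: `wait_for_all_shards` and `fetch_shards` are hypothetical names, and `fetch_shards` stands for any function you supply that returns a fresh shards dict with `successful` and `total` keys.

```python
import time


def wait_for_all_shards(fetch_shards, attempts=5, delay=60.0, sleep=time.sleep):
    """Poll until every shard reports as successful, or give up.

    fetch_shards: caller-supplied function returning a dict with
                  'successful' and 'total' keys (hypothetical, not PSAW API).
    attempts:     maximum number of checks to perform.
    delay:        seconds to wait between checks.
    sleep:        injectable so the wait can be skipped in tests.
    Returns True if all shards became active, False otherwise.
    """
    for attempt in range(attempts):
        shards = fetch_shards()
        if shards["successful"] >= shards["total"]:
            return True
        # Only wait if another attempt is still to come.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

With PSAW, `fetch_shards` could be a small function that runs a tiny query and then returns api.metadata_.get('shards'). I am assuming the metadata is refreshed on each query, which I have not verified.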