Problem retrieving all replies of a specific Tweet

Open lendikuku opened this issue 5 years ago • 4 comments

There is a problem in the replies module of the nasty library: I cannot get all the replies to a certain tweet. Could you modify the library so that it returns all the replies?

import json

import nasty

TWEET_ID = "1229250933525270528"
username = "Imrankhanpti"
all_tweets = []

# Request up to 10000 replies to the tweet, fetched in batches of 9999.
tweet_stream = nasty.Replies(TWEET_ID, max_tweets=10000, batch_size=9999).request()
try:
    for counter, tweet in enumerate(tweet_stream, start=1):
        print(tweet.id, tweet.text)
        all_tweets.append({"user": tweet.user.name, "text": tweet.text})
        print(counter)
except Exception as exc:
    # Report the error instead of silently swallowing it, then save what was collected.
    print("Stopped early:", exc)

filename = username + "_twitter.json"
print(all_tweets)
print("\nDumping data in file " + filename)
with open(filename, "w", encoding="utf-8") as fh:
    json.dump(all_tweets, fh, ensure_ascii=False)

lendikuku, Feb 22 '20 14:02

I can replicate this issue on the command line with

nasty replies --tweet-id 1229250933525270528 --max-tweets -1 --batch-size 100 --log-level DEBUG > tweets.jsonl
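
To see how many replies a run like this actually wrote, the output file can be counted. This is a minimal sketch, assuming that `tweets.jsonl` holds one JSON object per line (as the `.jsonl` extension suggests) and that log messages do not end up in the file; `count_jsonl_records` is a hypothetical helper, not part of nasty.

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Count the non-empty lines of a JSON Lines file.

    Hypothetical helper, not part of nasty. Each line is parsed so that a
    malformed record raises an error instead of being counted.
    """
    count = 0
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json.loads(line)  # raises ValueError on a malformed record
            count += 1
    return count
```

Calling `count_jsonl_records("tweets.jsonl")` after the command finishes gives the number of retrieved replies, which can then be compared with the reply count shown on the tweet itself.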

Not sure when I'll have time to look into what's causing this and fix it though, sorry!

lschmelzeisen, Feb 23 '20 18:02

If I increase the batch size, the reply count increases.

lendikuku, Feb 23 '20 19:02

Yes, I already noticed that. However, even a larger batch size does not return all replies (there should currently be around 2200), so I still consider this a bug.

Additionally, as I discussed in the documentation of the batch size parameter, in previous experiments I found out that setting this to 100 was best for performance.
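
To quantify how the reply count changes with the batch size, the same request could be run at several settings and the counts compared. This is a minimal sketch: `count_by_batch_size` and its `fetch` argument are hypothetical names for illustration and are not part of nasty. The commented wiring reuses the `nasty.Replies(...).request()` call from the first post and needs network access.

```python
def count_by_batch_size(fetch, batch_sizes):
    """Return {batch_size: number of items yielded by fetch(batch_size)}.

    `fetch` is a hypothetical caller-supplied callable that takes a batch
    size and returns an iterable of tweets.
    """
    return {size: sum(1 for _ in fetch(size)) for size in batch_sizes}


# Example wiring, based on the call shown in the first post:
#
# import nasty
# counts = count_by_batch_size(
#     lambda size: nasty.Replies(
#         "1229250933525270528", max_tweets=10000, batch_size=size
#     ).request(),
#     [100, 1000, 9999],
# )
# print(counts)
```

Keeping the request behind a callable means the comparison logic can be checked offline with a stand-in function before spending time on real requests.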

lschmelzeisen, Feb 23 '20 20:02

Hello, I am unfortunately having the same issue. For certain Tweet IDs I cannot retrieve all replies, only a significantly smaller number. I have already tried raising the maximum number of Tweets and the batch size (max_tweets=10000, batch_size=9999), which slightly increased the number of replies I can retrieve, but it still does not return all of them. Is there a way to work around this bug yet?

Example Tweet IDs:
Tweet-ID 1325767629890592771 retrieves 196 replies instead of 1657.
Tweet-ID 1329032586421805056 retrieves 204 replies instead of 572.
Tweet-ID 1302360844882391041 retrieves 188 replies instead of 538.
Tweet-ID 1326574496501944321 retrieves 202 replies instead of 1247.
Tweet-ID 1245138825480941573 retrieves 185 replies instead of 2165.
Tweet-ID 1308132903830925313 retrieves 187 replies instead of 7537.
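
The figures above can be turned into a retrieval rate per tweet. This is a minimal sketch using only the numbers reported in this comment; `coverage` is a hypothetical helper name.

```python
# (retrieved, expected) reply counts per Tweet ID, as reported above.
reported = {
    "1325767629890592771": (196, 1657),
    "1329032586421805056": (204, 572),
    "1302360844882391041": (188, 538),
    "1326574496501944321": (202, 1247),
    "1245138825480941573": (185, 2165),
    "1308132903830925313": (187, 7537),
}


def coverage(retrieved, expected):
    """Fraction of the expected replies that was actually retrieved."""
    return retrieved / expected


for tweet_id, (retrieved, expected) in reported.items():
    rate = coverage(retrieved, expected)
    print(f"{tweet_id}: {retrieved}/{expected} = {rate:.1%}")
```

The retrieved counts all lie between 185 and 204 even though the expected totals range from 538 to 7537, which may point to a roughly fixed cap rather than a proportional loss, although these six examples alone do not confirm that.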

I would appreciate any helpful tips. Thank you in advance.

Rebecca23A, Dec 05 '20 09:12