twitter-scraper-selenium Not scraping every tweet from a user

Open wjd157 opened this issue 1 year ago • 6 comments

Hello, I am trying to scrape every tweet from a user. Their Twitter page shows that they have tweeted more than 5000 times. However, even when I set tweets_count to 5000, I get fewer than 1000 tweets from that user.

My code is below:

scrape_profile(twitter_username = "elonmusk", output_format ="csv", tweets_count = 6000, browser = "chrome", filename = "elonmusk")

(Note that @elonmusk is just a stand-in example)

Nov 28 '22 15:11 wjd157

Hey @wjd157, that method uses browser automation for scraping, and because your tweet count is large, it is probably getting blocked partway through. I suggest using the scrape_keyword_with_api() method instead. Try the code below, then check elon.json after scraping; it should contain the data you want.

from twitter_scraper_selenium import scrape_keyword_with_api

scrape_keyword_with_api('from:elonmusk', output_filename='elon')

Dec 03 '22 04:12 shaikhsajid1111

This appears to generate a JSON file with no data in it. Further, the console tells me I have scraped only 24 tweets, even though the account I am now trying has more than 200 tweets.

Dec 13 '22 19:12 wjd157

Okay, I think this Twitter feature only returns a few tweets. Currently, I have not added a feature to scrape a Twitter account through Twitter's API, and the browser-automation method gets blocked. I will add a new feature to scrape Twitter profiles through the API in a couple of weeks.

Dec 14 '22 02:12 shaikhsajid1111

I am also very much looking forward to this feature. Please let us know once you have had time to implement it. Thanks a lot.

Dec 24 '22 19:12 christianmettri

Hi @christianmettri @wjd157, just an update, in case you're still looking for a solution. You can now try:

from twitter_scraper_selenium import scrape_profile_with_api

scrape_profile_with_api('elonmusk', output_filename='musk', tweets_count= 100)

and check the musk.json file, where the output will be saved.
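
To confirm how many tweets actually ended up in the output file, a small check along these lines may help. It is only a sketch: it assumes the output is a single JSON object keyed by tweet ID, and count_tweets is a hypothetical helper, not part of twitter_scraper_selenium.

```python
import json
from pathlib import Path


def count_tweets(path):
    """Return the number of top-level records in a scraper output file.

    Assumption: the file holds one JSON object keyed by tweet ID.
    Returns 0 when the file is missing or empty.
    """
    file = Path(path)
    if not file.is_file() or file.stat().st_size == 0:
        return 0
    with file.open(encoding="utf-8") as f:
        data = json.load(f)
    return len(data)


if __name__ == "__main__":
    # "musk.json" matches the output_filename used in the call above.
    print(count_tweets("musk.json"))
```

If the printed number is well below tweets_count, the scrape was cut short rather than the file being read incorrectly.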

Dec 31 '22 06:12 shaikhsajid1111

Hello @shaikhsajid1111, I tried this code and it gives me this error:

2023-02-28 02:33:09,836 - WARNING - Failed to make request!

The code:

from twitter_scraper_selenium import scrape_profile_with_api
import json

scrape_profile_with_api(username="NASA", output_filename="NASA", browser="firefox",tweets_count=50, output_dir="C:/Users/Braulio/Desktop/web scraping python")


with open('NASA.json') as f:
    NASA = json.load(f)


with open('NASAimages.html', 'w') as f:
    f.write('<html>\n')
    f.write('<head>\n')
    f.write('<title>Imágenes</title>\n')
    f.write('</head>\n')
    f.write('<body>\n')
    for tweet_id, tweet_data in NASA.items():
        if tweet_data['username'] == 'NASA':
            for imagen in tweet_data['images']:
                f.write('<img src="{}" format=jpg&name=medium" alt="">\n'.format(imagen))
    f.write('</body>\n')
    f.write('</html>\n')

print("HTML READY")

I also tried the scrape_keyword_with_api function. Here is the code:


from twitter_scraper_selenium import scrape_keyword_with_api
import json

scrape_keyword_with_api(query="from:NASA", output_filename="NASA", tweets_count=50, output_dir="C:/Users/Braulio/Desktop/web scraping python")


with open('NASA.json') as f:
    NASA = json.load(f)


with open('imagenes.html', 'w') as f:
    f.write('<html>\n')
    f.write('<head>\n')
    f.write('<title>Imágenes</title>\n')
    f.write('</head>\n')
    f.write('<body>\n')
    for tweet_id, tweet_data in NASA.items():
        if tweet_data['username'] == 'NASA':
            for imagen in tweet_data['images']:
                f.write('<img src="{}" format=jpg&name=medium" alt="">\n'.format(imagen))
    f.write('</body>\n')
    f.write('</html>\n')

print("HTML READY")

It shows this error:

2023-02-28 02:37:18,021 - twitter_scraper_selenium.keyword_api - WARNING - Failed to make request!
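
Separately from the failed request, both scripts above read NASA.json from the current working directory while passing a different output_dir to the scraper, and they assume the file exists even when the scrape fails. The sketch below guards against both. It does not address the cause of the warning itself. It assumes the scraper writes <output_filename>.json into output_dir and that each record has the "username" and "images" keys used in the code above; load_scraped and build_gallery are hypothetical helpers, not library functions.

```python
import html
import json
from pathlib import Path


def load_scraped(output_dir, name):
    """Load <output_dir>/<name>.json, or return {} if nothing was written."""
    path = Path(output_dir) / f"{name}.json"
    if not path.is_file() or path.stat().st_size == 0:
        return {}
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def build_gallery(tweets, username):
    """Return an HTML page with one <img> per image tweeted by `username`."""
    lines = ["<html>", "<head>", "<title>Images</title>", "</head>", "<body>"]
    for tweet in tweets.values():
        if tweet.get("username") != username:
            continue
        # "images" may be missing or None for tweets without media.
        for url in tweet.get("images") or []:
            lines.append('<img src="{}" alt="">'.format(html.escape(url, quote=True)))
    lines += ["</body>", "</html>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    tweets = load_scraped("C:/Users/Braulio/Desktop/web scraping python", "NASA")
    if tweets:
        Path("NASAimages.html").write_text(build_gallery(tweets, "NASA"), encoding="utf-8")
        print("HTML READY")
    else:
        print("No tweets were saved; the request probably failed.")
```

With this structure, a failed scrape produces a clear message instead of a FileNotFoundError or an empty page.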

Feb 28 '23 08:02 SenninOne
