
Error INFO USER-AGENT

TrungKien1230 opened this issue on Nov 06 '19 · 4 comments

Hello @taspinar, when I run this code:

import datetime
import pandas as pd
from twitterscraper import query_tweets

list_of_tweets = query_tweets('HRTechConf', begindate=datetime.date(2019, 9, 26), enddate=datetime.date(2019, 10, 6), lang='en')

tweets_df = pd.DataFrame([vars(x) for x in list_of_tweets])

I get this result. Could you help me, please?

INFO: {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; x64; fr; rv:1.9.2.13) Gecko/20101203 Firebird/3.6.13'}

Traceback (most recent call last):
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\contrib\pyopenssl.py", line 456, in wrap_socket
    cnx.do_handshake()
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\OpenSSL\SSL.py", line 1915, in do_handshake
    self._raise_ssl_error(self._ssl, result)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\OpenSSL\SSL.py", line 1647, in _raise_ssl_error
    _raise_current_error()
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\OpenSSL\_util.py", line 54, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'tlsv1 alert access denied')]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 839, in _validate_conn
    conn.connect()
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connection.py", line 344, in connect
    ssl_context=context)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\util\ssl_.py", line 345, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\contrib\pyopenssl.py", line 462, in wrap_socket
    raise ssl.SSLError('bad handshake: %r' % e)
ssl.SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'tlsv1 alert access denied')])",)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\adapters.py", line 449, in send
    timeout=timeout
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\urllib3\util\retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='free-proxy-list.net', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'tlsv1 alert access denied')])")))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/nguy/PycharmProjects/Streaming_Tweets_Data/twitterScraper.py", line 5, in <module>
    from twitterscraper import query_tweets
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\twitterscraper\__init__.py", line 13, in <module>
    from twitterscraper.query import query_tweets
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\twitterscraper\query.py", line 72, in <module>
    proxies = get_proxies()
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\twitterscraper\query.py", line 42, in get_proxies
    response = requests.get(PROXY_URL)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\nguy\AppData\Local\Continuum\anaconda3\lib\site-packages\requests\adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='free-proxy-list.net', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'tlsv1 alert access denied')])")))
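The final traceback shows that the failure happens at import time: twitterscraper's get_proxies() fetches a proxy list from free-proxy-list.net before any tweets are queried, and it is that request whose TLS handshake is rejected. A minimal sketch to confirm this independently of twitterscraper (the URL below is assumed from the host shown in the traceback):

# Minimal reproduction sketch: if this single request fails with the same
# "bad handshake" SSLError, the problem is fetching the proxy list from
# free-proxy-list.net, not the tweet query itself.
import requests

PROXY_URL = 'https://free-proxy-list.net/'  # host taken from the traceback

try:
    response = requests.get(PROXY_URL, timeout=10)
    print(response.status_code)
except requests.exceptions.SSLError as err:
    print('Same handshake failure as in the traceback:', err)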

TrungKien1230 commented on Nov 06 '19

Hi @TrungKien1230, have you solved this issue? I'm having the same problem. I first thought it was caused by trying to scrape too many tweets in one go, so I waited a few days and tried again, but the error persists. It would be nice if you could share the solution. Thanks!

Makoto1021 commented on Jan 04 '20

@Makoto1021 @TrungKien1230 Have you been able to find a way to solve this? It was working last week with no problem, but now I get the same error.

avanibhatnagar commented on Feb 17 '20

Same issue here.

JoeCarlPSU commented on Mar 02 '20

As you can read in the Twitter API policy, you have two ways to retrieve tweets: searching the past (at most 7 days back) and streaming (tweets as they are posted right now), and with search you cannot retrieve more than a fixed number of tweets per unit of time. With streaming you can take as much as you want. This problem occurs because you have already taken too much.

You can read my code below; it retrieves tweets from the past:

First, you need to get ACCESS_TOKEN, ACCESS_TOKEN_SECRET, CONSUMER_KEY, and CONSUMER_SECRET from Twitter and put them in twitter_credentials.py to match my code.
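A minimal twitter_credentials.py would look like this (the values are placeholders; substitute the keys from your own Twitter developer account):

# twitter_credentials.py -- placeholders only; replace with your own keys
CONSUMER_KEY = 'your-consumer-key'
CONSUMER_SECRET = 'your-consumer-secret'
ACCESS_TOKEN = 'your-access-token'
ACCESS_TOKEN_SECRET = 'your-access-token-secret'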

import tweepy
import twitter_credentials
import json
import csv
import datetime

class rest_api:
    """Collects tweets from the past (7 days max)."""

    def __init__(self):
        # Authenticate to Twitter
        self.auth = tweepy.OAuthHandler(twitter_credentials.CONSUMER_KEY,
                                        twitter_credentials.CONSUMER_SECRET)
        self.auth.set_access_token(twitter_credentials.ACCESS_TOKEN,
                                   twitter_credentials.ACCESS_TOKEN_SECRET)
        # Create API object
        self.api = tweepy.API(self.auth, wait_on_rate_limit=True,
                              wait_on_rate_limit_notify=True)

    def get_tweets(self, file_name, query, lang, date_time, max_tweets):
        for term in query:
            # Search each query term up to `date_time`, capped at
            # `max_tweets` results per term
            for tweet in tweepy.Cursor(self.api.search, q=term, lang=lang,
                                       until=date_time).items(max_tweets):
                print(f'{tweet}')
                # Append the raw tweet as one JSON object per line
                with open(file_name, 'a') as tf:
                    tf.write(json.dumps(tweet._json) + '\n')

q = ["renault twingo3", "renault twingo 3", "renault twingo III", "renault twingoIII", "renault clio5", "renault clio 5", "renault clio V" , "renault clioV", "renault arkana", "renault zoe2", "renault zoe 2", "re nault zoeII", "renault zoe II", "renault zoé2", "renault zoé 2", "renault zoé II", "renault zoéII", "renault captur2", "renault captur 2", "renault captur II", "renault capturII"]

rest_api = rest_api()
rest_api.get_tweets(file_name='rest_api_csv.csv', query=q, lang='',
                    date_time=datetime.date.today(), max_tweets=500)
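Note that the standard search API only reaches back about 7 days, and the until parameter returns tweets created before (not on) the given date, so passing datetime.date.today() collects tweets up through yesterday; wait_on_rate_limit=True makes Tweepy sleep through rate-limit windows instead of raising an error.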

TrungKien1230 commented on Mar 04 '20