twitter-scraper
JSONDecodeError + 400 error when using get_tweets()
Code used:
import twitter_scraper
for tweet in twitter_scraper.get_tweets('twitter', pages=1):
    print(tweet)
Traceback:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/foobar/code/temp/twitter-scraper/twitter_scraper/modules/tweets.py", line 166, in get_tweets
    yield from gen_tweets(pages)
  File "/home/foobar/code/temp/twitter-scraper/twitter_scraper/modules/tweets.py", line 37, in gen_tweets
    html=r.json()["items_html"], url="bunk", default_encoding="utf-8"
  File "/home/foobar/code/temp/twitter-scraper/.venv/lib/python3.6/site-packages/requests/models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
The response from Twitter:
url: https://twitter.com/i/profiles/show/twitter/timeline/tweets?include_available_features=1&include_entities=1&include_new_items_bar=true
status_code: 400
text: ''
Perhaps this is related to the redesign mentioned in https://github.com/bisguzar/twitter-scraper/issues/132?
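For reference, the JSONDecodeError follows directly from the response shown above: the body is empty, so r.json() has nothing to parse. Below is a minimal sketch of a guard that checks the status code before decoding, so the failure reports the 400 rather than a decoder error. This is not the library's code; parse_timeline and TimelineError are hypothetical names used only for illustration, and the only thing taken from the traceback is the "items_html" key.

```python
import json


class TimelineError(Exception):
    """Raised when the timeline response cannot be used (hypothetical name)."""


def parse_timeline(status_code, text):
    """Return the "items_html" field, or raise TimelineError with a clear reason.

    Hypothetical helper: takes the status code and body of a response
    instead of calling r.json() directly.
    """
    # Check the HTTP status first; a 400 with an empty body is not JSON at all.
    if status_code != 200:
        raise TimelineError(
            f"Twitter returned HTTP {status_code}; body was {text!r}"
        )
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise TimelineError(f"Response was not valid JSON: {exc}") from exc
    # The traceback shows the code indexing "items_html", so verify it exists.
    if not isinstance(payload, dict) or "items_html" not in payload:
        raise TimelineError('Response JSON has no "items_html" field')
    return payload["items_html"]
```

With the response from this report, parse_timeline(400, '') would raise TimelineError mentioning the 400. That does not fix the underlying problem, but it makes the cause visible.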
same issue, returns null html
Same issue, returns 400
Same Issue here
Same issue here too, with both get_trends() and get_tweets()
https://twitter.com/i/search/timeline doesn't work either
Twitter just updated something. We need to debug it thoroughly, but I don't have any time in the short term. Any information is welcome.
My hunch is this is the same issue raised in #132. Would updating the "headers" called in tweets.py solve the issue?
I don't think so. I just modified the headers a bit and nothing changed. As I said, I can't debug this issue in the short term because of my busy schedule. Please feel free to change the headers yourself, modify the source, and tell us what happens. @EssbieWGT
Maybe it's because of this:
"It seems that Twitter has had it enough! The company is shutting down its original site legacy theme version on the 1st of June 2020, as reported by BleepingComputer. Twitter has issued a warning to all the users who have been using user-agent switching hacks and unsupported browsers to enable the legacy theme."
https://www.digitalinformationworld.com/2020/05/twitter-issues-warning-to-shut-the-site-s-legacy-theme-once-and-for-all-in-june-2020.html#:~:text=It%20seems%20that%20Twitter%20has,to%20enable%20the%20legacy%20theme.
started working now
doesn't work for me
I'm so confused. I just tried version 0.4.1 and it seems to be working. I don't know how yet, and we need more information. It looks like Twitter is trying something new. That test was with get_tweets(), by the way. I didn't see any problems with profile or get_trends either.
Spent some time trying to find the problem over the weekend, and couldn't nail it down. Ended up creating a new virtual environment for my script and now everything works fine.
So weird, thanks for your efforts @EssbieWGT. I tried inside my old environment and got the same result: it works... I'm not going to close this issue for a while. We need to dig deeper.
Is the problem back, or is it just me?
Back to 400 bad request.
Does not work
Not working for me either
Kinda lost at what can be done here.
Browsing through devtools in Firefox, only this request stands out, since it returns JSON with tweets: https://api.twitter.com/2/timeline/profile/25073877.json But I can't seem to use it from a Python script; access is forbidden.
Another way is to use Pyppeteer with:
text = await page.evaluate('''() => {
return document.all[0].outerHTML;
}''')
But that would return HTML with (encrypted?) class names, which is a pain in the ass to sort out.
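One possible way around the unreadable class names is to select elements by some other attribute. The sketch below uses only the standard library's html.parser and collects the text of every element whose attribute matches a given value. The attribute name and value in the usage note ("data-testid", "tweetText") are assumptions for illustration, not something confirmed in this thread, so check the rendered HTML in devtools first. AttrTextExtractor and extract_text are hypothetical names.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class AttrTextExtractor(HTMLParser):
    """Collect the text of every element whose attribute `name` equals `value`."""

    def __init__(self, name, value):
        super().__init__()
        self.name = name
        self.value = value
        self.depth = 0      # > 0 while inside a matching element
        self.chunks = []    # text pieces of the current match
        self.results = []   # text of finished matches

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            # Already inside a match: track nesting so we know where it ends.
            self.depth += 1
        elif dict(attrs).get(self.name) == self.value:
            self.depth = 1
            self.chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self.chunks).strip())

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_text(html, name, value):
    """Return a list with the text of each element where attribute name == value."""
    parser = AttrTextExtractor(name, value)
    parser.feed(html)
    parser.close()
    return parser.results
```

Here html would be the string returned by page.evaluate() above, for example extract_text(text, "data-testid", "tweetText"). This only addresses the parsing step; it does not help if the page itself refuses to load without a login.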
Any ideas?
@brachna That link doesn't even work in the browser for me
It seems to be related to a change in the Twitter API (v2). As far as I can see, right now you cannot view tweets without logging in.
OK, so gallery-dl has a Twitter extractor that uses the Twitter API (v2). It does work. However, it is rate-limited. Also, one of the tweets I used for testing didn't have its media returned, even though it can be viewed in a browser.
Confirming that this error is still an issue, same code and traceback as OP comment.
Any updates about this issue?