twitter-scraper JSONDecodeError + 400 error when using get_tweets()

Code used:

import twitter_scraper
for tweet in twitter_scraper.get_tweets('twitter', pages=1):
    print(tweet)

Traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/foobar/code/temp/twitter-scraper/twitter_scraper/modules/tweets.py", line 166, in get_tweets
    yield from gen_tweets(pages)
  File "/home/foobar/code/temp/twitter-scraper/twitter_scraper/modules/tweets.py", line 37, in gen_tweets
    html=r.json()["items_html"], url="bunk", default_encoding="utf-8"
  File "/home/foobar/code/temp/twitter-scraper/.venv/lib/python3.6/site-packages/requests/models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The response from Twitter:

url: https://twitter.com/i/profiles/show/twitter/timeline/tweets?include_available_features=1&include_entities=1&include_new_items_bar=true
status_code: 400
text: ''

Perhaps this is related to the redesign mentioned in https://github.com/bisguzar/twitter-scraper/issues/132?
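
For anyone debugging this: the JSONDecodeError itself is only a symptom, since r.json() is being called on the empty body of the 400 response. Below is a minimal sketch of checking the response before decoding it, so the failure reports the status code instead of a bare decode error. The helper name extract_items_html and the TimelineError class are hypothetical and not part of twitter-scraper; status_code and text stand in for r.status_code and r.text.

```python
import json


class TimelineError(Exception):
    """Raised when the timeline endpoint does not return usable JSON."""


def extract_items_html(status_code, text):
    """Return the "items_html" field, or raise TimelineError with context.

    Hypothetical helper: status_code and text stand in for the
    status_code and text attributes of a requests.Response.
    """
    # A non-200 status (such as the 400 above) means there is nothing to parse.
    if status_code != 200:
        raise TimelineError(f"unexpected status {status_code}, body={text!r}")
    # An empty body is what makes json.loads fail with "Expecting value".
    if not text.strip():
        raise TimelineError("empty response body, nothing to decode")
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise TimelineError(f"response is not JSON: {exc}") from exc
    if not isinstance(payload, dict) or "items_html" not in payload:
        raise TimelineError("JSON response has no 'items_html' key")
    return payload["items_html"]


# Reproduces the situation reported above: status 400 with an empty body.
try:
    extract_items_html(400, "")
except TimelineError as err:
    print(err)  # unexpected status 400, body=''
```

This does not make the request succeed; it only turns the failure into a message that shows what Twitter actually returned.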

Jun 04 '20 00:06 theshoals

same issue, returns null html

Jun 04 '20 00:06 brachna

Same issue, returns 400

Jun 04 '20 02:06 icmpnorequest

Same Issue here

Jun 04 '20 04:06 peddrinn

Same issue here, with both get_trends() and get_tweets().

Jun 04 '20 06:06 adulau

https://twitter.com/i/search/timeline doesn't work either

Jun 04 '20 06:06 brachna

Twitter just updated something. We need to debug it thoroughly, but I don't have any time in the short term. All information is welcome.

Jun 04 '20 08:06 bisguzar

My hunch is this is the same issue raised in #132. Would updating the "headers" called in tweets.py solve the issue?

Jun 04 '20 13:06 EssbieWGT

I don't think so. I just modified the headers a bit, but nothing changed. As I said, I'm not able to debug this issue in the short term because of my busy schedule. Please feel free to change the headers as you wish, modify the source, and tell us what happened. @EssbieWGT

Jun 04 '20 13:06 bisguzar

Maybe it's because of this:

"It seems that Twitter has had it enough! The company is shutting down its original site legacy theme version on the 1st of June 2020, as reported by BleepingComputer. Twitter has issued a warning to all the users who have been using user-agent switching hacks and unsupported browsers to enable the legacy theme."

https://www.digitalinformationworld.com/2020/05/twitter-issues-warning-to-shut-the-site-s-legacy-theme-once-and-for-all-in-june-2020.html#:~:text=It%20seems%20that%20Twitter%20has,to%20enable%20the%20legacy%20theme.

Jun 05 '20 00:06 GivenToFlyCoder

started working now

Jun 06 '20 00:06 brachna

doesn't work for me

Jun 06 '20 04:06 d3athrow

I'm so confused. I just tried with version 0.4.1 and it seems to be working. I don't know how yet, and I need more information. It looks like Twitter is trying something new. I only tried get_tweets(), by the way. I didn't see any problem with profile and get_trends.

Jun 07 '20 13:06 bisguzar

Spent some time trying to find the problem over the weekend, and couldn't nail it down. Ended up creating a new virtual environment for my script and now everything works fine.

Jun 08 '20 13:06 EssbieWGT

So weird. Thanks for your efforts, @EssbieWGT. I tried inside my old environment and got the same result: it's working... I'm not going to close this issue for a while. We need to dig deeper.

Jun 08 '20 21:06 bisguzar

Is the problem back, or is it just me?

Aug 12 '20 01:08 brachna

Back to 400 bad request.

Aug 12 '20 06:08 d3athrow

Does not work

Aug 14 '20 08:08 TheMulti0

Doesn't work for me either.

Aug 14 '20 08:08 skywind0218

Kinda lost as to what can be done here.

Browsing through devtools in Firefox, only this request stands out, since it returns JSON with tweets: https://api.twitter.com/2/timeline/profile/25073877.json But I can't seem to use it inside a Python script; access is forbidden.

Another way is to use Pyppeteer with

text = await page.evaluate('''() => {
    return document.all[0].outerHTML;
}''')

But that would be HTML with (encrypted?) class names, which is a pain in the ass to sort out.

Any ideas?

Aug 16 '20 02:08 brachna

@brachna That link doesn't even work in the browser for me.

Aug 16 '20 02:08 d3athrow

Seems like it has to do with a change in the Twitter API (v2). As far as I can see, right now you cannot view tweets without logging in.

Aug 18 '20 08:08 TheMulti0

OK, so gallery-dl has a Twitter extractor that uses the Twitter API (v2). It does work. However, it is rate-limited. Also, one of the tweets I used for testing didn't have its media returned, even though it can be viewed in a browser.

Aug 19 '20 06:08 brachna

Confirming that this error is still an issue, with the same code and traceback as in the OP's comment.

Oct 26 '20 03:10 kdipippo

Any updates about this issue?

Apr 19 '21 01:04 LucasGobatto
