twitter-scraper

JSONDecodeError + 400 error when using get_tweets()

Open theshoals opened this issue 4 years ago • 24 comments

Code used:

import twitter_scraper
for tweet in twitter_scraper.get_tweets('twitter', pages=1):
    print(tweet)

Traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/foobar/code/temp/twitter-scraper/twitter_scraper/modules/tweets.py", line 166, in get_tweets
    yield from gen_tweets(pages)
  File "/home/foobar/code/temp/twitter-scraper/twitter_scraper/modules/tweets.py", line 37, in gen_tweets
    html=r.json()["items_html"], url="bunk", default_encoding="utf-8"
  File "/home/foobar/code/temp/twitter-scraper/.venv/lib/python3.6/site-packages/requests/models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The response from Twitter:

url: https://twitter.com/i/profiles/show/twitter/timeline/tweets?include_available_features=1&include_entities=1&include_new_items_bar=true
status_code: 400
text: ''
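The traceback comes from calling r.json() on an empty 400 body. A minimal sketch of a guard that reports the status code instead of a bare JSONDecodeError could look like the following. The names `extract_items_html` and `TimelineResponseError` are hypothetical, not part of twitter-scraper, and the function takes a plain status code and text (standing in for requests' `Response.status_code` and `Response.text`) so it can be tried without a network call.

```python
import json


class TimelineResponseError(Exception):
    """Hypothetical error: the timeline endpoint did not return usable JSON."""


def extract_items_html(status_code, text):
    """Return the "items_html" field, or raise a descriptive error.

    status_code and text stand in for Response.status_code and Response.text.
    """
    # Check the HTTP status first, so a 400 with an empty body is reported as such.
    if status_code != 200:
        raise TimelineResponseError(
            f"Twitter returned HTTP {status_code}; body: {text[:200]!r}"
        )
    # A 200 response can still carry a non-JSON body (for example an HTML page).
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise TimelineResponseError(f"Response is not JSON: {text[:200]!r}") from exc
    # Make sure the key the scraper relies on is actually present.
    if not isinstance(payload, dict) or "items_html" not in payload:
        raise TimelineResponseError("JSON response has no 'items_html' key")
    return payload["items_html"]


# Reproduce the case reported above: status 400, empty body.
try:
    extract_items_html(400, "")
except TimelineResponseError as err:
    print(err)
```

This does not make the request succeed; it only turns the failure into a message that names the status code.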

Perhaps this is related to the redesign mentioned in https://github.com/bisguzar/twitter-scraper/issues/132?

theshoals avatar Jun 04 '20 00:06 theshoals

same issue, returns null html

brachna avatar Jun 04 '20 00:06 brachna

Same issue, returns 400

icmpnorequest avatar Jun 04 '20 02:06 icmpnorequest

Same Issue here

peddrinn avatar Jun 04 '20 04:06 peddrinn

Same issue here, with both get_trends() and get_tweets()

adulau avatar Jun 04 '20 06:06 adulau

https://twitter.com/i/search/timeline doesn't work either

brachna avatar Jun 04 '20 06:06 brachna

Twitter just updated something. We need to debug it thoroughly, but I don't have any time in the short term. Any information is welcome.

bisguzar avatar Jun 04 '20 08:06 bisguzar

My hunch is this is the same issue raised in #132. Would updating the "headers" called in tweets.py solve the issue?

EssbieWGT avatar Jun 04 '20 13:06 EssbieWGT

I don't think so; I just modified the headers a bit and nothing changed. As I said, I can't debug this issue in the short term because of my busy schedule. Please change the headers as you wish too, modify the source, and tell us what happens. @EssbieWGT

bisguzar avatar Jun 04 '20 13:06 bisguzar

Maybe it is because of this:

"It seems that Twitter has had it enough! The company is shutting down its original site legacy theme version on the 1st of June 2020, as reported by BleepingComputer. Twitter has issued a warning to all the users who have been using user-agent switching hacks and unsupported browsers to enable the legacy theme."

https://www.digitalinformationworld.com/2020/05/twitter-issues-warning-to-shut-the-site-s-legacy-theme-once-and-for-all-in-june-2020.html#:~:text=It%20seems%20that%20Twitter%20has,to%20enable%20the%20legacy%20theme.

GivenToFlyCoder avatar Jun 05 '20 00:06 GivenToFlyCoder

started working now

brachna avatar Jun 06 '20 00:06 brachna

doesn't work for me

d3athrow avatar Jun 06 '20 04:06 d3athrow

I'm so confused. I just tried version 0.4.1 and it seems to be working. I don't know how yet, and we need more information. It looks like Twitter is trying something new. That test was with get_tweets(), by the way. I didn't see any problems with profile or get_trends either.

bisguzar avatar Jun 07 '20 13:06 bisguzar

Spent some time trying to find the problem over the weekend, and couldn't nail it down. Ended up creating a new virtual environment for my script and now everything works fine.

EssbieWGT avatar Jun 08 '20 13:06 EssbieWGT

So weird, thanks for your efforts @EssbieWGT. I tried inside my old environment and got the same result: it works... I'm not going to close this issue for a while. We need to dig deeper.

bisguzar avatar Jun 08 '20 21:06 bisguzar

Is the problem back, or is it just me?

brachna avatar Aug 12 '20 01:08 brachna

Back to 400 bad request.

d3athrow avatar Aug 12 '20 06:08 d3athrow

Does not work

TheMulti0 avatar Aug 14 '20 08:08 TheMulti0

Doesn't work for me either

skywind0218 avatar Aug 14 '20 08:08 skywind0218

Kinda lost as to what can be done here.

Browsing through devtools in Firefox, only this request stands out, since it returns JSON with tweets: https://api.twitter.com/2/timeline/profile/25073877.json But I can't seem to use it inside a Python script; access is forbidden.

Another way is to use Pyppeteer with

text = await page.evaluate('''() => {
    return document.all[0].outerHTML;
}''')

But that would be HTML with (encrypted?) class names, which is a pain in the ass to sort out.
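One way around unreadable class names is to select elements by some other attribute that stays stable. The sketch below uses only the standard library's html.parser. It assumes, purely for illustration, that each tweet container carries `data-testid="tweet"`; that attribute, the class names in the sample, and the `AttributeTextExtractor` name are all assumptions to be checked against the real page in devtools.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class AttributeTextExtractor(HTMLParser):
    """Collect the text inside every element carrying a given attribute value."""

    def __init__(self, attr_name, attr_value):
        super().__init__()
        self.attr_name = attr_name
        self.attr_value = attr_value
        self.depth = 0     # > 0 while inside a matching element
        self.chunks = []   # text pieces of the element being read
        self.results = []  # finished text, one entry per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # nested element inside a match
        elif dict(attrs).get(self.attr_name) == self.attr_value:
            self.depth = 1   # entering a matching element

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Leaving the matching element: normalize whitespace and store the text.
            self.results.append(" ".join("".join(self.chunks).split()))
            self.chunks = []

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


# Made-up markup standing in for the HTML returned by page.evaluate().
sample = (
    '<div class="css-1dbjc4n" data-testid="tweet">'
    '<span class="r-1qd0xha">Hello <b>world</b></span><br>'
    '</div>'
    '<div class="css-1dbjc4n">unrelated</div>'
)
parser = AttributeTextExtractor("data-testid", "tweet")
parser.feed(sample)
print(parser.results)  # ['Hello world']
```

The class names are never inspected, so it does not matter how they are generated, as long as the chosen attribute is present.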

Any ideas?

brachna avatar Aug 16 '20 02:08 brachna

@brachna That link doesn't even work in browser for me

d3athrow avatar Aug 16 '20 02:08 d3athrow

It seems to be related to a change in the Twitter API (v2). As far as I can see, right now you cannot view tweets without logging in.

TheMulti0 avatar Aug 18 '20 08:08 TheMulti0

OK, so gallery-dl has a Twitter extractor that uses the Twitter API (v2). It does work. However, it is rate-limited. Also, one of the tweets I used for testing didn't have its media returned, even though it can be viewed in a browser.
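For the rate limit, a common workaround is to retry with an increasing delay. The following is a generic sketch, not gallery-dl's or Twitter's API: `RateLimited` and `fetch_with_backoff` are hypothetical names, and `fetch` is any callable that raises `RateLimited` when the server refuses the request (for instance on an HTTP 429 response).

```python
import time


class RateLimited(Exception):
    """Hypothetical error a fetch function raises when it is rate limited."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def fetch_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, waiting longer after each rate limit."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the error
            if exc.retry_after is not None:
                delay = exc.retry_after             # trust the server's hint
            else:
                delay = base_delay * 2 ** attempt   # 1s, 2s, 4s, ...
            sleep(delay)
```

The `sleep` parameter exists so the waiting can be replaced in tests; in normal use the default `time.sleep` is enough.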

brachna avatar Aug 19 '20 06:08 brachna

Confirming that this error is still an issue, same code and traceback as OP comment.

kdipippo avatar Oct 26 '20 03:10 kdipippo

Any updates about this issue?

LucasGobatto avatar Apr 19 '21 01:04 LucasGobatto