snscrape
`KeyError: 'timeline'` crash on `twitter-profile` scraper
Describe the bug
When I run:
/bin/snscrape --jsonl --max-results 10 twitter-profile someone
it throws this error:
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/snscrape", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/.local/lib/python3.10/site-packages/snscrape/_cli.py", line 323, in main
    for i, item in enumerate(scraper.get_items(), start = 1):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/snscrape/modules/twitter.py", line 1830, in get_items
    instructions = obj['data']['user']['result']['timeline_v2']['timeline']['instructions']
KeyError: 'timeline'
How to reproduce
/bin/snscrape --jsonl --max-results 10 twitter-profile someone
Expected behaviour
It was working until Thu Jun 29 21:35:07 UTC 2023.
Screenshots and recordings
No response
Operating system
Ubuntu 16
Python version: output of python3 --version
Python 3.10.6
snscrape version: output of snscrape --version
snscrape 0.6.2.20230321.dev32+gb76f485
Scraper
twitter-profile
How are you using snscrape?
CLI (snscrape ... as a command, e.g. in a terminal)
Backtrace
No response
Log output
No response
Dump of locals
No response
Additional context
No response
I have this error too
  File "/home/.venv/lib/python3.10/site-packages/snscrape/modules/twitter.py", line 1902, in get_items
    instructions = obj['data']['user']['result']['timeline_v2']['timeline']['instructions']
KeyError: 'timeline'
version: v0.7.0.20230622
I'm seeing the same thing on the web interface: the 'Replies' tab on profile pages is empty. This looks like a bug on Twitter's side.
It looks like the UserTweets API call works and returns data, but UserTweetsAndReplies is broken on Twitter's end: the call returns 200 but has no data in timeline_v2.
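To illustrate why an HTTP 200 response still ends in `KeyError: 'timeline'`, here is a minimal sketch. It assumes the broken response carries an empty `timeline_v2` object (the exact payload is not shown in this thread) and uses the key path from the traceback. `dig` is a hypothetical helper for illustration, not part of snscrape.

```python
from typing import Any, Sequence

_MISSING = object()

def dig(obj: Any, path: Sequence[str], default: Any = None) -> Any:
    """Walk nested dicts along `path`; return `default` if any key is absent."""
    for key in path:
        if not isinstance(obj, dict):
            return default
        obj = obj.get(key, _MISSING)
        if obj is _MISSING:
            return default
    return obj

# Key path taken from the traceback in this issue.
PATH = ['data', 'user', 'result', 'timeline_v2', 'timeline', 'instructions']

# ASSUMPTION: shape of the broken UserTweetsAndReplies response (empty timeline_v2).
broken = {'data': {'user': {'result': {'timeline_v2': {}}}}}
# ASSUMPTION: shape of a working response, reduced to the keys on the path.
working = {'data': {'user': {'result': {'timeline_v2': {'timeline': {'instructions': []}}}}}}

# Plain chained indexing fails on the first missing key.
error_key = None
try:
    broken['data']['user']['result']['timeline_v2']['timeline']['instructions']
except KeyError as e:
    error_key = e.args[0]

# The tolerant lookup returns the default instead of raising.
broken_instructions = dig(broken, PATH)
working_instructions = dig(working, PATH)
```

A lookup like this only turns the crash into an empty result; it cannot recover replies that the endpoint does not return.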
I am experiencing the timeline KeyError as well. @0bmay, would you mind sharing how to implement the UserTweets API call? I am having difficulty figuring that out.
In TwitterProfileScraper I added a second method, get_items2, alongside get_items, and I use that to get the profile tweets. No replies, but something is better than nothing. Most of the code is the same as get_items; I just changed the features and variables and added the field toggles (fieldToggles) that the frontend of the site sends with these calls.
def get_items2(self):
    if not self._isUserId:
        if self.entity is None:
            raise snscrape.base.ScraperException(f'Could not resolve username {self._user!r} to ID')
        userId = self.entity.id
    else:
        userId = self._user
    paginationVariables = {
        'userId': userId,
        'count': 100,
        'cursor': None,
        'includePromotedContent': True,
        'withQuickPromoteEligibilityTweetFields': True,
        'withVoice': True,
        'withV2Timeline': True,
    }
    variables = paginationVariables.copy()
    del variables['cursor']
    features = {
        'rweb_lists_timeline_redesign_enabled': False,
        'responsive_web_graphql_exclude_directive_enabled': True,
        'verified_phone_label_enabled': False,
        'creator_subscriptions_tweet_preview_api_enabled': False,
        'responsive_web_graphql_timeline_navigation_enabled': True,
        'responsive_web_graphql_skip_user_profile_image_extensions_enabled': False,
        'tweetypie_unmention_optimization_enabled': True,
        'responsive_web_edit_tweet_api_enabled': True,
        'graphql_is_translatable_rweb_tweet_is_translatable_enabled': True,
        'view_counts_everywhere_api_enabled': True,
        'longform_notetweets_consumption_enabled': True,
        'responsive_web_twitter_article_tweet_consumption_enabled': False,
        'tweet_awards_web_tipping_enabled': False,
        'freedom_of_speech_not_reach_fetch_enabled': True,
        'standardized_nudges_misinfo': True,
        'tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled': False,
        'longform_notetweets_rich_text_read_enabled': True,
        'longform_notetweets_inline_media_enabled': False,
        'responsive_web_enhance_cards_enabled': False
    }
    field_toggles = {"withArticleRichContentState": False}
    params = {'variables': variables, 'features': features}
    paginationParams = {'variables': paginationVariables, 'features': features, 'fieldToggles': field_toggles}

    gotPinned = False
    previousPagesTweetIds = set()
    for obj in self._iter_api_data('https://twitter.com/i/api/graphql/sPOiMsDrOtmxC00E01DkTA/UserTweets', _TwitterAPIType.GRAPHQL, params, paginationParams, instructionsPath = ['data', 'user', 'result', 'timeline_v2', 'timeline', 'instructions']):
        if not obj['data'] or 'result' not in obj['data']['user']:
            raise snscrape.base.ScraperException('Empty response')
        if obj['data']['user']['result']['__typename'] == 'UserUnavailable':
            raise snscrape.base.EntityUnavailable('User unavailable')
        instructions = obj['data']['user']['result']['timeline_v2']['timeline']['instructions']
        if not gotPinned:
            for instruction in instructions:
                if instruction['type'] == 'TimelinePinEntry':
                    gotPinned = True
                    tweetId = int(instruction['entry']['entryId'][6:]) if instruction['entry']['entryId'].startswith('tweet-') else None
                    yield self._graphql_timeline_tweet_item_result_to_tweet(instruction['entry']['content']['itemContent']['tweet_results']['result'], tweetId = tweetId, pinned = True)
        tweets = list(self._graphql_timeline_instructions_to_tweets(instructions, pinned = False))
        pageTweetIds = frozenset(tweet.id for tweet in tweets)
        if len(pageTweetIds) > 0 and pageTweetIds in previousPagesTweetIds:
            _logger.warning("Found duplicate page of tweets, stopping as assumed cycle found in Twitter's pagination")
            break
        previousPagesTweetIds.add(pageTweetIds)
        # Includes tweets by other users on conversations, don't return those
        for tweet in tweets:
            if getattr(getattr(tweet, 'user', None), 'id', userId) != userId:
                continue
            yield tweet
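The duplicate-page check in the loop above can be tried in isolation. This is a minimal sketch, assuming each page is a plain list of integer tweet IDs; `iter_unique_pages` and the sample data are hypothetical and not part of snscrape.

```python
from typing import Iterable, Iterator, List

def iter_unique_pages(pages: Iterable[List[int]]) -> Iterator[List[int]]:
    """Yield pages until a non-empty page repeats the ID set of an earlier page."""
    seen = set()
    for page in pages:
        # frozenset is hashable, so a whole page can be stored in a set.
        ids = frozenset(page)
        if len(ids) > 0 and ids in seen:
            # Same set of IDs as an earlier page: assume pagination is cycling.
            break
        seen.add(ids)
        yield page

# Hypothetical pagination: the third page repeats the first one.
pages = [[5, 4], [3, 2], [5, 4], [1]]
result = list(iter_unique_pages(pages))
```

Because pages are compared as sets, the order of IDs within a page does not matter, and empty pages never trigger the stop condition.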
Getting the same error.
@0bmay thank you for sharing, I really appreciate it.
Twitter has blocked every unregistered user from viewing tweets. Is this related?
Not related. The UserTweets API endpoint is still returning data; UserTweetsAndReplies is still broken.
I assume there is still an issue with the UserTweetsAndReplies API endpoint? I'm still getting KeyError: 'timeline'.
@JustAnotherArchivist is there any update on implementing the proposed workaround, so that the error is fixed at least for the tweets?