`KeyError: 'timeline'` crash on `twitter-profile` scraper

Open frankiec opened this issue 1 year ago • 11 comments

Describe the bug

When I run:

/bin/snscrape --jsonl --max-results 10 twitter-profile someone

it throws an error:

Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/snscrape", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/.local/lib/python3.10/site-packages/snscrape/_cli.py", line 323, in main
    for i, item in enumerate(scraper.get_items(), start = 1):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/snscrape/modules/twitter.py", line 1830, in get_items
    instructions = obj['data']['user']['result']['timeline_v2']['timeline']['instructions']
KeyError: 'timeline'

How to reproduce

/bin/snscrape --jsonl --max-results 10 twitter-profile someone

Expected behaviour

It was working until Thu Jun 29 21:35:07 UTC 2023.

Screenshots and recordings

No response

Operating system

Ubuntu 16

Python version: output of python3 --version

Python 3.10.6

snscrape version: output of snscrape --version

snscrape 0.6.2.20230321.dev32+gb76f485

Scraper

twitter-profile

How are you using snscrape?

CLI (snscrape ... as a command, e.g. in a terminal)

Backtrace

No response

Log output

No response

Dump of locals

No response

Additional context

No response

frankiec avatar Jun 30 '23 00:06 frankiec

I have this error too

  File "/home/.venv/lib/python3.10/site-packages/snscrape/modules/twitter.py", line 1902, in get_items
    instructions = obj['data']['user']['result']['timeline_v2']['timeline']['instructions']
KeyError: 'timeline'

version: v0.7.0.20230622

0bmay avatar Jun 30 '23 01:06 0bmay

I'm seeing the same thing on the web interface: the 'Replies' tab on profile pages is empty. This looks like a bug on Twitter's side.

JustAnotherArchivist avatar Jun 30 '23 01:06 JustAnotherArchivist

It looks like the UserTweets API call works and returns data, but UserTweetsAndReplies is broken on Twitter's end: the call returns 200, but timeline_v2 contains no data.
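
To illustrate the failure mode described here, the sketch below reproduces it without touching the network. It is not snscrape code: `get_path` is a hypothetical helper, and the two response dictionaries are made-up minimal shapes based only on the key path shown in the traceback. It assumes that "no data in timeline_v2" means the `timeline` key is simply absent.

```python
from typing import Any, Optional, Sequence

# Key path taken from the traceback in this issue.
INSTRUCTIONS_PATH = ['data', 'user', 'result', 'timeline_v2', 'timeline', 'instructions']


def get_path(obj: Any, path: Sequence[str]) -> Optional[Any]:
    """Walk nested dicts; return None if any key is missing (hypothetical helper)."""
    for key in path:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


# Made-up minimal responses: one healthy, one with an empty timeline_v2.
healthy = {'data': {'user': {'result': {'timeline_v2': {'timeline': {'instructions': []}}}}}}
broken = {'data': {'user': {'result': {'timeline_v2': {}}}}}

print(get_path(healthy, INSTRUCTIONS_PATH))  # []
print(get_path(broken, INSTRUCTIONS_PATH))   # None

# Direct indexing, as in the traceback, raises on the broken shape.
try:
    broken['data']['user']['result']['timeline_v2']['timeline']['instructions']
except KeyError as e:
    print('KeyError:', e)  # KeyError: 'timeline'
```

A lookup like this only makes the empty response detectable; it does not recover any tweets, since the data is missing from the response itself.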

0bmay avatar Jun 30 '23 03:06 0bmay

I am experiencing the timeline key error as well. @0bmay would you mind sharing how to implement the UserTweets API call? I am having difficulty figuring that out.

locfinessemonster avatar Jun 30 '23 04:06 locfinessemonster

In TwitterProfileScraper I added a second method, get_items2, and I use that to get the profile tweets. No replies, but something is better than nothing. Most of the code is the same as get_items; I just changed the features and variables and added the field toggles (fieldToggles) that the calls use on the frontend of the site.

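	def get_items2(self):
		# Workaround: same flow as get_items, but queries the UserTweets endpoint (tweets only, no replies)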
		if not self._isUserId:
			if self.entity is None:
				raise snscrape.base.ScraperException(f'Could not resolve username {self._user!r} to ID')
			userId = self.entity.id
		else:
			userId = self._user

		paginationVariables = {
			'userId': userId,
			'count': 100,
			'cursor': None,
			'includePromotedContent': True,
			'withQuickPromoteEligibilityTweetFields': True,
			'withVoice': True,
			'withV2Timeline': True,
		}
		variables = paginationVariables.copy()
		del variables['cursor']
		features = {
			'rweb_lists_timeline_redesign_enabled': False,
			'responsive_web_graphql_exclude_directive_enabled': True,
			'verified_phone_label_enabled': False,
			'creator_subscriptions_tweet_preview_api_enabled': False,
			'responsive_web_graphql_timeline_navigation_enabled': True,
			'responsive_web_graphql_skip_user_profile_image_extensions_enabled': False,
			'tweetypie_unmention_optimization_enabled': True,
			'responsive_web_edit_tweet_api_enabled': True,
			'graphql_is_translatable_rweb_tweet_is_translatable_enabled': True,
			'view_counts_everywhere_api_enabled': True,
			'longform_notetweets_consumption_enabled': True,
			'responsive_web_twitter_article_tweet_consumption_enabled': False,
			'tweet_awards_web_tipping_enabled': False,
			'freedom_of_speech_not_reach_fetch_enabled': True,
			'standardized_nudges_misinfo': True,
			'tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled': False,
			'longform_notetweets_rich_text_read_enabled': True,
			'longform_notetweets_inline_media_enabled': False,
			'responsive_web_enhance_cards_enabled': False
		}
		field_toggles = {"withArticleRichContentState": False}
		params = {'variables': variables, 'features': features}
		paginationParams = {'variables': paginationVariables, 'features': features, 'fieldToggles': field_toggles}

		gotPinned = False
		previousPagesTweetIds = set()
		for obj in self._iter_api_data('https://twitter.com/i/api/graphql/sPOiMsDrOtmxC00E01DkTA/UserTweets', _TwitterAPIType.GRAPHQL, params, paginationParams, instructionsPath = ['data', 'user', 'result', 'timeline_v2', 'timeline', 'instructions']):
			if not obj['data'] or 'result' not in obj['data']['user']:
				raise snscrape.base.ScraperException('Empty response')
			if obj['data']['user']['result']['__typename'] == 'UserUnavailable':
				raise snscrape.base.EntityUnavailable('User unavailable')
			instructions = obj['data']['user']['result']['timeline_v2']['timeline']['instructions']
			if not gotPinned:
				for instruction in instructions:
					if instruction['type'] == 'TimelinePinEntry':
						gotPinned = True
						tweetId = int(instruction['entry']['entryId'][6:]) if instruction['entry']['entryId'].startswith('tweet-') else None
						yield self._graphql_timeline_tweet_item_result_to_tweet(instruction['entry']['content']['itemContent']['tweet_results']['result'], tweetId = tweetId, pinned = True)
			tweets = list(self._graphql_timeline_instructions_to_tweets(instructions, pinned = False))
			pageTweetIds = frozenset(tweet.id for tweet in tweets)
			if len(pageTweetIds) > 0 and pageTweetIds in previousPagesTweetIds:
				_logger.warning("Found duplicate page of tweets, stopping as assumed cycle found in Twitter's pagination")
				break
			previousPagesTweetIds.add(pageTweetIds)
			# Includes tweets by other users on conversations, don't return those
			for tweet in tweets:
				if getattr(getattr(tweet, 'user', None), 'id', userId) != userId:
					continue
				yield tweet
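
The duplicate-page check in the loop above can be looked at in isolation. The following is a minimal standalone sketch, not part of snscrape: `iter_unique_pages` is a hypothetical helper, and plain integer lists stand in for pages of tweet IDs. Each page is stored as a `frozenset`, so a page that returns the same set of IDs again, in any order, is treated as a pagination cycle.

```python
from typing import Iterable, Iterator, List


def iter_unique_pages(pages: Iterable[List[int]]) -> Iterator[List[int]]:
    """Yield pages of tweet IDs until a non-empty page repeats (hypothetical helper)."""
    seen = set()  # set of frozensets, one per page already yielded
    for page in pages:
        page_ids = frozenset(page)
        if len(page_ids) > 0 and page_ids in seen:
            break  # same set of IDs seen before: assume pagination is cycling
        seen.add(page_ids)
        yield page


# Made-up pages: the third repeats the IDs of the first in a different order.
pages = [[3, 2, 1], [6, 5, 4], [2, 1, 3], [9, 8, 7]]
print(list(iter_unique_pages(pages)))  # [[3, 2, 1], [6, 5, 4]]
```

Empty pages are deliberately not treated as duplicates, matching the `len(pageTweetIds) > 0` condition in the method above.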

0bmay avatar Jun 30 '23 05:06 0bmay

Getting the same error.

Pratham-19 avatar Jun 30 '23 12:06 Pratham-19

@0bmay thank you for sharing, I really appreciate it.

locfinessemonster avatar Jun 30 '23 13:06 locfinessemonster

Twitter has blocked every unregistered user from viewing tweets. Is this related?

jerrycool123 avatar Jun 30 '23 15:06 jerrycool123

> Twitter has blocked every unregistered user from viewing tweets. Is this related?

Not related. The UserTweets API endpoint is still returning data; UserTweetsAndReplies is still broken.

0bmay avatar Jun 30 '23 15:06 0bmay

I assume there is still an issue with the UserTweetsAndReplies API endpoint? I'm still getting KeyError: 'timeline'.

zack-sims413 avatar Jun 30 '23 18:06 zack-sims413

@JustAnotherArchivist is there any update? Has the proposed solution been implemented to fix the error, at least for tweets?

Akhorramrouz avatar Jul 01 '23 22:07 Akhorramrouz