twitter-to-sqlite icon indicating copy to clipboard operation
twitter-to-sqlite copied to clipboard

Archive import appears to be broken on recent exports

Open jacobian opened this issue 4 years ago • 5 comments

I requested a Twitter export yesterday, and unfortunately they seem to have changed it such that twitter-to-sqlite import can't handle it anymore 😢

So far I've ran into two issues. The first was easy to work around, but the second will take more investigation. If I can find the time I'll keep working on it and update this issue accordingly.

The issues (so far):

1. Data seems to have moved to a data/ subdirectory

Running twitter-to-sqlite import on the raw zip file reports a bunch of "not yet implemented" errors, and then exits without actually importing anything:

❯ twitter-to-sqlite import tarchive.db twitter.zip
...
data/manifest: not yet implemented
data/account-creation-ip: not yet implemented
data/account-suspension: not yet implemented
... (dozens of more lines like this, including critical stuff like data/tweets) ...

(tarchive.db now exists, but is empty)

Workaround: unpack the zip file, and run twitter-to-sqlite import tarchive.db path/to/archive/data

That gets further, but:

2. Some schema(s?) have changed

At least, the blocks schema seems different now:

❯ twitter-to-sqlite import tarchive.db archive/data
direct-messages-group: not yet implemented
branch-links: not yet implemented
periscope-expired-broadcasts: not yet implemented
direct-messages: not yet implemented
mute: not yet implemented
Traceback (most recent call last):
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/bin/twitter-to-sqlite", line 8, in <module>
    sys.exit(cli())
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/twitter_to_sqlite/cli.py", line 772, in import_
    archive.import_from_file(db, filepath.name, open(filepath, "rb").read())
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/twitter_to_sqlite/archive.py", line 215, in import_from_file
    to_insert = transformer(data)
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/twitter_to_sqlite/archive.py", line 115, in lists_member
    return {"lists-member": _list_from_common(data)}
  File "/Users/jacob/Library/Caches/pypoetry/virtualenvs/jacobian-dogsheep-4AXaN4tu-py3.8/lib/python3.8/site-packages/twitter_to_sqlite/archive.py", line 200, in _list_from_common
    for url in block["userListInfo"]["urls"]:
KeyError: 'urls'

That's as far as I got before I needed to work on something else. I'll report back if I get further!

jacobian avatar Jan 05 '21 14:01 jacobian

Correction: the failure is on lists-member.js (I was thrown by the block variable name, but that's just a coincidence)

jacobian avatar Jan 05 '21 15:01 jacobian

I was able to fix this, at least enough to get my archive to import. Not sure if there's more work to be done here or not.

jacobian avatar Jan 05 '21 16:01 jacobian

My import got much further with the applied fixes than 0.21.3, but not 100%. I do appear to have all of the tweets imported at least. Not sure when I'll have a chance to look further to try to fix or see what didn't make it into the import.

Here's my output:

direct-messages-group: not yet implemented
branch-links: not yet implemented
periscope-expired-broadcasts: not yet implemented
direct-messages: not yet implemented
mute: not yet implemented
periscope-comments-made-by-user: not yet implemented
periscope-ban-information: not yet implemented
periscope-profile-description: not yet implemented
screen-name-change: not yet implemented
manifest: not yet implemented
fleet: not yet implemented
user-link-clicks: not yet implemented
periscope-broadcast-metadata: not yet implemented
contact: not yet implemented
fleet-mute: not yet implemented
device-token: not yet implemented
protected-history: not yet implemented
direct-message-mute: not yet implemented
Traceback (most recent call last):
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/bin/twitter-to-sqlite", line 33, in <module>
    sys.exit(load_entry_point('twitter-to-sqlite==0.21.3', 'console_scripts', 'twitter-to-sqlite')())
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/twitter_to_sqlite/cli.py", line 772, in import_
    archive.import_from_file(db, filepath.name, open(filepath, "rb").read())
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/twitter_to_sqlite/archive.py", line 233, in import_from_file
    to_insert = transformer(data)
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/twitter_to_sqlite/archive.py", line 21, in callback
    return {filename: [fn(item) for item in data]}
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/twitter_to_sqlite/archive.py", line 21, in <listcomp>
    return {filename: [fn(item) for item in data]}
  File "/Users/henry/.local/share/virtualenvs/python-sqlite-testing-mF3G2xKl/lib/python3.9/site-packages/twitter_to_sqlite/archive.py", line 88, in ageinfo
    return item["ageMeta"]["ageInfo"]
KeyError: 'ageInfo'

henry501 avatar Jan 26 '21 23:01 henry501

Similar trouble with ageinfo using 0.22. Here's what my ageinfo.js file looks like:

window.YTD.ageinfo.part0 = [
  {
    "ageMeta" : { }
  }
]

Commenting out the registration for ageinfo in archive.py gets my archive to import.

danp avatar Sep 26 '21 14:09 danp

as of 2023 it appears that tweets: not yet implemented is happening.. pretty important for a twitter export functionality!

this seems an impossible task with twitter liable to switch this around every other day

swyxio avatar Jan 04 '23 11:01 swyxio