mastodon-archive
TypeError: 'NoneType' object is not subscriptable
With version 1.4.2 installed from pip and mamot.fr running Mastodon v4.1.3, most commands work, but some fail with a TypeError such as:
$ mastodon-archive context [email protected] https://linuxrocks.online/users/Linux/statuses/813475
Indexing 5487 statuses...
Indexing 5023 favourites...
Indexing 158 bookmarks...
Indexing 2791 mentions...
Traceback (most recent call last):
  File "/home/federico/.local/bin/mastodon-archive", line 33, in <module>
    sys.exit(load_entry_point('mastodon-archive==1.4.2', 'console_scripts', 'mastodon-archive')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/federico/.local/lib/python3.11/site-packages/mastodon_archive/__init__.py", line 333, in main
    args.command(args)
  File "/home/federico/.local/lib/python3.11/site-packages/mastodon_archive/context.py", line 48, in context
    if status["reblog"] is not None:
       ~~~~~~^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable
Strange! Do you have an idea of how this can happen? An empty status somehow?
On 13/07/23 13:33, Alex Schroeder wrote:
> Strange! Do you have an idea of how this can happen? An empty status somehow?
I don't know, but I suspect my instance has enabled a data retention policy and that for this reason some fields are empty (for example replies no longer mention their parent post). That "feature" is very brutal: https://github.com/mastodon/mastodon/pull/19232
I guess that means double-checking for the existence of all the attributes the code simply assumes to be there. How annoying. Ugh.
@nemobis Want to try something like this? It's untested on my end, but since you have a setup that is probably full of random holes, perhaps we can uncover and handle all the common cases.
index 16ff94a..a34fef7 100644
--- a/mastodon_archive/context.py
+++ b/mastodon_archive/context.py
@@ -46,6 +46,9 @@ def context(args):
         print("Indexing %d %s..." % (len(statuses), collection))
         for status in statuses:
+            if status is None:
+                continue
+
             if status["reblog"] is not None:
                 status = status["reblog"]
To be honest, I'm not quite sure I totally understand the problem. The feature is this:
Content (statuses) or media (media attachments and preview cards) that are older than the retention period will be automatically removed.
But you're running code that wants to show the context of a toot, based on the archive you already downloaded. So the archive contains statuses that are "nothing"? I don't even know how that would work. How does the JSON file represent this? Thus, an alternative would be this: find instances of "null," in the JSON file, I guess? Does that really exist? I feel like perhaps these missing toots shouldn't even end up as null entries in the archive.
I also have this error when I use --with-mentions. In the JSON file, the mentions array contains a lot of nulls.
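To see how widespread the nulls are, a quick count per collection can help. This is a minimal sketch, assuming the archive is a JSON object whose "statuses", "favourites", "bookmarks" and "mentions" keys hold lists; the collection names come from the "Indexing ..." output above, while count_nulls and count_nulls_in_file are hypothetical helpers, not part of mastodon-archive.

```python
import json

# Collection names taken from the "Indexing ..." output; whether every
# archive file contains all of them is an assumption.
COLLECTIONS = ("statuses", "favourites", "bookmarks", "mentions")


def count_nulls(archive):
    """Return {collection: number of None entries} for the list-valued collections present."""
    counts = {}
    for name in COLLECTIONS:
        items = archive.get(name)
        if isinstance(items, list):
            counts[name] = sum(1 for item in items if item is None)
    return counts


def count_nulls_in_file(path):
    """Load a JSON archive from path and count its null entries."""
    with open(path, encoding="utf-8") as f:
        return count_nulls(json.load(f))


# Tiny in-memory example: one null status, two null mentions.
example = {"statuses": [{"id": 1}, None], "mentions": [None, None]}
print(count_nulls(example))  # {'statuses': 1, 'mentions': 2}
```

Running count_nulls_in_file on the real archive file should show whether only mentions is affected or the other collections as well.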
Inspired by your previous answer, I changed archive.py at line 72, from:
seen = { str(status["id"]): status for status in statuses}
to:
seen = { str(status["id"]): status for status in statuses if status is not None}
But I think it might be better to remove them from statuses earlier.
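Removing the nulls earlier could look roughly like this: filter the list once, right after loading, so that later code such as the seen dictionary no longer needs its own check. This is only a sketch; drop_nulls and index_by_id are hypothetical helpers, not functions from mastodon-archive, and the sample data is made up.

```python
def drop_nulls(statuses):
    """Return a new list without the None entries, preserving order."""
    return [status for status in statuses if status is not None]


def index_by_id(statuses):
    """Build the same kind of id -> status mapping as the `seen` dict."""
    return {str(status["id"]): status for status in statuses}


# Filter once, then every later use can assume real status dicts.
statuses = drop_nulls([{"id": 1}, None, {"id": 2}])
seen = index_by_id(statuses)
```

The trade-off is that the filtered list no longer records that something was missing, which matters if the nulls should stay in the archive file itself.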
Hm. I guess we need to decide what to do about the nulls. Keep them in the archive? In that case we need to handle all the uses of the statuses (like your suggestion). Or remove them from the archive? And prevent them from getting added? Perhaps that brings with it more problems. Gaaaah.
I started making a lot of changes for all instances of "in statuses" in the code and then I was unsure of what was happening. So then I decided to add just the one change suggested by @floriancargoet: 9a181f6