mastodon-archive TypeError: 'NoneType' object is not subscriptable

With version 1.4.2 installed from pip and mamot.fr running Mastodon v4.1.3, most commands work, but some fail with a TypeError like this:

$ mastodon-archive context [email protected] https://linuxrocks.online/users/Linux/statuses/813475
Indexing 5487 statuses...
Indexing 5023 favourites...
Indexing 158 bookmarks...
Indexing 2791 mentions...
Traceback (most recent call last):
  File "/home/federico/.local/bin/mastodon-archive", line 33, in <module>
    sys.exit(load_entry_point('mastodon-archive==1.4.2', 'console_scripts', 'mastodon-archive')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/federico/.local/lib/python3.11/site-packages/mastodon_archive/__init__.py", line 333, in main
    args.command(args)
  File "/home/federico/.local/lib/python3.11/site-packages/mastodon_archive/context.py", line 48, in context
    if status["reblog"] is not None:
       ~~~~~~^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable

Jul 13 '23 06:07 nemobis

Strange! Do you have an idea of how this can happen? An empty status somehow?

Jul 13 '23 10:07 kensanata

On 13/07/23 13:33, Alex Schroeder wrote:

Strange! Do you have an idea of how this can happen? An empty status somehow?

I don't know, but I suspect my instance has enabled a data retention policy and that for this reason some fields are empty (for example replies no longer mention their parent post). That "feature" is very brutal: https://github.com/mastodon/mastodon/pull/19232

Jul 13 '23 13:07 nemobis

I guess that means double-checking for the existence of all the attributes the code simply assumes to be there. How annoying. Ugh.
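A minimal sketch of what such a defensive check could look like for the line in the traceback, assuming statuses are plain dicts as loaded from the JSON archive. The helper name `unwrap_reblog` is hypothetical and not part of mastodon-archive:

```python
def unwrap_reblog(status):
    """Return the boosted status if there is one, else the status itself.

    Returns None when the entry is missing (None) or not a dict, so the
    caller can skip it instead of crashing on status["reblog"].
    """
    if not isinstance(status, dict):
        return None
    # .get() tolerates a status that lacks the "reblog" key entirely.
    reblog = status.get("reblog")
    return reblog if reblog is not None else status
```

A caller would then skip any entry for which the helper returns None.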

Jul 13 '23 13:07 kensanata

@nemobis Want to try something like this? It's untested on my end, but since you have a setup that is probably full of random holes, perhaps we can uncover and handle all the common cases.

index 16ff94a..a34fef7 100644
--- a/mastodon_archive/context.py
+++ b/mastodon_archive/context.py
@@ -46,6 +46,9 @@ def context(args):
             print("Indexing %d %s..." % (len(statuses), collection))
         for status in statuses:
 
+            if status is None:
+                continue
+
             if status["reblog"] is not None:
                 status = status["reblog"]

To be honest, I'm not quite sure I totally understand the problem. The feature is this:

Content (statuses) or media (media attachments and preview cards) that are older than the retention period will be automatically removed.

But you're running code that wants to show the context of a toot, based on the archive you already downloaded. So the archive contains statuses that are "nothing"? I don't even know how that would work. How does the JSON file represent this? An alternative would be to look for instances of null in the JSON file, I guess. Do those really exist? I feel like these missing toots shouldn't even end up as null entries in the archive.
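One way to check whether the archive really contains null entries is to count them per collection. This is only a sketch: the key names are an assumption taken from the "Indexing ..." lines in the output above, and `count_nulls` is a hypothetical helper, not something mastodon-archive provides.

```python
import json


def count_nulls(archive, collections=("statuses", "favourites", "bookmarks", "mentions")):
    """Count entries that are None (JSON null) in each collection."""
    counts = {}
    for name in collections:
        # A missing or null collection is treated as empty.
        items = archive.get(name) or []
        counts[name] = sum(1 for item in items if item is None)
    return counts


# Small inline sample standing in for a real archive file.
sample = json.loads('{"statuses": [{"id": 1}, null], "mentions": [null, null, {"id": 2}]}')
print(count_nulls(sample))
```

For a real archive, the dict would come from `json.load()` on the archive file instead of the inline sample.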

Jul 23 '23 11:07 kensanata

I also have this error when I use --with-mentions. In the JSON file, the mentions array contains a lot of nulls.

Nov 04 '23 13:11 floriancargoet

Inspired by your previous answer, I changed archive.py at line 72, from:

        seen = { str(status["id"]): status for status in statuses}

to:

        seen = { str(status["id"]): status for status in statuses if status is not None}

But I think it might be better to remove them from statuses earlier.
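A rough sketch of that "remove them earlier" idea, filtering the nulls out once right after the archive is loaded. It assumes the same collection keys as the "Indexing ..." output above, and `drop_nulls` is a hypothetical name, not an existing function in mastodon-archive:

```python
def drop_nulls(archive, collections=("statuses", "favourites", "bookmarks", "mentions")):
    """Return a shallow copy of the archive with None entries removed."""
    cleaned = dict(archive)
    for name in collections:
        items = cleaned.get(name)
        if isinstance(items, list):
            cleaned[name] = [item for item in items if item is not None]
    return cleaned
```

With this in place, later code such as the `seen` dictionary could iterate over the collections without a per-use None check. The cost is that the information about which entries were missing is lost.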

Nov 04 '23 14:11 floriancargoet

Hm. I guess we need to decide what to do about the nulls. Keep them in the archive? In that case we need to handle all the uses of the statuses (like your suggestion). Or remove them from the archive? And prevent them from getting added? Perhaps that brings with it more problems. Gaaaah.

Nov 07 '23 12:11 kensanata

I started making a lot of changes for all instances of "in statuses" in the code and then I was unsure of what was happening. So then I decided to add just the one change suggested by @floriancargoet: 9a181f6

Nov 07 '23 13:11 kensanata
