takahe
add support for enabling Mastodon 4.2 search indexing
adds the opt-in attribute which enables Mastodon 4.2 to index toots from an account
I was considering that, too, but it seemed to me that it was conceptually a little different. search_enabled is a feature flag for (a partially implemented?) local search of local accounts, while indexable is an AP Actor level flag for opt-in to being indexed on remote servers.
Had it been the other way around, i.e. had indexable come first, it would have been obvious to implement search_enabled on top of that.
Heads up though! The migration appears to have created this as a non-nullable column, and I missed at least one code path which leaves the attribute null during fetch/create. Will review.
Hmm, my lack of experience with Django shows again. As far as I can see, indexable is defined to default to false everywhere, but somehow it's still passed as null into a database insert here: https://github.com/osmaa/takahe/blob/883c607468252fdfbf107cfe5c35ca86d6afc70c/users/models/identity.py#L452
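One possible explanation, stated as an assumption and not checked against the linked code: a Django field default only applies when no value is supplied at all, so an explicit None (for example, read from a remote actor document that lacks the attribute) is written as-is and rejected by a non-nullable column. A minimal sketch of a defensive coercion follows; `indexable_from_actor` and the shape of `actor_doc` are hypothetical names for illustration only.

```python
def indexable_from_actor(actor_doc: dict) -> bool:
    """Return the actor's indexable flag, treating a missing or null value as False.

    Hypothetical helper: `actor_doc` stands in for a parsed remote actor document.
    """
    # actor_doc.get("indexable", False) alone is not enough: a key that is
    # present with a JSON null still yields None instead of the fallback.
    return bool(actor_doc.get("indexable") or False)
```

With this, an absent key, a null value, and an explicit false all end up as False before the value reaches the database layer.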
I agree they are different meanings conceptually, but I still would like to combine the meanings now - two search options just seems too many, and I don't see a lot of cases where people would enable it locally but not remotely and vice-versa. It's a little bit of an expectation-breaking change, but I am alright with it in this instance.
Fully agree that the privacy-related settings in Mastodon are too many. I've been meaning to outline a matrix of all of the possible combinations to see which of them even make sense. I don't know what to make of the existence of these technically valid combos, for example:
- discoverable=false, indexable=true, toot=public (it's not listed on Mastodon's local timeline, but can be found by text search)
- discoverable=true, indexable=false, toot=public (is listed, but not search indexed)
- discoverable=true, indexable=true, toot=unlisted (not listed nor searchable)
- discoverable=true, indexable=true, noindex=true (opted in to be indexed by everyone but web search engines)
- discoverable=false, indexable=false, noindex=false (opted out of being found on Mastodon, while allowing web search crawlers)
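For reference, the full matrix mentioned above can be enumerated mechanically. This is only a sketch: it assumes three boolean account flags and just two toot visibility levels (public, unlisted), and the names `ACCOUNT_FLAGS`, `VISIBILITIES`, and `all_combinations` are made up for illustration.

```python
from itertools import product

# Assumed inputs: three account-level booleans and two toot visibility levels.
ACCOUNT_FLAGS = ("discoverable", "indexable", "noindex")
VISIBILITIES = ("public", "unlisted")


def all_combinations():
    """Yield every combination of the account flags and a toot visibility."""
    booleans = [False, True]
    for *flags, visibility in product(booleans, booleans, booleans, VISIBILITIES):
        combo = dict(zip(ACCOUNT_FLAGS, flags))
        combo["toot"] = visibility
        yield combo


matrix = list(all_combinations())
```

Printing `matrix` gives one row per combination, which could then be annotated by hand with whether each row makes sense.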
It's a mess. Is it a mess that can be cleaned up? If it was just me, I'd just merge all three account level settings to one (values: promote, search, unlisted), and disallow use of "public" toot level on unlisted accounts.
Right, it being a bit of a mess was kind of the thing I wanted to avoid. I do think that in Takahē's case, with just two options - "discoverable" and "search_enabled" - we end up with only three sensible configurations:
- Discoverable and searchable: Where most people probably end up
- Discoverable but not searchable: Maybe you're trying to avoid harassment enabled via search
- Not discoverable but searchable: Should not be allowed, makes no sense
- Not discoverable or searchable: Traditional privacy-focused stance
I'm not sure how sensible it might be to make the UI switch search off if you flip discoverable off, but it feels like it should.
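The "switch search off when discoverable is off" idea could be sketched roughly like this. It is an illustration under assumptions, not Takahē's actual form handling; `normalize_flags` is a hypothetical helper.

```python
def normalize_flags(discoverable: bool, search_enabled: bool) -> tuple[bool, bool]:
    """Return (discoverable, search_enabled) with search forced off
    whenever discoverable is off, so the fourth combination cannot be saved."""
    return discoverable, discoverable and search_enabled
```

Applied on save, this leaves exactly the three configurations listed above reachable.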
I would argue that:
Discoverable but not searchable: Maybe you're trying to avoid harassment enabled via search
is superfluous and should be instead delivered by automatic pruning of old toots from both timelines and search indices. "Allow my toots to be discovered but only for X days/weeks".
While your:
Not discoverable but searchable: Should not be allowed, makes no sense
That would be someone who opts in to be found by explicit search, but wouldn't want to be shown in trending lists or be algorithmically promoted.
I didn't even include that Mastodon further complicates this by having different logic for hashtags. Again, if it was just me, I'd say that hashtags should be restricted to public toots only. Yes, there are nuances like being generally unsearchable but opening tiny windows into discovery on very specific topics only, but the complexities around documenting that kind of behavior make it into a trap.
So the question really is, how much does it make sense to try to do things different to Mastodon, which has evolved into a weird legacy of incompatible layers, but is the dominant source and consumer of ActivityPub content? Plus, if you still have plans to explore AT proto PDS functionality, that'll map differently: mostly just 100% public, with no control over third-party indexing, though.
Well, automatic pruning of local things from searches would be nice, but that's a separate feature so I'm not going to say we should do that now.
In general I want to keep Takahē relatively low on options and complexity - so I think just tying Mastodon's indexable property to "search enabled" and changing its help text to say that it enables you to be searched locally and remotely would be the way to go here.
This seems like quite an important feature for users like mine, whether or not it's a separate option. What's the best next step to get it merged?
I'm willing to accept a PR for this that just does this flag based off of our existing search_enabled and discoverable flags, where you get marked as having search indexing allowed if they're both true.
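A rough sketch of that rule, for anyone picking this up. The function names and the dict shape are hypothetical and only illustrate the derivation; the real actor serialization in Takahē will look different.

```python
def search_indexing_allowed(search_enabled: bool, discoverable: bool) -> bool:
    """Indexing is allowed only when both existing flags are true."""
    return search_enabled and discoverable


def actor_flags(search_enabled: bool, discoverable: bool) -> dict:
    """Hypothetical fragment of an outgoing actor document.

    Assumes the remote side reads a boolean `indexable` attribute.
    """
    return {
        "discoverable": discoverable,
        "indexable": search_indexing_allowed(search_enabled, discoverable),
    }
```

This keeps a single source of truth: no new column or setting is needed, because `indexable` is derived at serialization time.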
does this flag based off of our existing search_enabled and discoverable flags
There's no perfect solution and I can totally live with this.
how much does it make sense to try to do things different to Mastodon
@osmaa I agree with you that this is a real concern if Mastodon exposes these searchable options separately via the API, but right now I think they can only be changed in the UI, so I'm OK with Andrew's suggestion above.
@osmaa @AstraLuma this is an absolutely great feature. Any chance of getting this updated / merged? Happy to do anything I can to help.