mbin icon indicating copy to clipboard operation
mbin copied to clipboard

Rework the tag system

Open BentiGorlich opened this issue 10 months ago • 32 comments

The most integral thing I've done here is to rework the db schema for hashtags. We now have a table called hashtag which contains the string tag it self as well as other info (atm only banned) and hashtag_link which links posts of any kind to hashtags if they contain them. The migration I wrote contains sql code to transition between the new and old schema.

Changing this causes a lot of necessary adjustments. ~~I have not tested the API methods, because I don't know how :sweat_smile: That should definitely be done.~~

The reason I did it is that the hashtag queries are one of the slowest queries we have at the moment and it is not possible to reasonably implement banning, blocking or subscribing to hashtags with the old system.

One cool thing I did in the process is to write the NativeQueryAdapter for pagination. It allows to put any SQL query in there and get a pagination interface for it. I replaces some of the queries in the SearchRepository with it.

Visible things I changed:

1. [new] Hashtag panel:

grafik

2. [new] Hashtag Admin Panel Options:

There is now a button to ban/unban a hashtag (see screenshot above). The ban button is a danger button, the unban button is not.

3. [removed] Related Hashtags:

I decided to remove the hard coded related hashtags, but keep the box they were in (see screenshot above). I tried removing the box as well, but it just looked weird. We can add the usual filter options to it in the future

4. [removed] "tags" Input:

After Before
grafik grafik

This can be reverted, but the input always just confused me anyways, so removed it for ease of use.

Behaviour of banned hashtags:

I went for a more temporary approach of the ban: keep posts that already exists, but block new posts from being created.

Local

If a user tries to create a post with a banned hashtag in them they see this error message: grafik

I did not add an error message that specifically said "Tag x is banned" so the user can't circumvent the ban that easily. If this doesn't make any sense to anyone we can of course change it.

Remote

When a post is created with a banned hashtag in the body a TagBannedException gets thrown. This is caught in the CreateHandler and the ChainActivityHandler and the post is discarded (these are the only two points where posts get created that I could find). So when we get a remote post which contains a banned hashtag we do not create it at all.

visibility

When a post contains a banned hashtag it is hidden in most places: search, timeline (thread and microblog). The one thing where they still show up is on the hashtag page itself. My idea was that the admin should still be able to see all the posts that acutally exists containing the hashtag. I considered making the hashtag side admin only if a hashtag is banned, but I didn't do it, yet. I am counting on your feedback, how we should handle it.

BentiGorlich avatar Apr 01 '24 19:04 BentiGorlich

I have it running on <gehirneimer.de> if anyone wants to take a peek at that :)

BentiGorlich avatar Apr 01 '24 20:04 BentiGorlich

I marked is as ready again, because the the microblog pages being slow has bothing to do with this PR, just with my server needing more RAM 😅

BentiGorlich avatar Apr 02 '24 13:04 BentiGorlich

I marked is as ready again, because the the microblog pages being slow has bothing to do with this PR, just with my server needing more RAM 😅

Please don't cause more regression. We still try to fix the regression issues from the delete pr. 😔

melroy89 avatar Apr 02 '24 14:04 melroy89

I marked is as ready again, because the the microblog pages being slow has bothing to do with this PR, just with my server needing more RAM 😅

Please don't cause more regression. We still try to fix the regression issues from the delete pr. 😔

...I hope you're not being serious, I highly doubt @BentiGorlich is intentionally causing regression.

ghost avatar Apr 02 '24 14:04 ghost

I'm actually pretty serious here. Yes I know he doesn't do it on purpose. Nobody does.

But I would first fix the regression issues before adding more features or refactoring like this PR. There need to be focus on stability and regression fixes first, before continuing with more.

melroy89 avatar Apr 02 '24 15:04 melroy89

I'm gonna respond in a longer post later, because you're kindakilling motivation with rhis kind of statement, but for now: you guys already fixed the regressions we know of and the problems occurred while I was working on this one, so I don't get where your attitude is coming from...

BentiGorlich avatar Apr 02 '24 15:04 BentiGorlich

, so I don't get where your attitude is coming from...

I didn't sleep well. But it was also still on my mind.

melroy89 avatar Apr 02 '24 16:04 melroy89

@BentiGorlich @nobodyatroot again sorry how I reacted again. I have some bad time recently. I already talked to Benti about it. Hopefully it will all settle again. For now I have too much Tinnitus, so I take some time off.

melroy89 avatar Apr 03 '24 08:04 melroy89

@melroy89 Thanks for apologizing publicly, I mean it :)

BentiGorlich avatar Apr 03 '24 11:04 BentiGorlich

It seems the API doesn't like the change to createResponseDto in src/Factory/EntryFactory.php:

public function createResponseDto(EntryDto|Entry $entry, array $tags): EntryResponseDto

Missing tags being passed in L54 src/Controller/Api/Notification/NotificationBaseApi.php

$toReturn['subject'] = $this->entryFactory->createResponseDto($entry);

Same thing in L26 src/Controller/Api/Post/PostsBaseApi.php

$response = $this->postFactory->createResponseDto($dto);

createResponseDto needs tags in src/Factory/PostFactory.php:

public function createResponseDto(PostDto|Post $post, array $tags): PostResponseDto

Both will obviously generate 500s for anyone trying to use the API as I see in my logs:

{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:22.433993+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:23.951949+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:24.898869+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:26.580721+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:27.864327+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:44.736181+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:46.084703+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:47.256111+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected\" at EntryFactory.php line 41","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\EntryFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Notification/NotificationBaseApi.php on line 54 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/EntryFactory.php:41"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:52.202445+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\PostFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Post/PostsBaseApi.php on line 26 and exactly 2 expected\" at PostFactory.php line 34","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\PostFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Post/PostsBaseApi.php on line 26 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/PostFactory.php:34"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:55.473759+00:00","extra":{}}
{"message":"Uncaught PHP Exception ArgumentCountError: \"Too few arguments to function App\\Factory\\PostFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Post/PostsBaseApi.php on line 26 and exactly 2 expected\" at PostFactory.php line 34","context":{"exception":{"class":"ArgumentCountError","message":"Too few arguments to function App\\Factory\\PostFactory::createResponseDto(), 1 passed in /var/www/kbin/src/Controller/Api/Post/PostsBaseApi.php on line 26 and exactly 2 expected","code":0,"file":"/var/www/kbin/src/Factory/PostFactory.php:34"}},"level":500,"level_name":"CRITICAL","channel":"request","datetime":"2024-04-03T14:38:56.848078+00:00","extra":{}}

ghost avatar Apr 03 '24 18:04 ghost

Thanks for testing. How do I test it?

BentiGorlich avatar Apr 03 '24 18:04 BentiGorlich

Thanks for testing. How do I test it?

Someone is actually using the API on my site, lol. Maybe point the interstellar app at your instance and try to monkey with the site functions? Kind of a brute force testing solution until we get proper API unit tests in place.

ghost avatar Apr 03 '24 18:04 ghost

I'm getting a 500 when accessing a route for a tag that isn't in the DB, I believe e.g. /tag/reallysurprisedifthistagisinyourdb

The 500 happens in dev mode, in prod the sidebar just doesn't render

Impossible to access a key ("entry") on a null variable.

templates/tag/_panel.html.twig:27
{{ _self.meta_item('threads'|trans, counts["entry"]) }}

e-five256 avatar Apr 07 '24 23:04 e-five256

I'm not sure if I misunderstand where the data is migrating from, but my post tags don't appear to have migrated

image

# select body from post where id = 62;
       body        
-------------------
 new post #testing

# select * from hashtag;
 id |   tag   | banned 
----+---------+--------
  1 | testtag | f
(1 row)

In hindsight I should've taken a backup before migrating as I can't revert to see how it was populated before

e-five256 avatar Apr 07 '24 23:04 e-five256

I tried to repro by migrating down -> making a new post with a hashtag -> migrating back up, and couldn't. So it's odd that I have a post not linked to a hashtag that contains one but I can't say for certain what happened

attempt two:

checkout main

# select id, tags from post where tags is not null;
 id |    tags     
----+-------------
 65 | ["hashtag"]
(1 row)

checkout this PR, migrate

# select * from hashtag;
 id |   tag   | banned 
----+---------+--------
  1 | testtag | f
  2 | hashtag | f
(2 rows)

e-five256 avatar Apr 07 '24 23:04 e-five256

I'm getting a 500 when accessing a route for a tag that isn't in the DB, I believe e.g. /tag/reallysurprisedifthistagisinyourdb

The 500 happens in dev mode, in prod the sidebar just doesn't render

Impossible to access a key ("entry") on a null variable.

templates/tag/_panel.html.twig:27
{{ _self.meta_item('threads'|trans, counts["entry"]) }}

Are you sure it is not just a cached component? Thats what happened to me in dev mode

BentiGorlich avatar Apr 08 '24 06:04 BentiGorlich

I tried to repro by migrating down -> making a new post with a hashtag -> migrating back up, and couldn't. So it's odd that I have a post not linked to a hashtag that contains one but I can't say for certain what happened

attempt two:

checkout main

# select id, tags from post where tags is not null;
 id |    tags     
----+-------------
 65 | ["hashtag"]
(1 row)

checkout this PR, migrate

# select * from hashtag;
 id |   tag   | banned 
----+---------+--------
  1 | testtag | f
  2 | hashtag | f
(2 rows)

I mean you are missing the hash taglink table, so I can't say if this one isn't linked...

BentiGorlich avatar Apr 08 '24 06:04 BentiGorlich

Are you sure it is not just a cached component? Thats what happened to me in dev mode

I just flushed php and redis cache and it still happens. I'm fairly confident because it's clearly happening on kbin.run, the sidebar isn't rendering the tag components on pages for non-existing tags

I did get confused, it turns out there's another one. When you're an admin, non-existing tag pages throw a different 500:

An exception has been thrown during the rendering of a template ("").

templates/tag/overview.html.twig:18
<form action="{{ path('tag_' ~ (is_tag_banned(tag) ? 'unban' : 'ban'), {name: tag}) }}" name="tag_ban" method="post"

e-five256 avatar Apr 08 '24 12:04 e-five256

Ok a big big misunderstanding. I though that you meant that the migration didn't work and that the component therefore throws the error... I am very very confident that the migrations work as intended, up and down, because I did it multiple times in my dev env and every tagged entry, post, etc. keeps on working.

BentiGorlich avatar Apr 08 '24 12:04 BentiGorlich

It doesn't 500 in APP_ENV=prod, it just doesn't render, look at how there are no numbers for threads/comments/posts/replies. Not sure if you're an admin as well, I imagine if you were an admin the ban/unban button would fail to render as the 2nd 500 I posted was for rendering the ban form

e-five256 avatar Apr 08 '24 12:04 e-five256

Yeah removed my comment because I was not logged in as an admin

BentiGorlich avatar Apr 08 '24 12:04 BentiGorlich

I did not add an error message that specifically said "Tag x is banned" so the user can't circumvent the ban that easily. If this doesn't make any sense to anyone we can of course change it.

I personally don't like the non-descriptive error. Most likely, the user attempting to create the post will ask in a help thread, which may increase the support burden on admins/moderators. If the user does get an answer to their question, they then know that a tag they used is banned and can circumvent the ban all the same.

Generally speaking, I am no fan of hiding potentially crucial information from a user. This is the same with federated/blocked instances. Knowing what type of content you can and can't engage with is about the most important factor when choosing an instance to create an account on and not exposing that crucial information to the user is fatigueing.

I for one joined an Mbin instance, precisely because /federation (also link /instances to /federation for parity with Lemmy pls, k thx bai) shows me what other instances are blocked. In that sense, I'd like to argue to go even a step further and show blocked tags under /federation as well.

UmBottesWillen avatar Apr 09 '24 12:04 UmBottesWillen

(also link /instances to /federation for parity with Lemmy pls, k thx bai)

that was done https://github.com/MbinOrg/mbin/pull/394

e-five256 avatar Apr 09 '24 13:04 e-five256

(also link /instances to /federation for parity with Lemmy pls, k thx bai)

that was done #394

Ah, very nice, I didn't see that change. Much appreciated.

UmBottesWillen avatar Apr 09 '24 13:04 UmBottesWillen

I personally don't like the non-descriptive error. Most likely, the user attempting to create the post will ask in a help thread, which may increase the support burden on admins/moderators. If the user does get an answer to their question, they then know that a tag they used is banned and can circumvent the ban all the same.

I can see that, that's why I said that we could change it. Maybe the most important thing is just rejecting incoming activities with the hashtag and local users violating rules will be dealt with differently...

As for the other point, I opened a discussion issue, because I think this will kinda explode this PR: #697

BentiGorlich avatar Apr 09 '24 15:04 BentiGorlich

@BentiGorlich I know I am not the one making decisions, because I'm just a user, but I hope "tag input field" will not get removed, I actually like it and how it natively encourages users to use tags

Removing it will massively disencourage people whom using hashtags, because the use of hashtags in body of thread type posts looks counterintuitive and awkward(instead of a separate field)

Removing it will hurt discoverability of a content, something that is already a problem on Fedi, because people will just stop using tags in this case, and they have a lot of potential for further uses

FitikWasTaken avatar Apr 12 '24 19:04 FitikWasTaken

I hope "tag input field" will not get removed, I actually like it and how it natively encourages users to use tags

Removing it will massively disencourage people whom using hashtags, because the use of hashtags in body of thread type posts looks counterintuitive and awkward(instead of a separate field)

I think you might be right. I actually don't know how many people use this input, though...

Removing it will hurt discoverability of a content, something that is already a problem on Fedi, because people will just stop using tags in this case, and they have a lot of potential for further uses

Yes it hurts potential discoverability on mastodon (Mbin has a problem, see below). You could of course just put the hashtag in the text, like they do on mastodon, which is just not that common on the threadiverse...

The mentioned Mbin problem: we do not use the "tag" property of the activity pub representation of posts. We only extract hashtags and mentions from the body which is a problem in #538 as well. We definitely need to work on that.

@e-five256 @nobodyatroot @asdfzdfj what do you think about it?

BentiGorlich avatar Apr 15 '24 14:04 BentiGorlich

I'm not exactly sure the purpose of tags in entries, but I would think we would want to go the other way, that is removing the tag parsing from the body element and only using the AP fields.

It seems the field when posting a new thread was used, this is what me posting to my /m/testing magazine looked like with a tag field value of tag1:

  "tag": [
    {
      "type": "Hashtag",
      "href": "http://127.0.0.1:8666/tag/tag1",
      "name": "#tag1"
    },
    {
      "type": "Hashtag",
      "href": "http://127.0.0.1:8666/tag/testing",
      "name": "#testing"
    }
  ]

That seems much better than body tags imo. Whenever I see tags in the wild, it almost always is a false positive of someone like trying to say "number", like #1, or anything else that it could be used for that isn't tags, as people really don't seem to be using body content for tags on Mbin / lemmy when it comes to Page types.


I guess I would like to see it added back as well, unless I'm misunderstanding an issue with the new db schema, as it seemed to work fine, and the area the tags were presented at is now never able to be populated

image

e-five256 avatar Apr 16 '24 01:04 e-five256

my quick 2c:

  • for remote objects: use and save the incoming tags (and mentions), though this does assume that the remote software generate this information correctly
  • the tags/mentions extraction from body would still needs to be fixed for processing local objects, and for federation with everything else (also see above point)
  • the "tags" field could stay, I think, maybe slightly reword a bit to make it more clear that this maps to hashtags?
  • still not sure if the entries should have their tags from body merged in here too, like having a dedicated tags input is miles better than piling a bunch of hashtags at the bottom of an entry, but then you also have people who might tag stuff as part of the written content e.g. just got fresh #apples and #oranges where they may be some mismatched expectations and discovery problem if tags from body are not included

side tangent:

Whenever I see tags in the wild, it almost always is a false positive of someone like trying to say "number", like #1, or anything else that it could be used for that isn't tags

for the case of number only hastags, I think it's possible to modify hashtag parsing to drop number only tags, it it's anything, misskey parser did exactly this, akkoma seems to also do this https://github.com/misskey-dev/mfm.js/blob/6aaf68089023c6adebe44123eebbc4dcd75955e0/src/internal/parser.ts#L617-L620

asdfzdfj avatar Apr 16 '24 06:04 asdfzdfj

For me to do in the scope of this PR:

  • [x] Fix the "Tags" pane
  • [x] Bring back the tags input field when creating anything (and maybe rename it to "Hashtags"?)
  • [x] Figure out what to do with the error message when creating an entry

To do in a later issue:

  • [ ] modify the markdown converter so only hashtags and mentions are used that are actually linked in the activity object
  • [ ] modify out hashtag extraction for local object to not include number only hashtags
  • [ ] make an admin page that shows all banned hashtags

BentiGorlich avatar Apr 19 '24 12:04 BentiGorlich