Incorrect length adjustment for drive_file comment field
💡 Summary
When I look up a post whose attached file has alt text longer than 512 characters and containing special characters such as combined emoji, Misskey truncates that text to 512 characters. However, because of how the Unicode characters are counted, the truncated text is still longer than 512 characters when it is inserted into the database, which causes a database error.
🥰 Expected Behavior
The text truncation should make sure that it does not exceed the length limit in Postgres.
🤬 Actual Behavior
The text truncation does not shorten the text enough.
📝 Steps to Reproduce
- Fetch a post whose alt text contains emoji and is longer than 512 characters, for example: https://hollo.x27.one/@aliceif/019782b1-c610-7340-bbd2-130db8e5fa2f
- The post is not fetched, and the database error appears in the logs.
💻 Frontend Environment
* Model and OS of the device(s): Asus Vivobook OLED M3401QC - Windows 11
* Browser: Microsoft Edge
* Server URL: mkultra.x27.one
* Misskey: 2025.6.3
🛰 Backend Environment (for server admin)
* Installation Method or Hosting Service: Manual from git
* Misskey: 2025.6.3
* Node: 22.16
* PostgreSQL: 17
* Redis: 8.0.2
* OS and Architecture: Ubuntu 24.04.2 LTS, ARM64
Do you want to address this bug yourself?
- [ ] Yes, I will patch the bug myself and send a pull request
More specifically, I think this is related to emoji made up of multiple code points, such as 👩🏻💻 or, in the example post, 🏳️⚧️, and to how packages/backend/src/misc/truncate.ts uses stringz, which counts by grapheme cluster and not by Unicode code point.
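triage: Is the situation as follows?
- frontend: truncates to 512 grapheme clusters
- backend DB: expects a string that fits within 512 code points
- Symptom: when the alt text is a string of 512 grapheme clusters in which a single grapheme cluster consists of two or more code points, a DB error occurs
- What should be improved: the length should be counted in code points rather than grapheme clusters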
Yes, that sounds correct, except that it happened to me not with a file created in the frontend but with an incoming message from another (non-Misskey) server, which appeared as failed in the logs.
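To illustrate the mismatch discussed above, here is a minimal sketch of truncating by code point instead of by grapheme cluster. It is not the actual contents of truncate.ts: `countCodePoints` and `truncateByCodePoints` are hypothetical helper names, and the sketch assumes, as in the triage summary, that the database limit of 512 is counted in code points.

```typescript
// Hypothetical helpers for illustration only; not Misskey's actual truncate.ts.

/** Counts Unicode code points. Array.from iterates a string by code point, not by UTF-16 code unit. */
function countCodePoints(input: string): number {
  return Array.from(input).length;
}

/** Truncates so that the result contains at most maxCodePoints code points. */
function truncateByCodePoints(input: string, maxCodePoints: number): string {
  const codePoints = Array.from(input);
  if (codePoints.length <= maxCodePoints) return input;
  return codePoints.slice(0, maxCodePoints).join('');
}

// Transgender flag emoji: a single grapheme cluster built from 5 code points
// (U+1F3F3, U+FE0F, U+200D, U+26A7, U+FE0F).
const transFlag = '\u{1F3F3}\uFE0F\u200D\u26A7\uFE0F';

// 512 grapheme clusters, but 512 * 5 = 2560 code points.
const altText = transFlag.repeat(512);

// Assumption: the column limit (512) is measured in code points.
const truncated = truncateByCodePoints(altText, 512);

console.log(countCodePoints(transFlag)); // 5
console.log(countCodePoints(altText));   // 2560
console.log(countCodePoints(truncated)); // 512
```

One caveat with this approach is that a cut made purely by code point can land in the middle of a multi-code-point emoji and leave a partial sequence at the end of the text. A refinement could cut at the last grapheme boundary that still fits within the code point budget.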