
[Bug]: Orphaned Telegram channel messages persist despite successful check --clean

Open xd003 opened this issue 7 months ago • 16 comments

Describe the bug

After restoring a month-old Postgres database for Teldrive, I noticed several orphaned files still present in the associated Telegram channel. To clean these up, I ran the check command using:

docker compose run --rm --entrypoint /teldrive teldrive check --clean --config /config.toml

This command completed successfully without any errors in the terminal. However, the orphaned files in the Telegram channel were not deleted. While reviewing the Teldrive logs, I occasionally encountered the following error:

failed to delete messages  {"error": "callback: rpcDoRequest: rpc error code 400: CHANNEL_INVALID", ...}

Interestingly, uploading and then deleting a new file through the Teldrive WebUI works perfectly: the file gets deleted from both Teldrive and the Telegram channel. This suggests the issue only affects older/orphaned files after database restoration.

Reproduction

  1. Restore an older Postgres backup for Teldrive (in this case, one month old).
  2. Observe that several orphaned files still exist in the linked Telegram channel.
  3. Run the following cleanup command from the directory containing the Docker Compose file:
docker compose run --rm --entrypoint /teldrive teldrive check --clean --config /config.toml
  4. Wait for the process to complete (no error output is shown).
  5. Check the Telegram channel; the orphaned files remain undeleted.
  6. Review the Teldrive logs; they intermittently show a CHANNEL_INVALID error during deletion attempts.

Expected behavior

The teldrive check --clean command should identify and delete orphaned files from the Telegram channel, especially after restoring an older database. All messages/files no longer tracked in the database should be properly removed from the channel. The CHANNEL_INVALID error should not occur during this cleanup if the channel is configured correctly and accessible.

Version

v1.6.17

Which Platform are you using?

docker (Linux)

xd003 avatar Jun 12 '25 12:06 xd003

While reviewing the Teldrive logs, I occasionally encountered the following error:

Didn't you get that resolved in #476?

It's of course possible that there's a bug somewhere, but you should start by checking whether these files actually still exist in your database or not.

  1. go to web.telegram.org and browse to the file(s) in question
  2. right click and copy the message url
  3. extract the message id, which is the last part of the url: e.g. https://t.me/c/1111111111/222222 --> 222222
  4. launch psql and check each file with
    SELECT *
    FROM teldrive.files
    WHERE parts @> '[{"id":222222}]';
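
If you have more than a couple of files, you could script steps 3 and 4. This is only a rough sketch (it assumes psycopg2 is installed; the connection string is a placeholder, adjust it to your setup):

    # Rough sketch for checking several message URLs in one go.
    # Assumes psycopg2 is installed; the DSN below is a placeholder.
    import json
    import psycopg2

    DSN = "postgresql://user:password@localhost:5432/teldrive"  # placeholder
    urls = [
        "https://t.me/c/1111111111/222222",  # paste the copied message URLs here
    ]

    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        for url in urls:
            msg_id = int(url.rstrip("/").rsplit("/", 1)[-1])  # last path segment = message id
            cur.execute(
                "SELECT id, name FROM teldrive.files WHERE parts @> %s::jsonb",
                (json.dumps([{"id": msg_id}]),),
            )
            rows = cur.fetchall()
            print(msg_id, "tracked" if rows else "ORPHAN", rows)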
    

iwconfig avatar Jun 12 '25 13:06 iwconfig

Didn't you get that resolved in #476?

This error is not recurring like that /api/events one, which polled the API every minute, although I am still noticing it every now and then.

psql -U xd003 -d teldrive -c "SELECT * FROM teldrive.files WHERE parts @> '[{\"id\":328472}]';"

 name | type | mime_type | size | user_id | status | channel_id | parts | created_at | updated_at | encrypted | category | id | parent_id
------+------+-----------+------+---------+--------+------------+-------+------------+------------+-----------+----------+----+-----------
(0 rows)

I verified the orphan file using the SQL command above. Each of my database dumps includes a timestamp in the filename, and in the Telegram channel, I have files dating up to two weeks after the date of the currently restored dump. These newer files are not present in the restored database, clearly marking them as orphans. Moreover, they do not show up in the Teldrive WebUI search, further confirming this. The WebUI only displays files up to the exact date when the current database dump was taken.

xd003 avatar Jun 12 '25 14:06 xd003

This error is not recurring like that /api/events

Yeah, that's not the issue I'm talking about. You brought up this error there as well, and I got the impression that it had been resolved along with the /api/events request.

I verified the orphan file

Sounds like a bug then. Have you uploaded files to more than one channel?

iwconfig avatar Jun 12 '25 15:06 iwconfig

Sounds like a bug then. Have you uploaded files to more than one channel?

Yes, there are 2 channels at the moment

xd003 avatar Jun 12 '25 15:06 xd003

~~Try teldrive check --clean with this image: ghcr.io/iwconfig/teldrive:1.6.17-checktest~~

Yeah @divyam234, I've done some testing and it seems cp.orphanMessages or msgMap doesn't get populated at all.

https://github.com/tgdrive/teldrive/blob/7b902778af78953b9a22b01580dcbdbf4c2e6c91/cmd/check.go#L241-L246

I think? The log doesn't tell me much.
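
To illustrate what I mean: conceptually, the orphan side of the check should end up as the set of channel message ids minus the part ids referenced in teldrive.files. A toy sketch (Python here purely for illustration, not the actual Go from check.go):

    # Toy illustration only, not the actual teldrive implementation.
    # channel_message_ids: ids of every message currently in the Telegram channel
    # db_part_ids: every part id referenced by rows in teldrive.files for that channel
    channel_message_ids = {222210, 222211, 222222, 222223}
    db_part_ids = {222210, 222211}

    # Orphans are messages that exist in the channel but are not referenced in the DB.
    orphan_message_ids = channel_message_ids - db_part_ids
    print(sorted(orphan_message_ids))  # -> [222222, 222223], what --clean should delete

    # If the channel side (msgMap / cp.orphanMessages) never gets populated,
    # the difference stays empty and no deletions are attempted.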

iwconfig avatar Jun 12 '25 17:06 iwconfig

The last time I used this script (https://gist.github.com/danieleg1/1427c2a535b4dd4a6775e0fa82431400), it successfully cleaned up orphaned files from my Telegram channel. However, I'm a bit hesitant to use it now, as it may not have been updated to remain compatible with recent upstream changes in Teldrive.

xd003 avatar Jun 14 '25 12:06 xd003

I've skimmed through the script, and cannot find anything obvious that would indicate incompatibility. In any case, you're already making backups. Dump the db before trying the script.

I don't think my Docker image actually fixes your issue, but you could try it anyway. It might fix the CHANNEL_INVALID error, though I'm not sure; I don't know Go well enough.

iwconfig avatar Jun 14 '25 13:06 iwconfig

I've skimmed through the script, and cannot find anything obvious that would indicate incompatibility. In any case, you're already making backups. Dump the db before trying the script.

Technically, I could just use a simple 10-line Telethon script to delete all messages in the channel up to the message_id corresponding to when my database dump was restored. But I was hoping that if Teldrive eventually fixes this issue, I'd have the opportunity to properly test it. Right now, I'm not entirely sure if this issue can be reliably reproduced just by manually uploading a file to the channel (thereby making it an orphan) and then running the teldrive check --clean command to see if it gets cleaned up. Logically, it isn't any different from my case, though.
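
For reference, a rough sketch of what such a Telethon script could look like (untested; the API credentials, channel id, and cutoff message id are placeholders, and it assumes the orphans are exactly the messages newer than the last id present in the restored dump, as described earlier):

    # Rough, untested sketch of the Telethon approach described above.
    # API_ID, API_HASH, CHANNEL and CUTOFF_ID are placeholders.
    from telethon.sync import TelegramClient

    API_ID = 123456             # placeholder
    API_HASH = "your_api_hash"  # placeholder
    CHANNEL = -1001111111111    # channel id (or the channel's @username), placeholder
    CUTOFF_ID = 222222          # last message id still tracked in the restored dump

    with TelegramClient("cleanup", API_ID, API_HASH) as client:
        # Collect ids of every message newer than the cutoff, i.e. the orphans.
        orphan_ids = [m.id for m in client.iter_messages(CHANNEL, min_id=CUTOFF_ID)]
        # Delete in chunks of 100 to stay within Telegram's per-request limit.
        for i in range(0, len(orphan_ids), 100):
            client.delete_messages(CHANNEL, orphan_ids[i:i + 100])
        print(f"deleted {len(orphan_ids)} orphaned messages")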

I will test it out on a separate/fresh teldrive instance and report back

xd003 avatar Jun 14 '25 13:06 xd003

The invalid channel error is a separate issue. My image might fix that, although I am not sure, as I am not able to reproduce that specific error.

The issue you're describing in the title of this issue, that orphaned Telegram channel messages persist, is reproducible: I just tested it myself, and check --clean does in fact not remove orphans for me either.

  • Upload some file
  • Remove the record in teldrive.files table:
    delete from teldrive.files where id = 'some-uuid';
    
  • Observe that the file is not visible in Teldrive UI
  • Run teldrive check --clean
  • Observe that nothing happens
    • No log message
    • The message/file still exists in the Telegram channel
    • No exported results

Doing the opposite:

  • Upload some file
  • Remove the file in the telegram channel
  • Observe that the file is visible in Teldrive UI
  • Run teldrive check --clean
  • Observe that something happens
    • select * from teldrive.files; shows that the status of the file has changed to pending_deletion
      • And the cron job is expected to handle it from here on
    • Results are exported to results.json
      • A side note: for some reason the JSON contains duplicate entries for the removed file:
        [
          {
            "channel_id": 1234567890,
            "timestamp": "2025-06-14T16:00:19+02:00",
            "file_count": 2,
            "files": [
              {
                "id": "01976ebc-a861-70cc-8b3d-b7adc75954e7",
                "name": "somefile"
              },
              {
                "id": "01976ebc-a861-70cc-8b3d-b7adc75954e7",
                "name": "somefile"
              }
            ]
          }
        ]
        

Technically, I could just use a simple 10-line Telethon script to delete all messages in the channel up to the message_id corresponding to when my database dump was restored. But I was hoping that if Teldrive eventually fixes this issue, I'd have the opportunity to properly test it.

Sure, absolutely, but I mean, if that's the case, then your https://github.com/tgdrive/teldrive/issues/483#issuecomment-2972730870 is irrelevant here.

iwconfig avatar Jun 14 '25 14:06 iwconfig

Thanks for the comprehensive testing, hopefully it will help when this thread gets noticed.

Sure, absolutely, but I mean, if that's the case, then your #483 (comment) is irrelevant here.

Haha, true! It didn’t occur to me at the time that I could simply skip Teldrive’s orphan check altogether and delete the files manually, especially since they’re all in a continuous sequence.

xd003 avatar Jun 14 '25 16:06 xd003

It didn’t occur to me at the time that I could simply skip Teldrive’s orphan check altogether and delete the files manually,

Yes, but that's not what I meant. Of course you can do that, but if you want to have this bug in Teldrive fixed before doing anything (which you said you were hoping for), then the script you linked to should of course not be used. It is thus irrelevant unless it is determined that this bug will never be fixed, and I doubt it will come to that.

I mean, you give off mixed signals, because initially you did imply that you wanted to use the script by saying

However, I'm a bit hesitant to use it now, as it may not have been updated to remain compatible with recent upstream changes in Teldrive.

and I only say this because you countered my comment about using the script even though you were the one who brought it up. That's all.

Anyway, yeah, I'll perhaps look more into what the issue might be later, if I have the time and feel up to it.

iwconfig avatar Jun 14 '25 18:06 iwconfig

Hello @divyam234, any chance this issue could be looked into? It has been confirmed by multiple users, including myself.

xd003 avatar Jun 21 '25 10:06 xd003

I just wanted to report that in the latest version of Teldrive (v1.6.18), I'm seeing the following logs:

2025/06/22 20:44:01 goose: no migrations to run. current version: 20250408170839
22/06/2025 08:44 PM	INFO	[DB] github.com/tgdrive/teldrive/pkg/cron/cron.go:47
[17.584ms] [rows:1] SELECT count(*) FROM information_schema.tables WHERE table_schema = 'teldrive' AND table_name = 'cron_job_locks' AND table_type = 'BASE TABLE'
22/06/2025 08:44 PM	INFO	[DB] github.com/tgdrive/teldrive/pkg/cron/cron.go:47
[19.583ms] [rows:0] CREATE TABLE "teldrive"."cron_job_locks" ("id" bigserial,"created_at" timestamptz,"updated_at" timestamptz,"job_name" text,"job_identifier" text,"worker" text NOT NULL,"status" text NOT NULL,PRIMARY KEY ("id"))
22/06/2025 08:44 PM	INFO	[DB] github.com/tgdrive/teldrive/pkg/cron/cron.go:47
[2.594ms] [rows:0] CREATE UNIQUE INDEX IF NOT EXISTS "idx_name" ON "teldrive"."cron_job_locks" ("job_name","job_identifier")
22/06/2025 08:44 PM	INFO	Server started at http://localhost:8080
22/06/2025 08:44 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[10.483ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:44:06.082' and status = 'FINISHED'
22/06/2025 08:44 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[4.399ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:44:11.086' and status = 'FINISHED'
22/06/2025 08:44 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[1.166ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:44:16.086' and status = 'FINISHED'
22/06/2025 08:44 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[0.960ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:44:21.086' and status = 'FINISHED'
22/06/2025 08:44 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[0.943ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:44:26.086' and status = 'FINISHED'
22/06/2025 08:44 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[0.889ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:44:31.086' and status = 'FINISHED'
22/06/2025 08:44 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[1.266ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:44:36.084' and status = 'FINISHED'
22/06/2025 08:44 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[1.014ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:44:41.087' and status = 'FINISHED'
22/06/2025 08:50 PM	INFO	[DB] github.com/go-co-op/gocron-gorm-lock/[email protected]/gorm_lock.go:81
[0.881ms] [rows:0] DELETE FROM "teldrive"."cron_job_locks" WHERE updated_at < '2025-06-21 20:50:41.082' and status = 'FINISHED'

This section of the log is being repeated every 5 seconds. I'm not sure if that's the intended behavior.

xd003 avatar Jun 22 '25 15:06 xd003

Not sure if this is a different issue in the context of the latest update, but I am still seeing these errors:

failed to delete messages	{"error": "callback: rpcDoRequest: rpc error code 400: CHANNEL_INVALID", "errorVerbose": "callback:\n    github.com/gotd/td/telegram.(*Client).Run.func3\n        github.com/gotd/[email protected]/telegram/connect.go:174\n  - rpcDoRequest:\n    github.com/gotd/td/mtproto.(*Conn).Invoke\n        github.com/gotd/[email protected]/mtproto/rpc.go:44\n  - rpc error code 400: CHANNEL_INVALID"}

and orphaned files are still not deleted from the Telegram channel.

xd003 avatar Jun 23 '25 05:06 xd003

@iwconfig Just for info, are you still able to reproduce this issue, or is it just me?

xd003 avatar Jun 24 '25 04:06 xd003

Sorry, I'm not; this is difficult to replicate, and I don't get the CHANNEL_INVALID error (which, as I've said, I believe to be a separate issue).

Quick test: no issue for me (see the output below), but I guess it may depend on the reason why these files are marked for cleanup.
root@dev:~/teldrive-testbench# docker compose run --rm --entrypoint /teldrive teldrive-server version
[+] Creating 1/0
 ✔ Container teldrive-db  Running                                                                                                                                                                             0.0s 
teldrive 1.6.18
- commit: 447a79c
- os/type: linux
- os/arch: amd64
- go/version: go1.24.4
root@dev:~/teldrive-testbench# touch results3.json && docker compose run --rm -v $PWD/results3.json:/results.json --entrypoint /teldrive teldrive-server check --clean
[+] Creating 1/0
 ✔ Container teldrive-db  Running                                                                                                                                                                             0.0s 

Channel redacted: No files found                                                             ... done! [55ms]
Channel redacted: No files found                                                             ... done! [62ms]
Channel redacted: No files found                                                             ... done! [63ms]
Channel redacted: No files found                                                             ... done! [1ms]
Channel redacted: No files found                                                             ... done! [0s]
Channel redacted: No files found                                                             ... done! [0s]
Channel redacted: Complete                                                                   ... done! [497ms]
24/06/2025 02:08 PM     INFO    Exported data to results.json
root@dev:~/teldrive-testbench# cat results3.json 
[
    {
        "channel_id": redacted,
        "timestamp": "2025-06-24T14:08:59+02:00",
        "file_count": 9,
        "files": [
            {
                "id": "01977a38-6c08-7ae1-807b-588548b86e95",
                "name": "test.txt"
            },
            {
                "id": "01979260-a5c8-72f1-bd5f-8024013403e4",
                "name": "test5.txt"
            },
            {
                "id": "01979270-ba9c-7d9d-8a68-3e15e13213c8",
                "name": "test11.txt"
            },
            {
                "id": "019793d2-ce1c-7d2f-8dda-847a480b567b",
                "name": "test12.txt"
            },
            {
                "id": "019793ff-1b65-7a83-aa34-f02bab258817",
                "name": "test13.txt"
            },
            {
                "id": "01979430-ae67-7b37-a9e4-6eb75ed70187",
                "name": "test14.txt"
            },
            {
                "id": "01979438-5ba2-744d-8278-bfa059b25f5a",
                "name": "test15.txt"
            },
            {
                "id": "01979456-e68c-795c-ad5c-8d21549135f3",
                "name": "test16.txt"
            },
            {
                "id": "01979458-b5bf-708b-bc32-5532577bf4c9",
                "name": "test17.txt"
            }
        ]
    }
]
root@dev:~/teldrive-testbench# awk -F\" '/^\s*clean-files-interval/{print $0}' teldrive/config.toml
clean-files-interval = "1m"
root@dev:~/teldrive-testbench# date
Tue Jun 24 03:30:21 PM CEST 2025
root@dev:~/teldrive-testbench# docker compose exec -T teldrive-db psql \                                                                                                                                           
  -d "$(awk -F\" '/^\s*data-source/{print $2}' teldrive/config.toml)" \                                                                                                                                            
  -c "select id,name,status from teldrive.files where id in ($(jq -r '.[].files|map(.id)|map(@sh)|join(",")' results3.json));"                                                                                     
                  id                  |    name    |      status                                                                                                                                                   
--------------------------------------+------------+------------------                                                                                                                                             
 01979270-ba9c-7d9d-8a68-3e15e13213c8 | test11.txt | pending_deletion                                                                                                                                              
 01979260-a5c8-72f1-bd5f-8024013403e4 | test5.txt  | pending_deletion                                                                                                                                              
 01977a38-6c08-7ae1-807b-588548b86e95 | test.txt   | pending_deletion                                                                                                                                              
 019793d2-ce1c-7d2f-8dda-847a480b567b | test12.txt | pending_deletion                                                                                                                                              
 019793ff-1b65-7a83-aa34-f02bab258817 | test13.txt | pending_deletion                                                                                                                                              
 01979456-e68c-795c-ad5c-8d21549135f3 | test16.txt | pending_deletion
 01979430-ae67-7b37-a9e4-6eb75ed70187 | test14.txt | pending_deletion
 01979458-b5bf-708b-bc32-5532577bf4c9 | test17.txt | pending_deletion
 01979438-5ba2-744d-8278-bfa059b25f5a | test15.txt | pending_deletion
(9 rows)

root@dev:~/teldrive-testbench# date
Tue Jun 24 03:31:47 PM CEST 2025
root@dev:~/teldrive-testbench# docker compose exec -T teldrive-db psql \
  -d "$(awk -F\" '/^\s*data-source/{print $2}' teldrive/config.toml)" \
  -c "select id,name,status from teldrive.files where id in ($(jq -r '.[].files|map(.id)|map(@sh)|join(",")' results.json));"
 id | name | status 
----+------+--------
(0 rows)

And the log doesn’t spew any errors.

@xd003 actually yes, sorry, this is still an issue:

  • Upload some file

  • Remove the record in teldrive.files table:

    delete from teldrive.files where id = 'some-uuid';
    
  • Observe that the file is not visible in Teldrive UI

  • Run teldrive check --clean

  • Observe that nothing happens

    • No log message
    • The message/file still exists in the Telegram channel
    • No exported results

But not the opposite, just like I observed here: https://github.com/tgdrive/teldrive/issues/483#issuecomment-2972786525

iwconfig avatar Jun 24 '25 13:06 iwconfig