telegram-json-backup

Messages from some users in group chats are filled with null values.

Open stefanmaric opened this issue 9 years ago • 3 comments

I haven't found what's different with these users, but their messages are filled with null values, for example:

{"mention": false, "flags": 0, "fwd_date": null, "src": null, "action": "ACTION_NONE", "text": null, "out": false, "dest": null, "reply_id": null, "unread": false, "fwd_src": null, "id": 351705, "media": null, "date": null, "service": false}

That corresponds to a regular text message: I verified that it appears in tg and displays correctly. (I located it by grepping for the next message in the group chat backup.)

The user that sent that message:

  • Is not deleted.
  • Hasn't been kicked from the group.
  • Hasn't left the group.
  • Has first/last name.
  • Has username.
  • Is NOT in my contacts (but messages from other users not in my contact list are exported correctly).

Any ideas on how to debug this?

stefanmaric avatar Sep 05 '15 19:09 stefanmaric

Interesting. I grepped for "date": null through a full backup of my conversations, and found that 115 of the 171183 messages (that's 0.07%) in my backup are messages like this. No wonder I hadn't noticed!
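The same check can be sketched in Python. This is a minimal sketch that assumes the backup is stored as one JSON message object per line, like the example message above; the function names `count_null_messages` and `null_share` are made up for illustration and are not part of telegram-json-backup.

```python
import json


def count_null_messages(lines):
    """Return (nulled, total) for an iterable of JSON lines.

    A message counts as "nulled" when it has a "date" key whose value is
    null, mirroring a grep for '"date": null'.
    Assumption: one JSON message object per line.
    """
    total = 0
    nulled = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        msg = json.loads(line)
        total += 1
        if "date" in msg and msg["date"] is None:
            nulled += 1
    return nulled, total


def null_share(nulled, total):
    """Percentage of nulled messages; 0.0 for an empty backup."""
    return 100.0 * nulled / total if total else 0.0
```

To run it on a real file, pass an open file handle (which yields lines) to `count_null_messages` and feed the two counts to `null_share`.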

So yeah I can reproduce this. I can run some experiments to see if I can discover some kind of pattern or cause. It may be something solvable but it could also be a bug in the Python binding and in that case there's not much to be done about it. Unfortunately tg#587 is a complicating factor for debugging this issue; touch the message object in a way it doesn't like and poof, signal received crash.

One question: is your observation that all messages from that user end up like this or is it just some of them?

By the way, I don't know if you're interested in this feature but just in case: today I created an experimental branch with support for saving media files in the backup process. It seems to work for me (although some details such as file extensions are off due to Python binding quirkiness) but needs some more functional and stability testing.

tvdstaaij avatar Sep 05 '15 20:09 tvdstaaij

Hi @tvdstaaij, it seems to be specific to some users, not to individual messages. I discovered it while calculating some stats for a group: when I looked at messages per user, I noticed some users were missing.

And indeed, these python bindings are really problematic; I have been trying to do some scripting with tg and it has been really frustrating.

About the experimental branch: I will give it a try and loop back. EDIT: see #3 and #4

stefanmaric avatar Sep 05 '15 23:09 stefanmaric

I did some digging and these are my findings:

  • It is possible for a user to have only a few of their messages missing.
  • There doesn't seem to be a clear pattern, although there are some tendencies. One that especially stands out in my case is bot replies that follow very quickly after a command. The few missing messages from real users tend to be from users I haven't had personal contact with and who either left the group or contributed only a few messages to it.
  • It's not a processing bug in my script; the messages are already filled with null values when they are delivered in the callback. Interestingly enough, contrary to "good" messages, debug dumping these "bad" message objects does not crash telegram-cli as a result of tg#587.
  • Subsequent runs give the same number of invalid messages, so the problem can't be worked around by combining the results of multiple backups.
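The last finding compares counts between runs. A stricter check is whether the same message ids are affected each time. Below is a minimal sketch of that comparison, assuming each run is saved as one JSON message object per line and that both runs contain the same messages; `null_message_ids` and `differing_null_ids` are hypothetical helper names, not part of the backup script.

```python
import json


def null_message_ids(lines):
    """Collect the ids of messages whose "date" field is null.

    Assumption: one JSON message object per line, each with an "id" key.
    """
    ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        msg = json.loads(line)
        if "date" in msg and msg["date"] is None:
            ids.add(msg["id"])
    return ids


def differing_null_ids(run_a, run_b):
    """Ids that are nulled in exactly one of the two runs.

    An empty result means both runs lost the same messages, so merging
    them would not recover anything.
    """
    return null_message_ids(run_a) ^ null_message_ids(run_b)
```

A non-empty result would point to messages that came through intact in at least one run and could be merged back in.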

Conclusion: there is probably something strange about these messages that tgl or the Python binding cannot handle well, and would have to be debugged on that level. I'll keep this issue open but besides trying random things in the hopes of finding a lucky workaround (and there's not even all that much more I can think of) there's not much to be done about this, I'm afraid.

tvdstaaij avatar Sep 11 '15 22:09 tvdstaaij