
Organize messages and attachments by date

Open fracai opened this issue 2 years ago • 3 comments

Copying all messages to single files in an output directory and all attachments to an output subdirectory can leave a very large number of files in just two directories. This can slow down directory listings, backups, and file loading.

Instead, messages could be broken up by date. Maybe something like: /YYYY/mm/dd/contact/YYYY-mm-dd_contact.format.

Attachments can be referenced by multiple chats, so they should probably keep the directory format they start with (/Attachments/xx/xx/uuid-ish/name.format). They could then be aliased or symlinked into the same directory as the chat file, perhaps named by the timestamp at which they were sent (/YYYY/mm/dd/contact/attachment/YYYY-mm-dd_HHMMSS.format).
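To make the proposed layout concrete, here is a rough Python sketch of how the two path templates above could be built, plus a helper that symlinks an attachment next to the chat file. This is only an illustration of the idea, not how imessage-exporter works: the function names, the `root` output directory, and the `fmt` extension argument are all hypothetical.

```python
from datetime import datetime
from pathlib import Path


def chat_file_path(root: Path, contact: str, day: datetime, fmt: str) -> Path:
    """Build <root>/YYYY/mm/dd/contact/YYYY-mm-dd_contact.format (hypothetical helper)."""
    date_dir = root / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}" / contact
    return date_dir / f"{day:%Y-%m-%d}_{contact}.{fmt}"


def attachment_link_path(root: Path, contact: str, sent: datetime, fmt: str) -> Path:
    """Build <root>/YYYY/mm/dd/contact/attachment/YYYY-mm-dd_HHMMSS.format (hypothetical helper)."""
    date_dir = root / f"{sent:%Y}" / f"{sent:%m}" / f"{sent:%d}" / contact
    return date_dir / "attachment" / f"{sent:%Y-%m-%d_%H%M%S}.{fmt}"


def link_attachment(original: Path, link: Path) -> None:
    """Symlink `link` to an attachment that stays in its original location."""
    link.parent.mkdir(parents=True, exist_ok=True)
    if not link.is_symlink():
        link.symlink_to(original)
```

Because the attachment itself never moves, several chats can each hold their own symlink to the same file without copying it.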

fracai, Jan 20 '23 04:01

I don't like tucking away the content like this; I like how right now the conversations show up in a nice list of files. I don't think that is going to change.

I don't think that attachments are duplicated or reused in the table. Even if they are, every time imessage-exporter sees a message with an attachment, it makes a copy (if enabled, of course). I don't think that is worth changing either, due to #61.

As you saw in the other tickets, I am already working on building a directory structure to shrink the size of the Attachments root.

ReagentX, Jan 20 '23 04:01

I found a handful of legitimately deduplicated attachments.

$ grep -r 'Library/Messages/Attachments' txt | sed 's/.*\.txt://' | wc -l
18480
$ grep -r 'Library/Messages/Attachments' txt | sed 's/.*\.txt://' | sort -u | wc -l
18457

It probably doesn't add up to much (only 23 of the 18480 references are repeats), but it at least confirms that it does happen.
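For anyone who wants to see which paths repeat rather than just how many, the same total-versus-unique comparison can be sketched in Python. This assumes the attachment paths have already been extracted into a list of strings (the job the grep and sed pipeline does above); the function names and the sample paths are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def reference_counts(paths: Iterable[str]) -> Tuple[int, int]:
    """Return (total references, unique references), like `wc -l` vs `sort -u | wc -l`."""
    items = list(paths)
    return len(items), len(set(items))


def repeated_references(paths: Iterable[str]) -> Dict[str, int]:
    """Return every path referenced more than once, mapped to its reference count."""
    counts = Counter(paths)
    return {path: n for path, n in counts.items() if n > 1}


# Hypothetical sample data, not taken from a real export.
sample = [
    "Attachments/aa/01/one/photo.jpg",
    "Attachments/bb/02/two/clip.mov",
    "Attachments/aa/01/one/photo.jpg",
]
total, unique = reference_counts(sample)
repeats = repeated_references(sample)
```

The difference between `total` and `unique` is the number of extra references, which is the same figure the two `wc -l` counts give when subtracted.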

fracai, Jan 20 '23 14:01

Yeah, it does look like it can happen:

SELECT attachment_id, COUNT(*) FROM message_attachment_join GROUP BY attachment_id HAVING COUNT(*) > 1;
[image]
SELECT * FROM message_attachment_join WHERE attachment_id = 60638;
[image]

I still don't think that this matters for imessage-exporter since we create a copy whenever we see an attachment. This is a cool finding though and definitely something to be aware of when using the messages database.

I only have those two samples to go by, but this appears to happen if you copy and paste from one conversation into the other.
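The first query above can be tried out safely with a small Python sketch that runs it against a throwaway in-memory SQLite table instead of a real messages database. The two-column schema here is a simplified stand-in for message_attachment_join, and the row values and the `shared_attachments` function name are invented for illustration.

```python
import sqlite3
from typing import List, Tuple


def shared_attachments(conn: sqlite3.Connection) -> List[Tuple[int, int]]:
    """Return (attachment_id, message count) for attachments joined to more than one message."""
    query = (
        "SELECT attachment_id, COUNT(*) FROM message_attachment_join "
        "GROUP BY attachment_id HAVING COUNT(*) > 1 ORDER BY attachment_id"
    )
    return conn.execute(query).fetchall()


# Simplified, hypothetical stand-in for the real join table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_attachment_join (message_id INTEGER, attachment_id INTEGER)"
)
conn.executemany(
    "INSERT INTO message_attachment_join VALUES (?, ?)",
    [(1, 100), (2, 101), (3, 101), (4, 102)],
)

shared = shared_attachments(conn)

# Look up the messages that point at one of the shared attachments.
messages = conn.execute(
    "SELECT message_id FROM message_attachment_join "
    "WHERE attachment_id = ? ORDER BY message_id",
    (101,),
).fetchall()
```

In this sample data attachment 101 is joined to messages 2 and 3, which is the shape a copy and paste between two conversations would be expected to leave behind if the observation above holds.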

ReagentX, Jan 20 '23 14:01