DiscordChatExporter
Render threads
Flavor
No response
Export format
No response
Details
Discord recently unveiled Threads (https://blog.discord.com/connect-the-conversation-with-threads-on-discord-3f5fa8b0f6b), and they are currently not exported when exporting a channel that has them.
Need to also figure out how to render them. Discord effectively considers them separate entities from the channel itself, but I don't think that would be very convenient in an export -- I think that threads should be part of the same file.
Yeah, this is an interesting issue. Threads essentially act as temporary channels, but as they're tied to a message in their parent channel, I agree that they should probably be part of the same file. Perhaps when we encounter a thread we could resolve all of its messages before proceeding and render them within an expandable/collapsible element beneath its parent message on the HTML export? If we can grab a full list of threads (including archived ones), we might be able to include a navigation bar somewhere, since threads share their ID with the message that started them. Similarly, we could introduce a "thread" field to messages within the JSON export and embed an object. There are definitely a lot of ways to approach this, though.
Also, would it be worth supporting the export of threads individually? I could see users wanting to download one thread conversation without exporting an entire chatlog. I haven't looked enough into the API yet to see if thread objects could easily substitute channels in DCE's export pipeline. Channels can also have enormous amounts of archived threads, so that may be a potential hurdle.
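To make the JSON idea a bit more concrete, here is a minimal sketch of what embedding a "thread" object into its parent message could look like. The field names inside the nested object (id, name, messages) and the helper attach_thread are hypothetical and only for illustration; this is not DCE's actual export schema.

```python
import json


def attach_thread(message: dict, thread: dict, thread_messages: list) -> dict:
    """Return a copy of `message` with a nested "thread" object (hypothetical layout)."""
    enriched = dict(message)  # shallow copy so the original message is left untouched
    enriched["thread"] = {
        "id": thread["id"],
        "name": thread["name"],
        "messages": thread_messages,
    }
    return enriched


# A thread shares its ID with the message that started it.
parent = {"id": "100", "content": "Shall we discuss this in a thread?"}
thread = {"id": "100", "name": "Design discussion"}
replies = [{"id": "101", "content": "Sure."}]

print(json.dumps(attach_thread(parent, thread, replies), indent=2))
```

Messages without a thread would simply omit the field, so existing consumers of the JSON export would not need to change.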
Yeah, that sounds reasonable to me. I dread the changes we need to do in HTML to support rendering messages within messages though 😬
I would personally say exporting threads separately is unnecessary, but I'm curious what others think.
I think it should act as a message attachment, and should be a separate file put in the same per-channel subdirectory used for media, or maybe a separate folder. Also, I think there should be a CLI option for whether to include them (maybe options like include threads = true/false/active/archived).
I think more options are definitely nice, as per @tntmod54321, but that definitely seems like more of an end goal thing than a first implementation.
Personally, I'd vote for a drop-down collapsible frame to start, then maybe work up to individual threads as files at some point. This does seem like an immensely complex issue though, and I definitely understand there needs to be a proper balance struck between simplicity and functionality.
I looked some more into the API, and it turns out that threads are represented as channels with a few extra fields. It looks like you can retrieve information from them using the same endpoints that you use to retrieve channels.
In fact, I just attempted to export a thread with the current version of the exporter. It actually succeeded, and the results (thread_export.zip) were rather interesting.
Some things of note:
- The thread's "category" is actually the parent channel.
- The message log starts with a special message with type 21 (THREAD_STARTER_MESSAGE) and no content. It references the parent message in the parent channel. Surprisingly, this renders quite nicely on the HTML already and breaks nothing.
- Thread names don't follow channel name restrictions. I don't know if DCE relies on those restrictions anywhere, but the above thread exported fine despite containing spaces so I would assume not.
- System thread messages such as "Changed the channel name." and "Removed a recipient." render like all other system messages without a hitch.
- Despite what's implied by the text in the above export, it seems that the user can export the thread (if it's public) without necessarily being a member. I removed the bot and the next export worked fine.
With this in mind, it may actually be immensely easier to render threads as if they were channels (with maybe a few tweaks) in their own file. Most of it is already taken care of thanks to their implementation in the Discord API. Also, any message which is the parent of a thread contains a field with a channel object representing the thread, so it would also probably be no issue to create something similar to a message attachment per @tntmod54321's suggestion.
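As a rough illustration of the points above, this is how a thread and its starter message could be recognised from raw API objects. The channel type numbers (10, 11, 12) and the parent_id field are taken from the Discord API documentation as I understand it; the helper names are made up for this sketch.

```python
from typing import Optional

# Channel types that Discord uses for threads (announcement, public, private).
THREAD_CHANNEL_TYPES = {10, 11, 12}

# Message type of the empty message that opens a thread's log.
THREAD_STARTER_MESSAGE = 21


def is_thread(channel: dict) -> bool:
    """True if the channel object represents a thread."""
    return channel.get("type") in THREAD_CHANNEL_TYPES


def is_thread_starter(message: dict) -> bool:
    """True if the message is the contentless starter message of a thread."""
    return message.get("type") == THREAD_STARTER_MESSAGE


def parent_channel_id(channel: dict) -> Optional[str]:
    """For a thread, parent_id points at the channel it was created in."""
    return channel.get("parent_id") if is_thread(channel) else None


thread = {"id": "200", "type": 11, "name": "thread with spaces", "parent_id": "50"}
text_channel = {"id": "50", "type": 0, "name": "general", "parent_id": "7"}

print(is_thread(thread), parent_channel_id(thread))
print(is_thread(text_channel), parent_channel_id(text_channel))
```

For a regular channel, parent_id is the category, which would explain why the exporter currently shows the parent channel as the thread's "category".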
Nice, thank you for looking into this.
I can see the following pros/cons of both approaches so far...
Render threads as separate files
+ Fewer changes needed in HTML export template
- Worse user experience having to navigate to a different file (even if a link is inside the export) to see the thread
- Probably not very nice to do with JSON format because you would expect that to be fully self-contained as far as data goes
Render threads inside
+ Better user experience across the board
- A lot of work on HTML side (to make it worse, we'll have to come up with something ourselves because Discord itself doesn't actually render the threads inline; maybe some minimal message format?)
- Things to consider, like how should embeds or attachments look (should they be visible at all?) etc.
- If the user doesn't want the thread to be exported with the channel (size concerns or whatnot), then we'd have to introduce an option and add branching logic (which is undesirable)
Note that if the user wants to export a thread as a separate channel, they would be able to do that still. We might just show threads in the list of channels (maybe expandable under channels themselves). Although their number might get really high over time because they don't disappear even if archived, so maybe that's not a great idea.
At least on CLI it should be fine to add a separate command to get list of threads. On GUI we might have to come up with something creative.
Worse user experience having to navigate to a different file (even if a link is inside the export) to see the thread
Some users might prefer this. There are benefits of reduced filesize and more organization. I do agree that it feels rather useless in the JSON export, however. I'd be in support of at least keeping this as an option, especially considering that it seems to be pretty simple to implement.
A lot of work on HTML side (to make it worse, we'll have to come up with something ourselves because Discord itself doesn't actually render the threads inline; maybe some minimal message format?)
Is there anything to stop us from rendering messages normally except embedded in a collapsible div? We could also cop out and inject an iframe.
We might just show threads in the list of channels (maybe expandable under channels themselves). Although their number might get really high over time because they don't disappear even if archived, so maybe that's not a great idea.
If I remember correctly, Discord exposes separate endpoints for active and archived threads. Maybe we can load the active ones by default and have some user interaction to open the archived ones.
One issue I see arising is the use of partitions, since threads and their parents have separate chat histories. What should happen if a thread starts within a date partition but contains messages that span for months? Or if a filesize partition is met while embedding a thread? Even if we store threads in external files, linking to them could be unhelpful if the thread itself is partitioned.
Is there anything to stop us from rendering messages normally except embedded in a collapsible div? We could also cop out and inject an iframe.
We do have some complicated styles there, so I'm not confident it will work out that easily. But it's worth trying. I would prefer that over iframe for sure.
If I remember correctly, Discord exposes separate endpoints for active and archived threads. Maybe we can load the active ones by default and have some user interaction to open the archived ones.
Could be, although arguably the archived ones are the ones that a user would most likely want to export.
We do have some complicated styles there, so I'm not confident it will work out that easily.
I just opened a random export and wrapped a bunch of messages within a div just to see how it'd look. As far as I can tell, no styling broke. Here's a peek:

Though not pictured, embeds and image attachments rendered fine.
Could be, although arguably the archived ones are the ones that a user would most likely want to export.
While I agree, archived threads are also the ones that will most likely be enormous in count. Guilds have limits on the number of total active threads, but there can be unlimited archived ones. Discord seems to provide pagination for archived threads, so we might have to implement that too if the number really gets excessive.
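For what it's worth, the pagination itself could be fairly small. Below is a sketch that assumes each page of archived threads comes back as an object with a "threads" list and a "has_more" flag, and that the next page is requested with a "before" cursor taken from the archive timestamp of the last thread received. That is my reading of the Discord docs, so treat the response shape as an assumption. fetch_page is a hypothetical callback standing in for the actual HTTP request, and fake_fetch just serves canned data.

```python
from typing import Callable, Iterator, Optional


def iter_archived_threads(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield archived threads page by page until the API reports no more."""
    before: Optional[str] = None
    while True:
        page = fetch_page(before)
        threads = page.get("threads", [])
        yield from threads
        # Stop when the page is empty or the API says there is nothing left.
        if not threads or not page.get("has_more", False):
            return
        # Assumed cursor: archive timestamp of the last thread on this page.
        before = threads[-1]["thread_metadata"]["archive_timestamp"]


def _thread(thread_id: str, archived_at: str) -> dict:
    return {"id": thread_id, "thread_metadata": {"archive_timestamp": archived_at}}


# Canned pages keyed by the cursor used to request them.
FAKE_PAGES = {
    None: {
        "threads": [
            _thread("3", "2021-09-03T00:00:00+00:00"),
            _thread("2", "2021-09-02T00:00:00+00:00"),
        ],
        "has_more": True,
    },
    "2021-09-02T00:00:00+00:00": {
        "threads": [_thread("1", "2021-09-01T00:00:00+00:00")],
        "has_more": False,
    },
}


def fake_fetch(before: Optional[str]) -> dict:
    return FAKE_PAGES[before]


print([t["id"] for t in iter_archived_threads(fake_fetch)])
```

Because it is a generator, the caller can stop early (for example after a fixed number of threads) without fetching every page.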
Nice! This looks promising.
While I agree, archived threads are also the ones that will most likely be enormous in count. Guilds have limits on the number of total active threads, but there can be unlimited archived ones. Discord seems to provide pagination for archived threads, so we might have to implement that too if the number really gets excessive.
I guess we need to descope it for now and focus on what's going to provide the most value. I think we should just try rendering threads inline for all formats and take it from there, depending on feedback. Let's not add any options for now. Exporting a specific thread only as a channel would then be possible through the CLI but not from GUI.
Render threads inside
+ Better user experience across the board
I disagree with this a lot unless there's some kind of table of contents at the top for helping to find where the threads are by letting you click on links to go to the threads. Otherwise they are just in different parts of a potentially huge message log that you got from exporting a single channel, and you can't find them unless you know what's in them enough to know what to search for or spend enough time searching, and it's even worse if the threads are collapsed by default. Even if you put the thread at the bottom of the channel's file, that still could be a mess to navigate if any of the channel's threads are too long. I think in order to provide a good user experience for threads they need to all be separate files or you need a table of contents type feature that gives you links at the top to click on to go to the thread you're looking for.
Interesting. Would you be exporting the channel specifically for the threads? Why would you need to find them? Note it would also be possible to export the threads separately if you don't want the channel where they were created.
Interesting. Would you be exporting the channel specifically for the threads? Why would you need to find them?
One of the principal advantages of threads, in my opinion, is their ability to organize the chat and make it easy to find/reference conversations. I too feel that's a bit lost without some sort of navigation function, so I'd be in favor of implementing something like @ureru's suggestion in the HTML.
An important technical limitation to consider is that we can only add a TOC at the end because the export process is streaming and we don't know ahead of time whether there will be threads or not.
Discord does expose endpoints for retrieving the lists of threads if we want to reference that. The only issue I could think of is those threads not appearing in the export due to message filtering, as we'd have no way to determine that before querying the message.
How would we expect threads with message filtering to work, anyway? Would the thread simply disappear if its parent message doesn't satisfy the filter (further complicated by orphaned threads)? What about the messages within it?
Discord does expose endpoints for retrieving the lists of threads if we want to reference that. The only issue I could think of is those threads not appearing in the export due to message filtering, as we'd have no way to determine that before querying the message.
Also date ranges would affect that.
How would we expect threads with message filtering to work, anyway? Would the thread simply disappear if its parent message doesn't satisfy the filter (further complicated by orphaned threads)? What about the messages within it?
Good question, I have no idea. My initial hunch is that thread messages should be filtered in the same way too.
Also date ranges would affect that.
Date ranges can be pretty easily filtered just from the ID of the thread. There's nothing about its parent message, though, so we'd have to make a separate request.
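To spell out the ID-based date check: Discord IDs are snowflakes, whose upper bits hold a millisecond timestamp relative to the Discord epoch (the first second of 2015). A minimal sketch follows; the function names are mine, and the sample ID is the one used in Discord's snowflake documentation.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Discord epoch: 2015-01-01T00:00:00Z, in milliseconds since the Unix epoch.
DISCORD_EPOCH_MS = 1420070400000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def snowflake_to_datetime(snowflake: int) -> datetime:
    """Extract the creation time encoded in the top bits of a snowflake ID."""
    ms = (snowflake >> 22) + DISCORD_EPOCH_MS
    return UNIX_EPOCH + timedelta(milliseconds=ms)


def thread_in_range(thread_id: int,
                    after: Optional[datetime] = None,
                    before: Optional[datetime] = None) -> bool:
    """Check a thread's ID timestamp against an optional date range."""
    created = snowflake_to_datetime(thread_id)
    if after is not None and created < after:
        return False
    if before is not None and created > before:
        return False
    return True


sample_id = 175928847299117063
print(snowflake_to_datetime(sample_id).isoformat())
print(thread_in_range(sample_id, after=datetime(2017, 1, 1, tzinfo=timezone.utc)))
```

This only tells us when the ID was minted. It says nothing about the content or author of the parent message, which is why other filters would still need the separate request.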
Good question, I have no idea. My initial hunch is that thread messages should be filtered in the same way too.
Yeah, I agree that it makes sense to filter the messages within the thread. The parent message seems a bit more concerning, though. If it gets filtered out, the entire thread would just be skipped. Even without the use of filters, a similar phenomenon would probably happen if the message is deleted and the thread is left "orphaned". I'm not sure this is ideal.
Also wanted to bring up the issue of partitioning again, where we would have to clear up what we want to happen if a new file starts in the middle of a thread. I think splitting up the thread is not ideal but our best option here.
I'm confused about why someone would say "Why would you need to find them?" about threads. You're likely exporting channels so you'll have logs you can navigate or that someone can navigate if they want. Threads are 'kind of channels' in their own right, especially because if they auto-archive then anyone can unarchive and continue them at will. Finding the threads might not be as useful as finding the channels in most cases, but it's still useful and it's something I can imagine many people wanting to be able to do.
Threads are 'kind of channels' in their own right
Honestly, I think this is why we're running into so many design complications. Threads really are just a channel, and that's exactly how Discord treats them (both within the API and the GUI). Though rendering them inline does make sense, DCE seems to be built to handle isolated channels right now as it lacks the functionality to link together multiple channel exports from the same server. If we want to do that with threads, it might require a lot of rethinking. We're essentially trying to pair two (or more) channels within a single export, which conflicts with how the program is set up and doesn't work well with features like partitioning because the chatlog is no longer chronological.
I'm not opposed to finding a way to inline threads, but it might be best to start by supporting threads as their own file exports first since we already have almost everything we need for that.
I fully agree with that. Long term we should offer both options, but since one requires a lot more work, it has to be pushed back. So if we find solutions for offering an index, for partitioning, and for rendering, then we should offer rendering them inline as well, which I would prefer for my purposes. But there are also other issues we have that are quite important, so as long as we have the option to render threads as separate channels, we have basic support, which should be fine for some time.
For what it's worth, I'm likely to export them as separate channels anyway. Even if you have an index at the start of the channel export, you still won't know what all the threads on the server are just by looking at your list of channel export files. Which I guess could be considered a separate suggestion if you guys want, but at that point... Well, it's up to you.
So to summarize the discussion a bit, here is the list of open design questions:
- Should the list of threads available on the server be visible in the GUI (probably under respective parent channels)? This is required if we want to allow exporting threads separately from their channels.
- Should we only show active threads? Or archived threads? Or both? Note that a server may potentially have thousands of threads and that count will only keep growing as the server ages (unlike with regular channels).
- What are other options available to show the list of threads that scales with a potentially infinite number of them?
- Should the threads be inlined in the channel export? This might be really difficult to do with non-HTML export formats.
- Should the export instead just display a thread marker that shows the last message/timestamp, similarly to Discord (but not include the actual thread messages)?
- Should then the actual threads be exported as sidecar files (similarly to how partitions work)?
- What to do about file name or name templates?
- How will this work with partitioning? Should the partitioning also apply to thread exports?
- Should any of this be a configurable option or can we reach a design that can satisfy everyone in 90% use cases?
So far I see this as the path of least resistance:
- Don't show the list of threads anywhere. Exporting individual threads will then only be possible through the CLI.
- Don't inline threads with the export. Instead export them as sidecar files. This makes even more sense for non-HTML exports because inlining threads there may prove complicated.
- Add an option whether the threads should be exported or not. Don't add any other configurable options.
- For file names, use the export file name and inject the thread name somewhere.
- Respect partitioning on sidecar files too.
As a result, the following use cases will be satisfied:
- Export an individual thread by ID -> use CLI's export -c THREAD_ID command, which already works today.
- Export a channel with threads -> use CLI or GUI and export a channel like you normally would.
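As a sketch of the file naming point above, the sidecar name could be derived from the channel export's file name by injecting a sanitised thread name and the thread ID before the extension. The exact pattern and the helper name are hypothetical, not something DCE does today.

```python
import os
import re


def thread_export_name(channel_export: str, thread_name: str, thread_id: str) -> str:
    """Build a sidecar file name next to the channel export (hypothetical pattern)."""
    root, ext = os.path.splitext(channel_export)
    # Thread names are free-form, so replace characters that are invalid in file names.
    safe_name = re.sub(r'[\\/:*?"<>|]', "_", thread_name).strip()
    return f"{root} [{safe_name} {thread_id}]{ext}"


print(thread_export_name("general.html", "Design: v2?", "123"))
```

Including the ID keeps names unique even when two threads in the same channel share a name.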
Good summary.
Should we only show active threads? Or archived threads? Or both? Note that a server may potentially have thousands of threads and that count will only keep growing as the server ages (unlike with regular channels).
If that's possible, I think adding two buttons (one to show active and one to show archived threads) would be a good option here. The number of threads that need to be shown at the same time could still be really high, but it's no different with channels right now.
Should any of this be a configurable option or can we reach a design that can satisfy everyone in 90% use cases?
Personally, I'm still more on the side of having both options. Exporting them as completely separate files, with just the first and last messages shown in the original export as in Discord, should be the option that is released first. Since rendering them completely inline seems to be way more complicated, that option should be ignored for now in my opinion. However, I think it should definitely be kept at the back of our minds, because in some situations that export option would be way better for a lot of reasons.
Should the export instead just display a thread marker that shows the last message/timestamp, similarly to Discord (but not include the actual thread messages)?
When threads are exported as separate files, I think that would be a necessary addition to normal channel exports.
Don't show the list of threads anywhere. Exporting individual threads will then only be possible through the CLI.
Wouldn't that limit the utility of the GUI massively? And what do you mean by "exporting individual threads will then only be possible through the CLI"? As far as I understood it, exporting threads won't be possible at all with the GUI if we don't show the list of threads anywhere.
Don't inline threads with the export. Instead export them as sidecar files. This makes even more sense for non-HTML exports because inlining threads there may prove complicated.
I do agree with that, but as I said, I'd keep inlining them with the exports in the back of our heads.
Add an option whether the threads should be exported or not. Don't add any other configurable options.
I also agree with that, as I think that's everything we can add to the CLI in regard to threads.
For file names, use the export file name and inject the thread name somewhere.
👍
Respect partitioning on sidecar files too.
👍
As a result, the following use cases will be satisfied:
I think that, in combination with the thread marker described above being added to exports of normal channels, this would satisfy everyone in regard to thread exports for now.
I think @Tyrrrz's solution is a good one. In my opinion, it'd be best to start with that and then work forward with any other requests from there. The one thing I would consider is displaying active threads in the GUI, since there is a limit on how many of those can exist in a guild at once.
Oh, there is a limit? I didn't know that. How big is it?
I don't actually know how big it is, because the documentation leaves that pretty vague. I'd imagine it's comparable to the channel limit, though it's a separate count. Here are relevant snippets:
Therefore guilds are capped at a certain number of active threads, and only active threads can be manipulated.
Threads do not count against the max-channels limit in a guild, but there will be a new limit on the maximum number of active threads in a guild.
(from https://discord.com/developers/docs/topics/threads#active-archived-threads)
Oh, there is a limit? I didn't know that. How big is it?
If my information is correct, it's 1000
In its latest release, when we export a channel, does it export all the threads in the channel, along with archived and active threads? If so, maybe adding an argument to save only the active threads would be nice.
In its latest release, when we export a channel, does it export all the threads in the channel, along with archived and active threads?
No, in the latest release you can only export threads individually in the CLI using their IDs.
I really hope this gets supported. I have a couple of servers that are about to be retired this winter, and most of them are chock full of threads.
I currently have no capacity/desire to implement this, so if this feature is to be added, it will have to be with community's help. Let me know if someone wants to brainstorm or discuss implementation details.
I've been asked for assistance to back up 7 Discord servers that will be retired in the near future. Of course, most of them make heavy use of threads in their discussions.
I'd be willing to spend some time looking into this and see what can be achieved with minimal changes of the current way exports are done. That being said, I just discovered DCE today and haven't had a look at the code base yet.
Any specific parts I should be looking into/at to get a good idea of what would be required to export those threads (most likely as separate files)?
@Ragnarok700 It might be worth looking into how IDs are gathered for channels, as inputting a thread ID into the CLI will properly download the thread (albeit with the channel it's under as its category). Hope this helps you!
Is there any pre-release version that allows exporting threads? Or is there maybe another project for archiving a Discord server, including threads? I have never looked for any other; this was my first find and I was very happy with it, until Discord introduced threads. In some servers I am in they are heavily used, but I cannot archive them.
Reading the comments above I learnt you can already export threads in the CLI version. To be explicit:
- Download this DiscordChatExporter package
- Follow the documentation. The documentation is really good! You will be happy after reading it :-)
- The relevant command is export -c THREAD_ID (see docs for full instructions)
This was already mentioned above.
If you need to get a list of threads to use with the above command, then:
- Use Chrome to load Discord in the browser
- Open DevTools (F12), and go to the Network tab
- Click on the "Threads" button in Discord in the browser
- Look in the DevTools and you will see a request to https://discord.com/api/v9/channels/...../threads/search?archived=true&sort_by=last_message_time&sort_order=desc&limit=25&offset=0
- The "response" or "preview" of this contains all the thread IDs you need.
Note: Other browsers apart from Chrome have the same DevTools functionality.
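If you save that response to a file or paste it into a script, pulling out the IDs can be automated. The sketch below assumes the response is a JSON object with a "threads" list whose entries have "id" and "name" fields. That matches what I would expect from this endpoint, but it is an assumption, and the sample response here is made up.

```python
import json


def thread_ids(response_text: str) -> list:
    """Collect thread IDs from a saved threads/search response (assumed shape)."""
    data = json.loads(response_text)
    return [thread["id"] for thread in data.get("threads", [])]


# Made-up sample in the assumed shape of the response.
sample_response = """
{
  "threads": [
    {"id": "111", "name": "first thread"},
    {"id": "222", "name": "second thread"}
  ],
  "has_more": false
}
"""

# Print one CLI command per thread, using the export -c form mentioned above.
for thread_id in thread_ids(sample_response):
    print(f"export -c {thread_id}")
```

The request in the URL above uses limit=25 and offset=0, so a channel with more threads than that would need the request repeated with a larger offset.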
Hopefully we can do that without the need to open consoles in the future.