DiscordChatExporter
Allow multiple output formats
Hey, I really like the HTML version for its ease of use, but the JSON version is better for processing.
Since scraping Discord might be a bad thing, I'd rather avoid running the export twice for everything.
If I specify -f json -f html, however, I get the error:
Target type is not enumerable and can't accept more than one value.
Another possible solution would be a command that generates an HTML file from the JSON export offline.
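To make the offline idea concrete, here is a minimal sketch in Python of turning a JSON export into a bare-bones HTML page without touching Discord again. It is not part of DiscordChatExporter. The field names (messages, timestamp, author.nickname, content) are taken from the jq one-liner later in this thread and are otherwise an assumption about the export layout; render_html and convert_file are hypothetical names.

```python
import html
import json
from pathlib import Path


def render_html(export: dict) -> str:
    """Render a parsed JSON export as a very plain HTML page."""
    rows = []
    for message in export.get("messages", []):
        # Assumed field layout; missing or null fields fall back to "".
        author = (message.get("author") or {}).get("nickname") or ""
        timestamp = message.get("timestamp") or ""
        content = message.get("content") or ""
        # Escape everything so message text cannot inject markup.
        rows.append(
            "<p><time>{}</time> <b>{}</b>: {}</p>".format(
                html.escape(str(timestamp)),
                html.escape(str(author)),
                html.escape(str(content)),
            )
        )
    return "<!DOCTYPE html>\n<html><body>\n" + "\n".join(rows) + "\n</body></html>\n"


def convert_file(json_path: str, html_path: str) -> None:
    """Read a JSON export from disk and write the HTML rendering."""
    export = json.loads(Path(json_path).read_text(encoding="utf-8"))
    Path(html_path).write_text(render_html(export), encoding="utf-8")
```

Something like convert_file("hello.json", "hello.html") would then run entirely on local files. The real HTML export is far richer (avatars, embeds, attachments), so this only shows the shape of the approach.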
If you could login via a "standard" bot token (not a selfbot), my Bash script might help you: https://github.com/Tyrrrz/DiscordChatExporter/issues/264
It parallelizes exporting a whole guild across all available cores to speed up the process. In this case, you may only need to export a channel (or several of them) in both JSON and HTML.
If this were to be implemented, how should we handle output file naming? Say we are doing an export for a single format. Using the CLI, we specify a full output path for the output file (the path to the directory plus a name for the file, including the file extension) using -o. Here's how we would do that:
-c <channel id> -t <token> -o C:\Users\Me\Desktop\hello.html -f HtmlDark
OK, looks good.
Now say we want to output to multiple formats, and I want to provide names for the files. It doesn't make sense for me to provide a file extension, since each file will have a different extension to match their format.
-c <channel id> -t <token> -o C:\Users\Me\Desktop\hello -f htmldark json plaintext
Here, I'd like to get hello.html, hello.json, and hello.txt, but, uh-oh, hello is going to be interpreted as a directory, not a filename. The program will compute default filenames for the files and then stick them into the hello directory inside the desktop.
What should we do about this?
Do we provide a separate option for directory path, -d (what happens if a full path is provided for -o, but it is incongruent/unreconcilable with the value for -d)?
Do we ask the user to stick a placeholder value in for the file extension (more complicated)?
Do we ask the user for 3 values for -o to match the 3 values given to -f (that's a lot of copy-pasting of the directory path!)?
I'd be happy to implement this feature but would like some guidance on the above. I am personally biased toward having the new option flag, -d.
I think C:\Users\Me\Desktop\hello could be interpreted as a file, and C:\Users\Me\Desktop\hello\ as a folder. This has an obvious disadvantage: Too little margin of error
Do we provide a separate option for directory path, -d (what happens if a full path is provided for -o, but it is incongruent/unreconcilable with the value for -d)?
-d could take priority over -o. So, if -d is specified, -o could just be the output filename(s), without extension.
I think C:\Users\Me\Desktop\hello could be interpreted as a file, and C:\Users\Me\Desktop\hello\ as a folder. This has an obvious disadvantage: Too little margin of error
This is something I did not consider. What I do like about it is that it keeps all the output info in one place (i.e. one option), but yeah its behavior wouldn't be obvious and the user could confuse it and mess up.
Having the separate option -d (or maybe -od, to keep it more related to -o) makes this option of specifying a directory more discoverable. It could still be potentially confusing to have the behavior of -o modified by another option.
I am not particularly biased to either option, but I think I'll go with the former (ending with \ to signal a directory). Maybe that will change as I am tinkering with it or more opinions on the subject are provided.
Now that I think about it again, I am not sure the \ approach is super sound. Unless we make this change only affect multiple-format exports, it would change how single-format exports work. To my understanding, the app currently interprets C:\Users\Me\Desktop\hello as a directory, not a hello file in Desktop. Would this be a bad breaking change?
edit:grammar
What I do like about it is that it keeps all the output info in one place (i.e. one option), but yeah its behavior wouldn't be obvious and the user could confuse it and mess up.
Exactly
It could still be potentially confusing to have the behavior of -o modified by another option.
Hmm, without -d the default folder could be the current working directory (i.e. ./ in Unix), and with -d, the specified folder.
EDIT: Full command would be something like this:
# Output = Files "/path/to/output/folder/Filename.json" and "/path/to/output/folder/Filename.html"
-c <channel id> -t <token> -d "/path/to/output/folder" -o "Filename" -f json -f html
# Output = Files "./Filename.json" and "./Filename.html"
-c <channel id> -t <token> -o "Filename" -f json -f html
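The two commands above could be resolved to file paths roughly as follows. This is only a sketch of the proposal in Python, not how DiscordChatExporter works: the -d option does not exist in the tool, resolve_outputs is a hypothetical helper, and the extension table is an assumption based on the hello.html, hello.json and hello.txt names mentioned earlier in the thread.

```python
from pathlib import Path

# Hypothetical mapping from format name to file extension (an assumption).
EXTENSIONS = {
    "json": ".json",
    "html": ".html",
    "htmldark": ".html",
    "plaintext": ".txt",
    "csv": ".csv",
}


def resolve_outputs(name, formats, directory=None):
    """Return one output path per format.

    name      -- value of -o, a file name without extension
    formats   -- values of -f
    directory -- value of the proposed -d, or None for the working directory
    """
    base = Path(directory) if directory is not None else Path(".")
    paths = [base / (name + EXTENSIONS[fmt.lower()]) for fmt in formats]
    # Two formats sharing an extension (e.g. html and htmldark) would
    # overwrite each other, so refuse that combination outright.
    if len(set(paths)) != len(paths):
        raise ValueError("two requested formats map to the same file name")
    return paths
```

With directory="/path/to/output/folder" this yields Filename.json and Filename.html inside that folder, and with no directory it yields the same names relative to the current working directory.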
Now that I think about it again, I am not sure the \ approach is super sound. Unless we make this change only affect multiple-format exports, it would change how single-format exports work. To my understanding, the app currently interprets C:\Users\Me\Desktop\hello as a directory, not a hello file in Desktop. Would this be a bad breaking change?
Yup, any change would break others' scripts... But that could be solved if they get notified.
Would this be a bad breaking change?
Yes, it would:
https://github.com/Tyrrrz/DiscordChatExporter/blob/09acfcff59b0b63481fcc8d47958fe8a9c175ae9/DiscordChatExporter.Domain/Exporting/ExportRequest.cs#L80-L85
I would suggest that (in case of multi-format export) the user needs to provide a directory to -o, and if the provided output is not a directory, then an error should be thrown.
If you are exporting multiple files, treating lack of extension in the path as a directory is reasonable, since it doesn't make sense to provide a file name in such a case.
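As a rough illustration of that rule, here is a Python sketch that treats a path with a trailing separator or without an extension as a directory, and rejects a file path when more than one format is requested. The function names are hypothetical, and the sketch looks only at the path string; it is an assumption that checking the string alone is enough, since a real implementation might also check whether the directory already exists on disk.

```python
from pathlib import PureWindowsPath


def looks_like_directory(output: str) -> bool:
    """Trailing separator or no file extension means 'directory'."""
    if output.endswith(("/", "\\")):
        return True
    # PureWindowsPath accepts both "\\" and "/" as separators, so the same
    # check works for the Windows and Unix style paths used in this thread.
    return PureWindowsPath(output).suffix == ""


def classify_output(output: str, formats: list) -> str:
    """Return 'directory' or 'file', enforcing the multi-format rule."""
    is_dir = looks_like_directory(output)
    if len(formats) > 1 and not is_dir:
        raise ValueError(
            "exporting to several formats requires -o to be a directory"
        )
    return "directory" if is_dir else "file"
```

Under these assumptions, C:\Users\Me\Desktop\hello with -f htmldark json is accepted as a directory, while C:\Users\Me\Desktop\hello.html with the same formats raises an error. Single-format exports keep their current meaning, which avoids the breaking change discussed above.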
I'm gonna bump this with a suggestion:
Instead of specifying it multiple times or having multiple args, they could be comma separated values.
So if you wanted csv and json, you'd do -f csv,json. If you wanted htmldark and json, you'd do -f htmldark,json.
You can also do -f csv json instead of -f csv -f json, but you can't do -f csv,json. This is how the underlying CLI library was designed.
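For readers unfamiliar with this style of option parsing, the same behavior can be reproduced with Python's argparse. This is only an analogy: argparse is not the CLI library behind DiscordChatExporter, and parse_formats is a hypothetical helper. The sketch shows that space-separated and repeated values both collect into one list, while a comma-separated value arrives as a single string.

```python
import argparse


def parse_formats(argv):
    """Collect every value passed to -f, however the flag is repeated."""
    parser = argparse.ArgumentParser(prog="export-sketch")
    parser.add_argument(
        "-f",
        "--format",
        dest="formats",
        action="extend",  # append the values of each -f to one list
        nargs="+",        # each -f takes one or more space-separated values
        default=[],
    )
    return parser.parse_args(argv).formats


print(parse_formats(["-f", "csv", "json"]))        # two values
print(parse_formats(["-f", "csv", "-f", "json"]))  # the same two values
print(parse_formats(["-f", "csv,json"]))           # one value: "csv,json"
```

Supporting -f csv,json would therefore need an explicit split on commas after parsing, which the library described above does not do.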
You can also do -f csv json instead of -f csv -f json, but you can't do -f csv,json. This is how the underlying CLI library was designed.
But will this make a separate request for csv and then json? Wouldn't that look suspicious to Discord?
Any progress on making multi-format export work in the GUI without re-requesting the data for each format?
But will this make separate request for csv then json?
The whole point is to avoid it.
But will this make separate request for csv then json?
The whole point is to avoid it.
I know. I was asking whether this has been done in the current version. If not, I was thinking of making my own parser for the JSON or HTML files. Someone has already made a script that parses the HTML into a greppable IRC-style log.
No, in the current version this is not supported.
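For anyone who wants to see what "fetch once, write many" could look like, here is a minimal Python sketch. It is not the tool's code: export_all, the writer functions and the flat message shape (author and content keys) are all hypothetical, and fetch stands in for whatever step actually talks to Discord.

```python
import csv
import io
import json


def to_json(messages):
    return json.dumps({"messages": messages}, indent=2)


def to_csv(messages):
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["author", "content"])
    for message in messages:
        writer.writerow([message["author"], message["content"]])
    return buffer.getvalue()


def to_plaintext(messages):
    return "\n".join(f'{m["author"]}: {m["content"]}' for m in messages)


# Hypothetical registry of format name -> writer function.
WRITERS = {"json": to_json, "csv": to_csv, "plaintext": to_plaintext}


def export_all(fetch, formats):
    """Call fetch() exactly once, then render every requested format."""
    messages = list(fetch())  # the only step that would hit the network
    return {fmt: WRITERS[fmt](messages) for fmt in formats}
```

The point of the design is that the number of requests depends only on the channel, not on how many formats are requested. It does hold all messages in memory at once, which may not suit very large channels.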
@Tyrrrz Putting up a $15 bounty to your buymeacoffee if this gets added; it would be invaluable for archival.
While not a solution to the issue, I thought I'd share this bash oneliner I made:
jq '.messages[] | .timestamp[11:16] + .author.nickname + (if .type != "Default" then " {" + .type + "}" else "" end) + ": " + .content + (if .attachments[0] then (.attachments[] | "\n" + .fileName + " > " + .url) else "" end)' *.json | sed 's/\\"/"/g' | sed 's/^"\(.....\)\(.*\)"$/[\x1b[36m\1\x1b[0m] \x1b[31m\2/g' | sed 's/\\n/\n /g' | sed 's/\(: \|{\)/\x1b[0m&/g' | sed 's/{.*\?}/\x1b[33m&\x1b[0m/g'
It takes a JSON format export and formats it to be readable in a terminal.
Closing as out of scope since the project no longer accepts new features