discord-urls-extractor
Rust program for extracting most URLs from Discord scrapes. Works with Discord History Tracker, discard2, and DiscordChatExporter.
Streaming the JSON array with Serde (https://serde.rs/stream-array.html) would require rewriting the JSON code for DCE to use Serde, but it would decrease RAM usage substantially.
These are mainly simple fixes, although the `filefailed` warning is one that I don't know how to fix. Maybe we change the variable directly in the match statement? Or maybe...
Example: `https://sh.rustup.rs](https://sh.rustup.rs/` (cf. https://transfer.archivete.am/CKtm5/discord-Anchor.bad-urls.txt). This is valid Markdown, but not valid Discord formatting. The regex needs to take this into account.
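One way to handle such matches, sketched here as a post-processing step rather than a regex change, is to split any candidate that contains the Markdown `](` boundary into its two URLs. The function name `split_markdown_link` is hypothetical and not part of the existing code, and the approach assumes that a literal `](` never appears inside a legitimate URL.

```rust
/// Hypothetical helper: splits a candidate that spans a Markdown link
/// boundary (`text](target`) into its separate URL parts.
/// Candidates that do not contain `](` are returned unchanged.
fn split_markdown_link(candidate: &str) -> Vec<&str> {
    if !candidate.contains("](") {
        return vec![candidate];
    }
    candidate
        .split("](")
        // Drop the Markdown delimiters that may surround either half.
        .map(|part| part.trim_start_matches('[').trim_end_matches(')'))
        // Keep only halves that still look like URLs (assumption: http/https only).
        .filter(|part| part.starts_with("http://") || part.starts_with("https://"))
        .collect()
}
```

Note that trimming a trailing `)` can shorten URLs that legitimately end in a parenthesis, so this is only applied to candidates that already contain `](`.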
Let's split this up a bit:
- [ ] Splashes
- [ ] Discovery splashes
- [ ] Widgets? (are they even available from the websockets?)
- [ ] Description...
Storing the extracted URLs in a `HashSet` will fix the 'duplicate URL' problem. https://doc.rust-lang.org/std/collections/hash_set/struct.HashSet.html
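A minimal sketch of `HashSet`-based deduplication, assuming the URLs are available as owned `String`s and that first-seen order should be preserved in the output. The function name `dedup_urls` is illustrative only and does not come from the existing code.

```rust
use std::collections::HashSet;

/// Hypothetical helper: returns the URLs in first-seen order
/// with exact duplicates removed.
fn dedup_urls<I>(urls: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for url in urls {
        // `insert` returns false when the value was already in the set.
        if seen.insert(url.clone()) {
            unique.push(url);
        }
    }
    unique
}
```

This compares URLs byte for byte, so two URLs that differ only in a trailing slash or query order still count as distinct.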
Right now, some extracted Discord CDN image URLs are rewritten to request the maximum size, 4096. There should be an option to output the original ones extracted (and an option...
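A rough sketch of what such an option could look like, assuming the size is carried in a `size` query parameter and that a boolean flag selects between the original URL and the maximum-size one. Both `with_size` and `output_url`, as well as the `keep_original` flag, are hypothetical names and do not reflect the program's current code or command-line interface.

```rust
/// Hypothetical helper: sets or replaces the `size` query parameter,
/// leaving any other query parameters in place.
fn with_size(url: &str, size: u32) -> String {
    let (base, query) = match url.split_once('?') {
        Some((b, q)) => (b, q),
        None => (url, ""),
    };
    let mut params: Vec<String> = query
        .split('&')
        // Drop empty segments and any existing `size` parameter.
        .filter(|p| !p.is_empty() && !p.starts_with("size="))
        .map(str::to_string)
        .collect();
    params.push(format!("size={}", size));
    format!("{}?{}", base, params.join("&"))
}

/// Hypothetical option handling: emit the URL as extracted,
/// or the variant that requests size 4096.
fn output_url(original: &str, keep_original: bool) -> String {
    if keep_original {
        original.to_string()
    } else {
        with_size(original, 4096)
    }
}
```

URL fragments are not handled here; a real implementation would probably use a proper URL parser instead of string splitting.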