Ben Muthalaly
I tried to find existing tools to extract these files, but haven't had success yet. > One simple solution we could do is run all the URLs found in...
I've checked quite a few websites for test cases and can't find any that link directly to 3D assets. I could be looking in the wrong places though. I'd appreciate...
@pirate pd3f seems like the most capable tool, but [the docs](https://pd3f.com/docs/pd3f/installation/) say that it takes 8GB. Just wanted to check if we're okay with a dependency that heavy before I...
@pirate Should the `media` extractor also handle `gallery-dl`? I'm not sure if the `media` extractor is just an alias for `youtube-dl` or if it's supposed to handle other visual media...
Ah cool! Should I still keep working on a `gallery-dl` branch based off `dev`, or should I wait until the plugin system has merged?
@pirate Couple of questions about the implementation for this:
- Should the majority of the logic be implemented in JS or Python? I've gotten something mostly working in JS, but...
I have a rough script for this working (just using `scihub.py` as a module and downloading a pdf with the url->doi fallback). I had to slightly modify `scihub.py` to get...
Here's what I have so far: https://gist.github.com/benmuth/b2c12cbb40ca4d8183c6f17f819e2f2d

@pirate Usage:
```
python scihub-extractor.py -d -o
```
or
```
python scihub-extractor.py -f -o
```
It should either
- download the paper directly...
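For illustration, the url->doi fallback mentioned above might look roughly like the following. This is a minimal sketch and not the code in the gist: the names `extract_doi`, `candidate_identifiers`, and `download_with_fallback` are hypothetical, the DOI regex is a loose assumption, and `fetch` stands in for whatever actually downloads the PDF (for example a wrapper around `scihub.py`).

```python
import re
from typing import Callable, List, Optional
from urllib.parse import unquote

# Loose DOI pattern (an assumption, not a full spec):
# "10.", 4-9 digits, "/", then a suffix that stops at whitespace, quotes, "?" or "#".
DOI_RE = re.compile(r'10\.\d{4,9}/[^\s"<>?#]+')


def extract_doi(url: str) -> Optional[str]:
    """Return the first DOI-looking substring in the URL, or None."""
    match = DOI_RE.search(unquote(url))
    return match.group(0).rstrip(".,;") if match else None


def candidate_identifiers(url: str) -> List[str]:
    """Identifiers to try in order: the URL itself, then any DOI found inside it."""
    candidates = [url]
    doi = extract_doi(url)
    if doi and doi != url:
        candidates.append(doi)
    return candidates


def download_with_fallback(
    url: str, fetch: Callable[[str], Optional[bytes]]
) -> Optional[bytes]:
    """Call the caller-supplied fetch() on each candidate until one returns data."""
    for identifier in candidate_identifiers(url):
        data = fetch(identifier)
        if data:
            return data
    return None
```

Keeping `fetch` as a parameter means the fallback order can be exercised without network access, and the downloader can be swapped out later.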
> @benmuth it might take me a month or more till I'm able to merge this, as I'm working on some paid ArchiveBox projects right now for a client that...
@pirate Yeah, I think that's a great idea, I'd be happy to try to work on this. I think a more comprehensive tool should definitely exist. Thanks for the overview,...