Better support for web scraping
Hey, first off, I have been having a lot of fun with this language. It makes me think about what I write, and I haven't had that feeling in years.
One shortcoming I have noticed is the lack of web-related tools. You do have a sockets library, but I didn't see an easy way to get the HTML of a website. It would be nice to have GET and POST built into the library, similar to the way Nim handles them. I honestly feel like the example provided was too much and should be abstracted away by a sigil built into min.
I am personally planning on hooking up nimquery by using a dynamic library. Basically, I want an easy way to get the full text of an HTML page.
Hello! Glad to hear you are enjoying min!
I see... perhaps a wrapper on Nim's httpclient library? I am always trying to keep min... well, minimal! But perhaps that could be a useful addition.
On the other hand, something like nimquery would be a bit too specialized I think, but it could make a great dynamic library of course!
I didn't quite understand what you mean with "the example provided was too much and should be abstracted away by a sigil built into min" -- are you referring to the example in the net module?
Yes, I was referring to the get request example with httpbin. I feel like get requests (which are always going to happen on port 80) should be a simple function that takes a URL and returns the string of the webpage.
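To make the request concrete, here is a minimal sketch of a "URL in, page text out" helper. It is written in Python rather than min, and the name `fetch_text` is hypothetical; it only illustrates the shape of the function being asked for, not how min or Nim's httpclient implements it.

```python
from urllib.request import urlopen


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL and return the response body as a string.

    Hypothetical helper for illustration only: one argument in (the URL),
    one string out (the page text), with all socket handling hidden.
    """
    with urlopen(url, timeout=timeout) as response:
        # Use the charset declared by the response, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_text("http://httpbin.org/get")` would return the response body as one string, which is the whole interface the comment above is asking for.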
For a scripting language, that would be handy for building terminal interfaces to websites that have no API.
So, do you think that a wrapper on Nim's httpclient library would add too much bloat to min?
I personally really want it in, but if you don't want it, then I understand.
Hello again,
I think it would be a useful addition to min! Now I only need to wait for a weekend to implement it ;-)
I'll keep this issue open and I'll give it a shot when I get a chance!
Okay. I'll also see if I can get it done before then.
Great! Of course pull requests are always welcome :-)
OK... I managed to put together a small http module. I still have to document it before I can release a new version of min, but it's there if you want to try it out!
Thanks for adding this! I haven't tried it out yet, but I read through the source and it appears to be exactly what I wanted.
Close this issue if you want, or keep it open as a reminder to document.
I'm leaving it open for now... I am nearly done with the docs and I also figured I'd add two more operators to start and stop a simple HTTP server -- it may be useful to have an easy way to code a simple API server in min for testing purposes maybe ;)
My preliminary test of wrapping asynchttpserver looks promising, so I'll hopefully release the whole thing today.
Reopening this, it seems a good idea after all now that min has become more of a "batteries included" language.
I know it has been... 6 years 🙈 but I started to think about an xml module with the following symbols:
- (s -> xnode) xparse
- (xnode sl -> s) xget
- (xnode -> (xnode)) xchildren
- (xnode -> dict) xattrs
- (xnode -> s) xtype
- (xnode sl sl -> xnode) xset
- (xnode sl -> xnode) xdelete
- (xnode xnode -> xnode) xpush
- (xnode -> xnode) xpop
- (dict (xnode) -> xnode) xentity
- (s -> xnode) xcomment
- (s -> xnode) xtext
- (s -> xnode) xcdata
- (d -> xnode) xentity
- (xnode -> s) xstring
- (xnode sl -> xnode) xquery
- (xnode sl -> (xnode)) xqueryall
Basically leveraging xmltree, parsexml, and nimquery under the hood. Shouldn't be too hard.
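For readers unfamiliar with this style of API, the sketch below shows roughly analogous operations using Python's standard `xml.etree.ElementTree`. It is not min code, and the mapping to the proposed symbols in the comments is an assumption made for illustration. Note also that ElementTree queries use a small XPath subset, whereas nimquery uses CSS selectors.

```python
import xml.etree.ElementTree as ET

DOC = '<ul id="links"><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>'

root = ET.fromstring(DOC)          # roughly "xparse": string -> node
attrs = dict(root.attrib)          # roughly "xattrs": node -> dictionary
children = list(root)              # roughly "xchildren": node -> list of nodes
first_link = root.find(".//a")     # roughly "xquery": first matching node
all_links = root.findall(".//a")   # roughly "xqueryall": every matching node

# Collect the link targets, a typical lightweight scraping step.
hrefs = [a.get("href") for a in all_links]

# Roughly "xstring": node -> string
serialized = ET.tostring(first_link, encoding="unicode")
```

Parse, inspect attributes and children, query, then serialize: those four steps cover most lightweight scraping tasks.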
Haha, it's fine. I never lost hope :laughing:
That API looks pretty solid to me, though I'd have to think about it more. I believe my original use case 5-6 years ago was to do some lightweight web scraping, and it seems like everything here would be enough to do just that.
...and it's finally done! As of v0.39.0, min now has a new xml module. In the end I implemented fewer methods, because things like adding children or attributes can be done with existing APIs: children are implemented as a quotation and attributes as a dictionary.