selectorgadget
just made some reference updates
i saw there wasn't much activity on this project for a little bit, and before i take it in a more specific and separate direction, i figured you guys might want to pull the commits from my master branch.
Thanks for working on this. I'll take a look at these. What is the new direction you're talking about?
templating for batched scraping and data transformation. sounds sketchy but isn't, just for b2b/research needs. workflow is like so: define the "equivalent" of a recordset in the dom, define its fields, associate each field with a db table and column, and then save the generated process into a separate system for batching across multiple pages on a domain.
it's basically a scrape generator for a layman user.
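For readers following along, the workflow described above can be sketched roughly as below. This is only an illustration under stated assumptions, not code from either repository: the names `FieldMapping`, `ScrapeTemplate`, `add_field`, `to_json`, and `insert_statements` are all hypothetical, as are the example domain, selectors, table, and columns. The sketch covers the template definition and the field-to-column mapping; the actual DOM extraction and the batching system are left out.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class FieldMapping:
    """One field of a record, tied to a db table and column (hypothetical shape)."""
    name: str       # label the user gives the field
    selector: str   # CSS selector, relative to one record element
    table: str      # destination table
    column: str     # destination column


@dataclass
class ScrapeTemplate:
    """The 'generated process' that would be saved and batched across a domain."""
    domain: str
    record_selector: str  # the "recordset" equivalent in the DOM
    fields: list = field(default_factory=list)

    def add_field(self, name, selector, table, column):
        self.fields.append(FieldMapping(name, selector, table, column))
        return self  # allow chaining

    def to_json(self):
        # Serialized form that a separate batching system could store.
        return json.dumps(asdict(self), indent=2)

    def insert_statements(self, record):
        """Turn one extracted record (dict of field name -> value) into
        parameterized INSERT statements, grouped by destination table."""
        by_table = {}
        for f in self.fields:
            if f.name in record:
                by_table.setdefault(f.table, []).append((f.column, record[f.name]))
        statements = []
        for table, pairs in by_table.items():
            columns = ", ".join(col for col, _ in pairs)
            placeholders = ", ".join("?" for _ in pairs)
            sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
            statements.append((sql, [value for _, value in pairs]))
        return statements


# Assumed example values, purely for illustration.
template = ScrapeTemplate("example.com", "div.product")
template.add_field("title", "h2", "products", "name")
template.add_field("price", ".price", "products", "price")
print(template.to_json())
```

Keeping the template as plain data (rather than code) is what would let a non-programmer build it by clicking in the page and let a separate system replay it across many pages.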
i did add refs to my github master for production (to take the step/need/load away from your hosted server for production), you'll want to update it to replace naterkane w/ iterationlabs
That's something that I've played with from time to time too. I also have a more advanced version of selectorgadget that I'll get online sometime soon.
hi @naterkane, @cantino, I've just discovered this great bookmarklet and am thinking of using it to create web page wrappers more easily. Very similar to what @naterkane implies in his comment above. I saw @cantino's advanced branch but haven't tested it yet.
So my question is:
- has this scrape generator idea been implemented somewhere by some of you?
- what's inside @cantino's advanced branch?
- what's the direction of selectorgadget?
Thanks.
The advanced branch has a wizard.js helper which could be used in a scraper generator. The current direction is slow. :smile: I'm hoping to release a Chrome extension sometime soon.
Thanks. I'll have a look at the advanced branch.
So, as it turns out, @cantino, you were also behind the Parsley language with @fizx ? I should have guessed! :) I started a "pure Python" (well, there's still libxml there) implementation of Parsley (https://github.com/redapple/parslepy), and the natural next step is wanting to create Parsley scripts from within the browser, and I immediately thought of SelectorGadget, then I saw you have also written JSONedit for building JSON objects... You had it all figured out, already ;) I definitely have to look into SelectorGadget in detail, creating groups of objects, scoping and other bucketed arrays etc. Cheers,
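To make the "Parsley scripts from within the browser" idea a bit more concrete, here is a minimal sketch of assembling a Parsley-style script from a record selector and a set of field selectors, the kind of output a SelectorGadget session might produce. It assumes the usual Parsley conventions as I understand them: a `name(selector)` key scopes a group, a one-element array means "repeat for every match", `.` means the element's own text, and `@href` an attribute. The helper `build_parselet` is hypothetical and is not part of Parsley, parslepy, or SelectorGadget.

```python
import json


def build_parselet(group_name, record_selector, fields):
    """Build a Parsley-style script as a plain dict (hypothetical helper).

    group_name      -- key under which the extracted records are collected
    record_selector -- selector matching each record element (the scope)
    fields          -- dict of field name -> selector relative to the record
    """
    scoped_key = f"{group_name}({record_selector})"
    # A one-element list wrapping an object asks for one object per match.
    return {scoped_key: [dict(fields)]}


# Assumed example: collect the text and target of every link on a page.
parselet = build_parselet("links", "a", {"text": ".", "href": "@href"})
print(json.dumps(parselet, indent=2))
```

The resulting JSON could then be handed to a Parsley implementation for the actual extraction; that step is not shown here.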
It's like a party here.
Haha, yea! Nice work on your Python parsley port!
On Sunday, June 23, 2013, Kyle Maxwell wrote:
It's like a party here.