Automated CSS selector translation to vector form

Open Immortalin opened this issue 10 years ago • 5 comments

Hi, is there a way to automatically translate standard CSS selectors to the vector form? I am using a browser plugin to generate them en masse as input for a scraper, and it is really tedious to translate them manually. Thanks in advance!

Immortalin, Jul 19 '15 05:07

Hi - I wrote this as a learning exercise; you're welcome to poke around at it and adapt it if you like: https://gist.github.com/blx/e6970121d629ca2dbd49

It's pretty rough and doesn't support child (>) combinators or comma-separated selector lists yet, but it works for :pseudo-classes and [attributes], plus the normal classes/IDs. Of course, you could probably use an actual CSS parser, but what would be the fun in that?

Example:

user=> (require '[eac])
nil
user=> eac/testcss
"p#x body.blue:not(.red) #lol:first-of-type input[type=text] span:nth-child(3n + 1 )"
user=> (eac/translate-css eac/testcss)
[:p#x [:body.blue (html/but :.red)] [:#lol html/first-of-type] [:input (attr= :type "text")] [:span (html/nth-child 3 1)]]
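
For readers who want to see the first stage of such a translation, here is a minimal sketch of splitting a selector string into its descendant steps. It is not taken from the gist; `split-steps` is a hypothetical name, and the approach (tracking bracket depth so that spaces inside `(...)` or `[...]` do not split a step) is just one assumed way to handle input like `nth-child(3n + 1 )`.

```clojure
;; Sketch only: `split-steps` is a hypothetical helper, not part of Enlive or the gist.
(defn split-steps
  "Splits a CSS selector on whitespace that is not inside () or []."
  [css]
  (loop [chars (seq css), depth 0, current "", steps []]
    (if-let [c (first chars)]
      (cond
        ;; Entering a parenthesised argument or an attribute block.
        (#{\( \[} c) (recur (rest chars) (inc depth) (str current c) steps)
        ;; Leaving one.
        (#{\) \]} c) (recur (rest chars) (dec depth) (str current c) steps)
        ;; Top-level whitespace ends the current step (if there is one).
        (and (zero? depth) (Character/isWhitespace c))
        (recur (rest chars) depth "" (if (seq current) (conj steps current) steps))
        ;; Any other character belongs to the current step.
        :else (recur (rest chars) depth (str current c) steps))
      ;; End of input: flush the last step.
      (if (seq current) (conj steps current) steps))))

(split-steps "p#x body.blue:not(.red) span:nth-child(3n + 1 )")
```

Each resulting string is one compound selector, so it can be translated on its own and the results collected into the outer vector.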

blx, Jul 23 '15 23:07

@blx is https://gist.github.com/Jared314/5028617 a complete one?

Immortalin, Jul 24 '15 01:07

@Immortalin That doesn't currently seem to support pseudoclasses or attributes:

user=> (gist/parse-css "div.blue:first-of-type input.red[type=text]")
[:div.blue :input.red]

but it looks like you could add formatters for those reasonably easily if necessary!
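
As a rough illustration of what such a formatter might look like, here is a minimal sketch that handles one compound selector at a time. It is not code from either gist: `translate-step` and `pseudo->enlive` are hypothetical names, only argument-less pseudo-classes and `[attr=value]` attributes are covered, and the result is returned as data (quoted symbols such as `html/attr=`, assuming `html` aliases `net.cgrand.enlive-html`) rather than a live Enlive selector.

```clojure
(require '[clojure.string :as str])

;; Hypothetical lookup from CSS pseudo-class names to Enlive predicate symbols.
(def pseudo->enlive
  {"first-of-type" 'html/first-of-type
   "last-of-type"  'html/last-of-type
   "first-child"   'html/first-child
   "last-child"    'html/last-child})

(defn translate-step
  "Translates one compound selector, e.g. \"input.red[type=text]\",
  into an Enlive-style form returned as data."
  [step]
  (let [;; Tag/class/id part: everything before the first '[' or ':'.
        [_ base] (re-find #"^([^\[:]*)" step)
        base-kw  (when (seq base) (keyword base))
        ;; [name=value] attributes, with optional double quotes around the value.
        attrs    (for [[_ k v] (re-seq #"\[([\w-]+)=\"?([^\"\]]*)\"?\]" step)]
                   (list 'html/attr= (keyword k) v))
        ;; Drop attribute blocks first so a ':' inside a value is not misread.
        no-attrs (str/replace step #"\[[^\]]*\]" "")
        pseudos  (for [[_ p] (re-seq #":([\w-]+)" no-attrs)]
                   (or (pseudo->enlive p)
                       (throw (ex-info "Unsupported pseudo-class" {:pseudo p}))))
        preds    (concat attrs pseudos)]
    (if (empty? preds)
      base-kw
      (into (if base-kw [base-kw] []) preds))))

(translate-step "div.blue:first-of-type")
(translate-step "input.red[type=text]")
```

Pseudo-classes that take arguments, such as `:not(...)` or `:nth-child(...)`, would need their own parsing and are deliberately rejected here.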

blx, Jul 24 '15 01:07

@blx can you give me an example? I am rather new to Clojure.

Immortalin, Jul 26 '15 07:07

@cgrand is it possible to expose a function/API that allows CSS selector strings to be used directly via Jsoup? Something like this. This would be extremely helpful, as there is currently no good automated solution for translating the CSS selectors generated by browser plugins such as Selector Gadget. It would also be more performant, since Jsoup can parse the CSS selector directly.

Immortalin, Sep 17 '15 14:09