clj-tagsoup
Add usage examples
Once the HTML is parsed, how can I most efficiently query the parsed document? That is, I would want to be able to drill down as if it were a map:
(get-in x [:html :head :title])
It would be great if you added some recommendations on how to do that transformation (for example, https://github.com/cjohansen/hiccup-find looks promising).
Ditto this. As a Clojure noob, this vector thing confuses the hell out of me.
Thanks for chiming in!
A quick-and-dirty solution could be something along the lines of (untested, might be buggy):
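    (defn get-in-html
      "Follows a path of tag keywords down a clj-tagsoup tree, taking the
      first matching child at each step. Returns nil if a tag is missing."
      [tree [tag & tags]]
      (if tag
        (when tree
          (recur (first (filter #(= (first %) tag) (rest tree))) tags))
        tree))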
Note that you'd want to call it as (get-in-html x [:head :title]), bypassing the :html, since the root of the parsed tree is already the :html element.
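To make the "vector thing" a bit more concrete, here is a small sketch of how the call might look. It assumes the parsed tree has the shape [tag attribute-map & children], and it uses a hand-written literal (the name `page` is made up for this example) instead of real parser output, so it runs without any dependencies. The function is repeated from above so the snippet is self-contained.

```clojure
;; Hypothetical hand-written tree, standing in for real parser output.
;; Assumed shape of every element: [tag attribute-map & children].
(def page
  [:html {}
   [:head {}
    [:title {} "Hello"]]
   [:body {}
    [:p {:class "x"} "one"]
    [:p {} "two"]]])

;; Same function as in the comment above.
(defn get-in-html [tree [tag & tags]]
  (if tag
    (when tree
      (recur (first (filter #(= (first %) tag) (rest tree))) tags))
    tree))

;; The root is already the :html element, so the path starts below it.
(get-in-html page [:head :title])           ; the whole :title element
(drop 2 (get-in-html page [:head :title]))  ; just its children
(get-in-html page [:body :p])               ; only the first :p
(get-in-html page [:head :meta])            ; nil, no such child
```

Note that the result is the whole element vector, not the text, so you still need to skip the tag and the attribute map to get at the children.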
This is very simplistic: it only supports a path of tags and follows just the first matching child at each level. If you want to extract arbitrary subtrees, you may want to take a look at Enlive. (When I have free time, I intend to explore the possibility of integrating clj-tagsoup and Enlive, as I feel that both projects might benefit from this.)
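If pulling in Enlive feels like too much for a simple case, a rough alternative is to walk the tree with `tree-seq` from clojure.core. This is not Enlive and has none of its selector features; it is only a sketch that collects every element with a given tag, wherever it sits. The names `find-all` and `page` are made up for this example, and it assumes every element has the shape [tag attribute-map & children].

```clojure
;; Hypothetical hand-written tree, standing in for real parser output.
(def page
  [:html {}
   [:head {}
    [:title {} "Hello"]]
   [:body {}
    [:p {:class "x"} "one"]
    [:p {} "two"]]])

(defn find-all
  "Returns a lazy seq of every element in tree whose tag equals tag,
  in document order. Assumes elements look like [tag attrs & children]."
  [tree tag]
  (filter #(and (vector? %) (= (first %) tag))
          ;; Vectors are elements; their children start after the tag
          ;; and the attribute map, hence (drop 2 ...).
          (tree-seq vector? #(drop 2 %) tree)))

(find-all page :p)      ; both :p elements
(find-all page :title)  ; a one-element seq
(find-all page :table)  ; empty seq
```

Unlike the path-based function, this returns all matches, not just the first one, and it does not care how deep they are nested.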