pagser
Pagser is a simple, extensible, configurable library that parses and deserializes HTML pages into structs, based on goquery and struct tags, for Go crawlers.
I think it could be simplified so that one could do:

```golang
type PageData struct {
	Link  string   `pagser:"a->attr(href)"`
	Links []string `pagser:"a->attr(href)"`
}
```

The package could inspect the type...
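The idea of inspecting the field type can be sketched with the standard `reflect` package alone. This is only an illustration of the proposal, not pagser's actual implementation: `FieldPlan` and `planFields` are hypothetical names, and splitting the tag on `->` is an assumption based on the tag shape shown in the struct.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// PageData mirrors the proposed struct: the same tag on a string
// field and on a []string field.
type PageData struct {
	Link  string   `pagser:"a->attr(href)"`
	Links []string `pagser:"a->attr(href)"`
}

// FieldPlan is a hypothetical description of how one field would be filled.
type FieldPlan struct {
	Name     string
	Selector string // part of the tag before "->"
	Func     string // part of the tag after "->", left unparsed
	Multiple bool   // true when the field is a slice
}

// planFields is a hypothetical helper (not part of pagser). It reads the
// `pagser` tag of every field and uses the field's kind to decide whether
// one match or all matches would be collected.
func planFields(v interface{}) []FieldPlan {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return nil
	}
	var plans []FieldPlan
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("pagser")
		if !ok {
			continue
		}
		selector, fn := tag, ""
		if idx := strings.Index(tag, "->"); idx >= 0 {
			selector = strings.TrimSpace(tag[:idx])
			fn = strings.TrimSpace(tag[idx+2:])
		}
		plans = append(plans, FieldPlan{
			Name:     f.Name,
			Selector: selector,
			Func:     fn,
			Multiple: f.Type.Kind() == reflect.Slice,
		})
	}
	return plans
}

func main() {
	for _, p := range planFields(PageData{}) {
		fmt.Printf("%s: selector=%q func=%q multiple=%v\n",
			p.Name, p.Selector, p.Func, p.Multiple)
	}
}
```

With a plan like this, a string field would take the first matching element and a slice field would take every match, so the same tag could serve both cases.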
What should be the selector for this kind of html code: ```html Firstname John Lastname Doe ``` I would like to fill this struct: ```golang type Person struct { Firstname...
```
type Item struct {
	Title       string `pagser:"td->eq(0)"`
	Image       string `pagser:"td a img->attr(src)"`
	Quote       string `pagser:"td->eq(3)"`
	Description string `pagser:"td->eq(4)"`
}
```

If I put `td->eq(2)` in the tag for `Image`,...
I have HTML like: ``` ... ... ... ... ... ... ``` It seems I am unable to parse this into a struct like: ``` struct { Rows []struct {...
Some meta attribute names can be capitalised, so the fields have to be duplicated as below:

```
Author1   []string `pagser:"meta[name='author']->attrSplit(content)"`
Author2   []string `pagser:"meta[name='Author']->attrSplit(content)"`
Desc1     []string `pagser:"meta[name='description']->attrSplit(content)"`
Desc2     []string `pagser:"meta[name='Description']->attrSplit(content)"`
Keywords1 []string `pagser:"meta[name='keywords']->attrSplit(content)"`
...
```
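One way to avoid the duplicated fields is to compare meta names case-insensitively after extraction. The sketch below uses only the standard library and assumes the `<meta>` tags have already been collected into name/content pairs. `Meta` and `metaValues` are hypothetical names, not pagser APIs, and splitting the content on commas is an assumption about what `attrSplit(content)` is used for here.

```go
package main

import (
	"fmt"
	"strings"
)

// Meta is a hypothetical name/content pair taken from one <meta> tag.
type Meta struct {
	Name    string
	Content string
}

// metaValues is a hypothetical helper. It returns the trimmed,
// comma-separated content values of every entry whose name matches
// the requested name, ignoring case.
func metaValues(metas []Meta, name string) []string {
	var out []string
	for _, m := range metas {
		if !strings.EqualFold(m.Name, name) {
			continue
		}
		for _, part := range strings.Split(m.Content, ",") {
			if s := strings.TrimSpace(part); s != "" {
				out = append(out, s)
			}
		}
	}
	return out
}

func main() {
	metas := []Meta{
		{Name: "Author", Content: "Ann, Bob"},
		{Name: "author", Content: "Cy"},
		{Name: "Keywords", Content: "go,html"},
	}
	fmt.Println(metaValues(metas, "author"))
	fmt.Println(metaValues(metas, "keywords"))
}
```

A single `Author` field could then be filled from `metaValues(metas, "author")` regardless of how the page capitalises the attribute.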
Currently it appears to me that each non-trivial case might need a custom function (or am I missing something?). I would have plenty of use cases for method chaining on built-in functions, e.g....