[html] Opening up an html table with interspersed headers

Open anjakefala opened this issue 2 years ago • 2 comments

<Notkea> hello, I'm trying to extract data from an HTML table which does not have a header row. I end up with a single column and many empty rows (containing NoneType objects). Any hint of how I could get the data in the cells? The document looks like this: vd "https://webshop.calestor-periway.fr/product/Moniteurs-TV/Moniteurs/Samsung/Samsung-C49J890DKR-cran-LED-incurv-49-?searchtrack=ProductList&prodid=1437755&info=2"

anjakefala • Apr 05 '22 19:04

--header 0 does not seem to help. Opening an issue to investigate for when we have more focused time. Question was originally asked on #visidata.

anjakefala • Apr 05 '22 19:04

The table is built from <tr> rows that alternately contain <th> and <td> tags. The html loader will have to do something different in this particular case.

saulpw • Jan 07 '23 21:01
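
To illustrate the structure described in the thread, here is a minimal sketch using only Python's standard-library html.parser. It is not VisiData's html loader and not a proposed fix. It assumes each <th> labels the <td> cells that follow it in the same row, which may not match the real page. The names CellCollector, pair_headers, and SAMPLE are hypothetical, and the sample markup is invented for illustration.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect every table row as a list of (tag, text) cells."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being read
        self._cell = None   # [tag, text parts] of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = [tag, []]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[1].append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            name, parts = self._cell
            self._row.append((name, "".join(parts).strip()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed


def pair_headers(rows):
    """Assumption: a <th> labels the <td> cells after it in the same row."""
    pairs = []
    for row in rows:
        key = None  # stays None for <td> cells that have no preceding <th>
        for tag, text in row:
            if tag == "th":
                key = text
            else:
                pairs.append((key, text))
    return pairs


# Invented sample markup with headers interspersed among the data cells.
SAMPLE = """
<table>
  <tr><th>Width</th><td>10</td><th>Height</th><td>20</td></tr>
  <tr><th>Depth</th><td>30</td></tr>
</table>
"""

parser = CellCollector()
parser.feed(SAMPLE)
print(pair_headers(parser.rows))
```

Keeping the raw (tag, text) cells separate from the pairing step means a different layout, for example whole rows of <th> alternating with whole rows of <td>, would only need a different pairing function.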