html-table-extractor issues

Fix for passing None as id skips tables that have id

2

Drop the support for python2

I have encountered multiple bugs due to supporting backward compatibility of python2, and it makes development extremely hard since I have to hack the code to make it work for...

yuanxu-li

Encoding

1

Gives an error stating: 'ascii' codec can't encode character u'\xe9'

MeenakshiChorge
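
The message in this report is what Python raises when text containing a non-ASCII character is encoded with the ASCII codec, something Python 2 does implicitly in several situations. The following is a minimal sketch in Python 3 syntax, not code from the library: it reproduces the failure for the reported character (U+00E9) and shows that encoding explicitly as UTF-8 avoids it. The sample string and variable names are illustrative assumptions.

```python
# Illustrative string containing the character from the report: U+00E9.
text = "caf\u00e9"

# Encoding to ASCII fails because U+00E9 is outside the 7-bit ASCII range.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly as UTF-8 succeeds and round-trips cleanly.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")
```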

Extract as a list of dictionaries

It would be a good feature to extract the rows as a dict, at least from simple tables. Example: | A | B | C | | 1 | 2...

lucasa
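
One way this request could be met outside the library is sketched below. It assumes the extracted table is available as a list of rows (a list of lists) whose first row is the header; `rows_to_dicts` is a hypothetical helper, not part of html-table-extractor, and the sample values are made up for illustration.

```python
def rows_to_dicts(table):
    """Treat the first row as the header and map each later row onto it."""
    if not table:
        return []
    header, *body = table
    # zip() pairs each header cell with the cell in the same column.
    return [dict(zip(header, row)) for row in body]


# Illustrative data in the shape assumed above.
table = [["A", "B", "C"], ["1", "2", "3"], ["4", "5", "6"]]
records = rows_to_dicts(table)
```

This only suits simple tables: rows shorter than the header lose the trailing keys, and duplicate header names overwrite each other.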

More granular parsing for some complex tables

5

I am not sure if this is really an issue with the parser but perhaps an improvement request unless a solution is available when using this parser. Consider a complex...

dr333

cell extraction

Your package works great, but I had to modify it slightly:

```
self._insert(row_ind, col_ind, row_span, col_span, self._transformer(cell.get_text()))
```

This is fine if the content is text but if it contains...

hampsterx

cell formatting

Hi, it would be nice if you supported the parameters of the BeautifulSoup get_text method, namely 'separator' and 'strip'. See the BS docs here: https://beautiful-soup-4.readthedocs.io/en/latest/index.html?highlight=get_text#get-text These could be added...

samadhicsec
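
To make the effect of those two parameters concrete, here is a standard-library-only sketch that mimics them: text nodes are collected from a cell, optionally stripped (dropping any that become empty), and joined with a separator. It is not the library's code and does not call BeautifulSoup; `CellText` and `cell_text` are hypothetical names, and the sample cell is made up.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text nodes found in one cell's HTML."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def cell_text(html, separator="", strip=False):
    """Join a cell's text nodes, optionally stripping each one first."""
    parser = CellText()
    parser.feed(html)
    parser.close()
    chunks = parser.chunks
    if strip:
        # Strip every text node and drop those that end up empty.
        chunks = [c.strip() for c in chunks]
        chunks = [c for c in chunks if c]
    return separator.join(chunks)


cell = "<td> Line one<br/> <b>Line two</b> </td>"
plain = cell_text(cell)
formatted = cell_text(cell, separator=" | ", strip=True)
```

With the defaults the text nodes are run together with their original whitespace; with a separator and stripping enabled, the line break inside the cell survives as a visible delimiter.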
