LLPhant

Support HTML Parsing

Open amenk opened this issue 1 year ago • 5 comments

This adds support for HTML parsing.

amenk avatar Jun 14 '24 15:06 amenk

Great idea @amenk

MaximeThoonsen avatar Jun 14 '24 17:06 MaximeThoonsen

@amenk You can try to fix formatting problems with: composer fix-lint. Also try:

composer refactor && composer test

f-lombardo avatar Jun 29 '24 14:06 f-lombardo

@f-lombardo Thanks. I'm not working on this at the moment.

I will hopefully pick it up later.

amenk avatar Jul 17 '24 11:07 amenk

@amenk @f-lombardo What about just using the WebPageTextGetter class in Tool? Getting the text from the HTML should be enough, no? (even if quite messy ^^)

MaximeThoonsen avatar Oct 11 '24 15:10 MaximeThoonsen

Well, it's an option, even if we should change it a bit so that it can also parse HTML coming from a file. Another thing to consider is whether LLPhant should handle the parsing of complex HTML pages by itself or delegate that to an external specialized library.

f-lombardo avatar Oct 11 '24 16:10 f-lombardo
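
To make the trade-off in the last comment concrete, here is a minimal sketch of the "handle it ourselves" option: one text extractor that accepts either an HTML string or a file path. This is not LLPhant code and does not reflect how WebPageTextGetter is implemented; it is written in Python with only the standard library, purely for illustration, and the names `html_to_text` and `html_file_to_text` are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join text nodes with a space, then collapse all whitespace runs.
        # This is deliberately crude: it can split a word that is broken
        # up by inline tags, which is the "quite messy" part.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html: str) -> str:
    """Hypothetical helper: extract plain text from an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def html_file_to_text(path: str, encoding: str = "utf-8") -> str:
    """Hypothetical helper: same extraction, but reading HTML from a file."""
    return html_to_text(Path(path).read_text(encoding=encoding))
```

Something this small is probably enough for simple pages, but it knows nothing about navigation menus, hidden elements, or malformed markup, which is where delegating to a specialized external library would start to pay off.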