
How can I pass an HTML string to LinkThumbnailer?

Open vedmant opened this issue 6 years ago • 8 comments

How can I pass an HTML string to LinkThumbnailer so that it parses the string instead of opening a URL? I couldn't find anything about this in the documentation or the issues.

vedmant avatar Jun 28 '19 17:06 vedmant

There is no public method available for this use case. However, you can try calling the following code directly:

https://github.com/gottfrois/link_thumbnailer/blob/master/lib/link_thumbnailer/scraper.rb#L25

source = "<html>...</html>"
scraper = LinkThumbnailer::Scraper.new(source, "http://fake.come")
scraper.call

Note that you will need to pass a URL anyway because LinkThumbnailer expects one, but it's only to prefill the URL attribute of the response object.

gottfrois avatar Jul 01 '19 07:07 gottfrois

@gottfrois Thanks, I'll try this. It would be nice to have a public method for this. In my case I also need to scrape the HTML for other data, and so far the only option I've found is to make two requests: one from LinkThumbnailer and another to fetch the HTML content for my own needs.

vedmant avatar Jul 01 '19 08:07 vedmant

@gottfrois I tried your suggestion but got an error on the scraper.call line: undefined method 'host' for #<String:0x00007f9fcd043e80>

vedmant avatar Jul 02 '19 17:07 vedmant

Ah, the URL should be a URI object, I think, not a string: https://docs.ruby-lang.org/en/2.1.0/URI.html

gottfrois avatar Jul 03 '19 09:07 gottfrois
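
A minimal sketch of the difference, using only Ruby's standard library. The URL below is a placeholder chosen for illustration; the commented-out lines repeat the scraper call from earlier in the thread and assume the gem is loaded and `source` holds the HTML string.

```ruby
require "uri"

# Placeholder URL (an assumption for illustration); per the thread it is
# only used to prefill the URL attribute of the response object.
url_string = "http://example.com/some/page"

# A plain String has no #host method, which is what raises the error above.
string_has_host = url_string.respond_to?(:host)

# URI.parse returns a URI::HTTP object, which does respond to #host.
url = URI.parse(url_string)
host = url.host

# With the gem loaded, the earlier snippet would then become:
#   scraper = LinkThumbnailer::Scraper.new(source, url)
#   scraper.call
```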

@gottfrois Thanks, that worked. However, I have another issue now: I moved the code into a service class and it stopped working. I get the following error: undefined method 'config' for nil:NilClass from /usr/local/bundle/gems/link_thumbnailer-3.3.2/lib/link_thumbnailer/scraper.rb:28:in 'initialize'. There is a line in the gem, @config = ::LinkThumbnailer.page.config, and it looks like LinkThumbnailer.page is nil.

vedmant avatar Jul 22 '19 14:07 vedmant

Yeah this code is not the best... :( https://github.com/gottfrois/link_thumbnailer/blob/master/lib/link_thumbnailer/scraper.rb#L28

I don't remember why I proxied config through the Page class, but it should use LinkThumbnailer.config directly IMO.

I don't see a quick fix for this other than monkey patching, or changing the code of the gem to make it easier to scrape HTML directly.

gottfrois avatar Jul 23 '19 08:07 gottfrois
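
To illustrate the failure mode described above, here is a rough sketch using a stand-in module. `FakeThumbnailer`, `ProxiedScraper`, and `DirectScraper` are hypothetical names invented for this example and are not the gem's code; they only mimic the shape of "config read through page" versus "config read directly".

```ruby
# Stand-in module, NOT the gem's real code: it only mimics the shape
# discussed in the thread.
module FakeThumbnailer
  Config = Struct.new(:attributes)

  class << self
    # Stays nil until something assigns a page.
    attr_accessor :page

    # Module-level config, available without any page.
    def config
      @config ||= Config.new([:title, :description])
    end
  end

  # Mirrors the quoted line: config is reached through page.
  class ProxiedScraper
    attr_reader :config

    def initialize
      @config = FakeThumbnailer.page.config
    end
  end

  # The alternative suggested above: read the module-level config directly.
  class DirectScraper
    attr_reader :config

    def initialize
      @config = FakeThumbnailer.config
    end
  end
end

# With no page assigned, the proxied version fails on nil.
proxied_error =
  begin
    FakeThumbnailer::ProxiedScraper.new
    nil
  rescue NoMethodError => e
    e
  end

# The direct version does not depend on a page having been set.
direct = FakeThumbnailer::DirectScraper.new
```

In this sketch the proxied constructor only works once something has assigned a page, while the direct one has no such ordering requirement.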

Are there plans for a new release with direct HTML parsing support? I'd use that.

vedmant avatar Mar 18 '21 10:03 vedmant

No plans really, but I'd gladly accept a PR.

gottfrois avatar Jun 14 '21 07:06 gottfrois