snacktory

Readability clone in Java

23 snacktory issues

Bumps [junit](https://github.com/junit-team/junit4) from 4.11 to 4.13.1. Release notes (sourced from junit's releases): JUnit 4.13.1 and JUnit 4.13 — please refer to the release notes for details...

dependencies

I published a new method in HtmlFetcher called extract, which has a new parameter (content) for passing byte[] content fetched from the URL. I would like to avoid downloading the URL's content....
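A minimal sketch of what such an overload could look like. The class name, the charset fallback, and the placeholder extraction step are all illustrative assumptions, not snacktory's actual API; the point is only that pre-fetched bytes are decoded and parsed without a second download:

```java
// Hypothetical sketch: an extract-style method that accepts pre-fetched
// bytes so the caller can avoid downloading the URL's content twice.
// stripTags is a stand-in for the real article-text extraction step.
public class PrefetchedExtractor {

    // Decode the raw bytes with the server-reported charset
    // (falling back to UTF-8), then hand the HTML to the extractor.
    public static String extract(byte[] content, String charset) {
        String cs = (charset == null || charset.isEmpty()) ? "UTF-8" : charset;
        String html = new String(content, java.nio.charset.Charset.forName(cs));
        return stripTags(html); // placeholder for the real extractor
    }

    // Minimal placeholder "extraction": drop tags, collapse whitespace.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }
}
```

The key design point is that the fetch and the extraction become independent steps, so content cached or downloaded elsewhere can be reused.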

Here is the example: https://www.nytimes.com/2017/10/09/business/general-motors-driverless.html The text is not fully parsed from the beginning; it starts only from: "The efforts have been moving forward in earnest since early last year, when...

Here is the example: https://www.cnbc.com/2017/10/09/amazons-comedies-win-with-critics-while-hulu-is-a-hit-with-audiences.html https://www.cnbc.com/2017/10/10/opec-calls-on-us-shale-oil-producers-to-accept-shared-responsibility.html The text is not fully parsed; only the first part of the article is extracted.

Not able to extract content from some websites such as quora.com and possibly others. They return 403 for the HEAD request made at [this line](https://github.com/karussell/snacktory/blob/master/src/main/java/de/jetwick/snacktory/HtmlFetcher.java#L360) in the HtmlFetcher class.
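Some servers reject HEAD requests outright while serving GET normally, so one plausible workaround is to retry with GET when HEAD is refused. The helper below only encodes that retry decision; how it would be wired into HtmlFetcher is an assumption, not part of the issue:

```java
// Hedged sketch: decide whether a failed HEAD request is worth retrying
// as a GET. The status codes chosen here commonly mean "HEAD not welcome"
// (bot filtering or unsupported method) rather than "resource gone".
public class HeadFallback {

    public static boolean shouldRetryWithGet(int headResponseCode) {
        return headResponseCode == 403   // Forbidden (often bot filtering)
            || headResponseCode == 405   // Method Not Allowed
            || headResponseCode == 501;  // Not Implemented
    }
}
```

A caller would check the HEAD response code and, when this returns true, repeat the request with GET before giving up on the URL.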

Hi @karussell, thanks for building and sharing Snacktory! You said you were [looking for someone](869dc14c28c0c33dac07acfd244530c54ccb7473) to take over maintenance and future development? We’ve been working hard on our own fork,...

```java
protected String detectCharset(String key, ByteArrayOutputStream bos,
        BufferedInputStream in, String enc) throws IOException {
    byte[] arr = new byte[2048];
```

How to reproduce: do a fetchAndExtract of this URL: 'http://www.gazzetta.it/Sport-Invernali/Sci-Alpino/Coppa-Mondo-Sci/26-02-2017/sci-combinata-brignone-ho-sciato-senza-paura-uscire-180995893986.shtml'
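A plausible reading of this report (an assumption, since the issue text is truncated) is that detectCharset only inspects the first 2048 bytes, so a charset declaration appearing later in the page is missed. A sketch of sniffing a charset declaration from a larger HTML prefix:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hedged sketch: scan an HTML prefix for a charset declaration, covering
// both <meta charset="..."> and the older http-equiv Content-Type form.
// This is illustrative, not snacktory's actual detection logic.
public class CharsetSniffer {

    private static final Pattern META = Pattern.compile(
        "charset\\s*=\\s*[\"']?([\\w-]+)", Pattern.CASE_INSENSITIVE);

    // Returns the declared charset name, or null if none is found.
    public static String sniff(String htmlPrefix) {
        Matcher m = META.matcher(htmlPrefix);
        return m.find() ? m.group(1) : null;
    }
}
```

Reading a larger prefix (or the whole document) before sniffing would avoid the fixed 2048-byte window, at the cost of buffering more data.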

Very occasionally I'm getting a stack overflow in 1.3-SNAPSHOT, so clearly it is content-specific. Sadly I haven't been able to capture an offending site yet: java.lang.StackOverflowError at java.util.LinkedHashMap.afterNodeInsertion(LinkedHashMap.java:299) at...

Hello, I am getting an exception when loading URLs whose pages are larger than the fixed `500000 maxBytes` limit specified in the `Converter` class. Please add a way to either modify this...
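One way to address this would be a configurable limit that truncates the stream instead of failing. The sketch below is illustrative only; the class name and constructor are assumptions, not the actual `Converter` API:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hedged sketch: read at most maxBytes from a stream, silently truncating
// anything beyond the limit rather than throwing. A Converter-style class
// could take the limit as a constructor argument instead of a constant.
public class BoundedReader {

    private final int maxBytes;

    public BoundedReader(int maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Copy up to maxBytes from the stream into a byte array.
    public byte[] read(InputStream in) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int total = 0;
        int n;
        while (total < maxBytes
                && (n = in.read(buf, 0, Math.min(buf.length, maxBytes - total))) != -1) {
            bos.write(buf, 0, n);
            total += n;
        }
        return bos.toByteArray();
    }
}
```

Truncation keeps memory bounded while still letting the extractor see the start of the page, which usually contains the article body.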

Did you manage to add the dependency with sbt? I get different exceptions when referring to different versions.