sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Results: 56 sparkler issues

Bumps [tmpl](https://github.com/daaku/nodejs-tmpl) from 1.0.4 to 1.0.5.

dependencies
javascript

Bumps [jetty-server](https://github.com/eclipse/jetty.project) from 9.4.0.v20161208 to 9.4.41.v20210516. Release notes Sourced from jetty-server's releases. 9.4.41.v20210516 Changelog This release resolves CVE-2021-28169 #6099 Cipher preference may break SNI if certificates have different key types...

dependencies
java

Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.4.4 to 1.5.10. Commits 8cd4c6c 1.5.10 ce7a01f [fix] Improve handling of empty port 0071490 [doc] Update JSDoc comment a7044e3 [minor] Use more descriptive variable name d547792 [security]...

dependencies
javascript

Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.14.7 to 1.14.9. Commits 13136e9 Release version 1.14.9 of the npm package. 2ec9b0b Keep headers when upgrading from HTTP to HTTPS. 5fc74dd Reduce nesting. 3d81dc3 Release version...

dependencies
javascript

#### Issue Description I am trying to build and run the sparkler from the source. I am following the example given in the readme. I have injected a url and...

#### Issue Description
Build fails
```
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for sparkler-parent 0.2.2-SNAPSHOT:
[INFO]
[INFO] sparkler-parent .................................... SUCCESS [ 0.003 s]
[INFO] sparkler-tests-base ................................ SUCCESS [ 1.374 s]
[INFO]...
```

I dunno if there is anything obvious that springs to mind here @thammegowda or @karanjeets from back in the day. When I run Sparkler as a spark submit job on...

**Task Description** Most of the Elasticsearch implementation has already been written. There are still two major problems that need to be resolved: 1. [ElasticsearchResultIterator](https://github.com/felixloesing/sparkler/blob/8aad32886b223bd89ae9a3a27aa883bfdb730a2b/sparkler-core/sparkler-app/src/main/scala/edu/usc/irds/sparkler/storage/elasticsearch/ElasticsearchResultIterator.scala#L103) needs to implement deserialize(). We are...

bug

#### Issue Description We are implementing unit tests to test general functionalities of Sparkler and later our connector to Elasticsearch. We will populate details as we write the tests.

#### Issue Description The crawler crashes unexpectedly after a while, claiming that resource limits have been reached. ####...