Julien Nioche

Results 69 issues of Julien Nioche

I am trying to index documents using the Vespa HTTP client. One thing I need to do is to create some of these documents only if they don't already exist....

enhancement

[FEATURE REQUEST] Copied from [https://stackoverflow.com/q/56445656/432844](https://stackoverflow.com/q/56445656/432844) EXIFTool can detect when the offset of tags is incorrect [ExifTool] Warning : [minor] Possibly incorrect maker notes offsets (fix by 1060?) Can we detect...

help wanted
format-maker-notes
image-queue

Hi, This is a question, not a bug report. [url-frontier](https://github.com/crawler-commons/url-frontier) is an API to define a [crawl frontier](https://en.wikipedia.org/wiki/Crawl_frontier). It uses gRPC and has a service implementation. It is crawler-neutral and...

enhancement

From a user `Links that were once pages and then turn to redirects are our issue. Our content management system auto creates clean URLs. If the title of the page...

core

```
022-07-15 09:57:16.851 o.a.s.e.e.ReportError Thread-43-fetcher-executor[15, 15] [ERROR] Error
java.lang.RuntimeException: java.lang.RuntimeException: java.util.ConcurrentModificationException
	at org.apache.storm.utils.Utils$1.run(Utils.java:411) ~[storm-client-2.4.0.jar:2.4.0]
	at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.lang.RuntimeException: java.util.ConcurrentModificationException
	at org.apache.storm.executor.Executor.accept(Executor.java:301) ~[storm-client-2.4.0.jar:2.4.0]
	at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) ~[storm-client-2.4.0.jar:2.4.0]
	at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) ~[storm-client-2.4.0.jar:2.4.0]
...
```

bug

Maybe https://github.com/inoio/solrs would be useful?

enhancement
SOLR
help wanted

Just like it's done in ES, we could route the documents in the statusupdaterbolt based on the host / name or IP and in the spouts check that the number...

enhancement
SOLR
help wanted
good first issue

https://www.elastic.co/blog/aggregate-data-faster-with-new-the-random-sampler-aggregation

elasticsearch