Mike Pedersen

10 comments by Mike Pedersen

Same problem for the zookeeper image. I think the issue is that the config templates haven't been updated in ~4 years and these features are newer than that.

For zookeeper the template probably just needs to be updated: https://github.com/confluentinc/cp-docker-images/blob/5.3.3-post/debian/zookeeper/include/etc/confluent/docker/zookeeper.properties.template

But for Kafka, it seems like it is converting anything starting with `KAFKA_`: https://github.com/confluentinc/cp-docker-images/blob/5.3.3-post/debian/kafka/include/etc/confluent/docker/kafka.properties.template

You sure you can't just...
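To make the `KAFKA_` conversion concrete, here is a minimal sketch of that kind of prefix-based mapping. It is not the template's actual code: the naming rule (strip the prefix, lowercase, turn underscores into dots) is an assumption, and `env_to_properties` and `render_properties` are hypothetical helper names.

```python
def env_to_properties(env, prefix="KAFKA_"):
    """Map PREFIX_FOO_BAR=value to foo.bar=value.

    Assumed rule for illustration only: strip the prefix, lowercase the
    rest, and replace underscores with dots.
    """
    props = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue  # variables without the prefix are ignored
        key = name[len(prefix):].lower().replace("_", ".")
        if key:  # skip a bare "KAFKA_" with nothing after it
            props[key] = value
    return props


def render_properties(props):
    """Render the mapping as sorted key=value lines of a properties file."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items()))
```

With a rule like this, any new broker setting can be passed through without touching the template, which is why only a fixed, hand-listed template (as for zookeeper) would need updating.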

> We actually do have an `EntityStreamSizeException` - I wonder why we don't throw that one in `toStrict` instead of a generic `EntityStreamException`.

Yeah, I was wondering about that as...

I've created the linked PR to fix the issue, although adding that error handling also has the side effect of matching other entity failures.

> Could you say more about how the use of a bloom filter avoids the large file problem? ... *That* problem persists.

It avoids the problem of large files making lots...

> It seems to me like it almost requires building up a bunch of chunks of the corpus, where each chunk has a bloom filter (or similar) index.

I don't really...

> That sounds exactly like building a bunch of chunks of the corpus, where each chunk is a single file.

Okay, but I don't see the difference vs. an inverted index....

> But the bloom filter chunking mechanism you're proposing requires visiting every chunk.

Well, yes, unless you have a hierarchical filter. But a hierarchical filter using e.g. the file hierarchy wouldn't...

> ripgrep's index system will most closely resemble Russ Cox's `codesearch`. That is, at its core, it is an inverted index with ngrams as its terms.

You should consider using Bloom...
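For readers following the thread, a rough sketch of the per-chunk Bloom filter idea may help. It is not ripgrep's or `codesearch`'s design: the trigram terms, the filter size, the SHA-256-based hashing, and every name below (`BloomFilter`, `build_chunk_filters`, `candidate_chunks`) are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def trigrams(text):
    """Return the set of 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_chunk_filters(chunks):
    """Build one Bloom filter per chunk (for example, per file) over its trigrams."""
    filters = {}
    for name, text in chunks.items():
        bloom = BloomFilter()
        for gram in trigrams(text):
            bloom.add(gram)
        filters[name] = bloom
    return filters


def candidate_chunks(filters, literal):
    """Return chunks that may contain literal; only these need a real search."""
    grams = trigrams(literal)
    return [name for name, bloom in filters.items()
            if all(bloom.might_contain(gram) for gram in grams)]
```

Note that `candidate_chunks` still visits every chunk's filter, which is the cost raised in the quoted objection; what the filter saves is reading the chunks that cannot match. A literal shorter than three characters has no trigrams, so every chunk is returned as a candidate.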

Perhaps `add_sorted_edges` could be an alternative. Since both lists are sorted, it would be possible to merge in O(n+m). If this seems like a good idea, I could probably take...
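A minimal sketch of the O(n+m) merge, assuming edges are comparable tuples and each input list is already sorted without internal duplicates. `merge_sorted_edges` is a hypothetical name, and dropping an edge that appears in both lists is an assumption about the desired behaviour, not something the library specifies.

```python
def merge_sorted_edges(existing, new):
    """Merge two sorted edge lists in O(n + m) with a two-pointer walk.

    Assumes both lists are sorted and free of internal duplicates; an edge
    present in both lists is emitted once.
    """
    merged = []
    i = j = 0
    while i < len(existing) and j < len(new):
        if existing[i] < new[j]:
            merged.append(existing[i])
            i += 1
        elif new[j] < existing[i]:
            merged.append(new[j])
            j += 1
        else:
            # Same edge in both lists: keep one copy and advance both.
            merged.append(existing[i])
            i += 1
            j += 1
    # At most one of these tails is non-empty.
    merged.extend(existing[i:])
    merged.extend(new[j:])
    return merged
```

Each element of either list is looked at once, so the cost is linear in the combined length, compared with inserting the new edges one at a time into the existing sorted list.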