fabric
HashMap in WordCount
Hi Idris/Sathish
How is the wordCount HashMap in the sample code going to behave in a clustered environment?
Br Harvinder
Hi Harvinder, the word count example in the samples is just a toy example. For running in clustered mode, you need to use a partitioned source. For example, publish sentences to a Kafka topic partitioned by sentence and use KafkaSource instead of RandomSentenceSource in the computation. All the components (source and processors) run within a single JVM, so the entire computation is scaled horizontally by spawning additional processes. The KafkaSource is intelligent enough to balance the partitions between multiple instances.
My question was about the target HashMap. Anyway, what you are saying is that this HashMap is going to be a KafkaSink in a distributed environment?
Sure, you can rewrite the WordCounter processor to use a distributed cache like Redis or Hazelcast to maintain counts in cluster mode.
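To make the difference concrete, here is a rough sketch of the idea. It does not use fabric's processor API: `SharedWordCounter` is a made-up class for illustration, and a plain `ConcurrentHashMap` stands in for the distributed cache so the example runs on its own. The assumption is that in a real cluster you would pass in a map backed by the cache instead (Hazelcast's `IMap`, for instance, implements `ConcurrentMap`), and that wiring is not shown here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the sample's WordCounter processor.
// It is NOT fabric's API; it only illustrates local vs. shared count state.
class SharedWordCounter {
    private final ConcurrentMap<String, Long> counts;

    // The map is injected, so the caller decides whether the state is
    // local to this instance or shared between instances.
    SharedWordCounter(ConcurrentMap<String, Long> counts) {
        this.counts = counts;
    }

    // Split a sentence on whitespace and increment the count of each word.
    void process(String sentence) {
        for (String word : sentence.trim().toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1L, Long::sum);
            }
        }
    }

    long count(String word) {
        return counts.getOrDefault(word, 0L);
    }

    public static void main(String[] args) {
        // Case 1: each instance keeps its own map, as with the sample's HashMap.
        // Every instance only sees the sentences from its own partitions.
        SharedWordCounter localA = new SharedWordCounter(new ConcurrentHashMap<>());
        SharedWordCounter localB = new SharedWordCounter(new ConcurrentHashMap<>());
        localA.process("the quick fox");
        localA.process("the lazy dog");
        localB.process("the end");
        System.out.println(localA.count("the")); // prints 2 (partial count)
        System.out.println(localB.count("the")); // prints 1 (partial count)

        // Case 2: both instances write to one shared map (stand-in for a
        // distributed cache), so either instance sees the global total.
        ConcurrentMap<String, Long> shared = new ConcurrentHashMap<>();
        SharedWordCounter sharedA = new SharedWordCounter(shared);
        SharedWordCounter sharedB = new SharedWordCounter(shared);
        sharedA.process("the quick fox");
        sharedA.process("the lazy dog");
        sharedB.process("the end");
        System.out.println(sharedA.count("the")); // prints 3 (global count)
    }
}
```

With per-instance maps, each process ends up with counts only for the partitions it was assigned, so no single instance holds the full total. Pointing every instance at one shared store is what gives you global counts.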