heritrix3
How to scale Heritrix with Kubernetes?
I would like to scale Heritrix with Kubernetes, but it appears to be more complicated than I expected.
I am scheduling crawls via the Heritrix REST API, but actions like
- build
- launch
- unpause
seem to directly alter the state of a single pod, which makes it impossible to scale to multiple instances running different crawl tasks.
For example, if pods p1, p2, and p3 are running and I send build, launch, and unpause REST API requests to Heritrix, they are not guaranteed to reach the same pod, because there are multiple instances running.
What is the correct way to scale Heritrix with Kubernetes?
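To make the routing problem concrete, here is a minimal sketch of one possible workaround, not a confirmed answer: address each pod directly instead of going through a load-balanced Service, so that all three actions for a job land on the same instance. It assumes the pods are run as a StatefulSet behind a headless Service (which gives each pod a stable DNS name) and that job actions are POSTed as an `action` form field to `/engine/job/<job>` on port 8443. The names `heritrix`, `heritrix-headless`, and `my-crawl` are hypothetical, and the code only builds the requests; it does not send them.

```python
from urllib.parse import urlencode

# The three job actions from the question, in the order they must be sent.
ACTIONS = ("build", "launch", "unpause")


def pod_base_url(statefulset, ordinal, service, namespace="default", port=8443):
    """Build the base URL of one specific pod.

    Assumes a StatefulSet behind a headless Service, where each pod is
    reachable at <statefulset>-<ordinal>.<service>.<namespace>.svc.cluster.local.
    All names passed in are hypothetical placeholders.
    """
    host = f"{statefulset}-{ordinal}.{service}.{namespace}.svc.cluster.local"
    return f"https://{host}:{port}"


def job_action_requests(base_url, job_name, actions=ACTIONS):
    """Return (url, form_body) pairs for the given job actions.

    Assumption: each action is sent as a POST with an 'action' form field
    to /engine/job/<job_name>. Every pair shares the same pod-specific URL,
    so no request can be routed to a different instance.
    """
    url = f"{base_url}/engine/job/{job_name}"
    return [(url, urlencode({"action": action})) for action in actions]


if __name__ == "__main__":
    # Pin the hypothetical job "my-crawl" to pod ordinal 1.
    base = pod_base_url("heritrix", 1, "heritrix-headless")
    for url, body in job_action_requests(base, "my-crawl"):
        print("POST", url, body)
```

With this approach the caller has to decide which pod owns which job (for example by hashing the job name to an ordinal) and remember that mapping, since each Heritrix instance only knows about its own jobs.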