HTTP proxy to access nodes within the cluster
Is your feature request related to a problem? Please describe.
When running a Vespa cluster on an internal network, it is difficult to access node-specific HTTP APIs.
For example, consider a configuration where a Vespa cluster runs on an internal network, with load balancers set up to access the config nodes (for deploying applications) and the container nodes (for feeding and querying).
To access an API on a specific node, like the Custom Component State API, one would need to expose the node/port for each content node outside of the internal network, or create a proxy of some sort. Note that a load balancer wouldn't work well here because the data on each content node needs to be explored independently.
Describe the solution you'd like
It would be useful to have an HTTP proxy running as part of the Vespa cluster to easily access the various APIs from a single point. For example, this could run on the config nodes and could use the host aliases to identify the hosts to proxy.
With this host in hosts.xml:
<host name="vespa-content-0.vespa.cluster.local">
<alias>content-0</alias>
</host>
One could proxy a request through the config node to the custom component state API as follows:
curl https://vespa-config-lb:19071/proxy/v1/content-0:19107/state/v1/custom/component/
- /proxy/v1 here is a new config API which does the proxying
- content-0:19107/state/v1/custom/component/ is the <host>:<port>/<endpoint> where the request should be sent
This is just an example of how it could work.
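To make the path layout concrete, here is a rough sketch of how such a proxy might map a request path to the node it targets. It is not an existing Vespa API: the `/proxy/v1/` prefix comes from the proposal above, while `load_aliases`, `resolve_target`, the surrounding `<hosts>` root element in the sample, and the use of plain `http` towards the node are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample hosts.xml content, using the host entry from this issue.
HOSTS_XML = """
<hosts>
  <host name="vespa-content-0.vespa.cluster.local">
    <alias>content-0</alias>
  </host>
</hosts>
"""

PROXY_PREFIX = "/proxy/v1/"  # proposed prefix, not an existing endpoint


def load_aliases(hosts_xml: str) -> dict[str, str]:
    """Map every <alias> found in hosts.xml to the name of its host."""
    root = ET.fromstring(hosts_xml)
    aliases = {}
    for host in root.iter("host"):
        for alias in host.iter("alias"):
            aliases[alias.text.strip()] = host.get("name")
    return aliases


def resolve_target(path: str, aliases: dict[str, str]) -> str:
    """Translate /proxy/v1/<host>:<port>/<endpoint> into the node URL."""
    if not path.startswith(PROXY_PREFIX):
        raise ValueError(f"not a proxy path: {path}")
    authority, _, endpoint = path[len(PROXY_PREFIX):].partition("/")
    alias, _, port = authority.partition(":")
    if alias not in aliases:
        raise ValueError(f"unknown host alias: {alias}")
    if not port.isdigit():
        raise ValueError(f"missing or invalid port: {authority}")
    # Assumption: the node is reached over plain http inside the network.
    return f"http://{aliases[alias]}:{port}/{endpoint}"


if __name__ == "__main__":
    aliases = load_aliases(HOSTS_XML)
    print(resolve_target("/proxy/v1/content-0:19107/state/v1/custom/component/", aliases))
    # prints http://vespa-content-0.vespa.cluster.local:19107/state/v1/custom/component/
```

Restricting lookups to aliases declared in hosts.xml means the proxy could only reach nodes that are part of the cluster, not arbitrary hosts on the internal network.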
Describe alternatives you've considered
The user could create a proxy manually or expose the ports outside of the internal network.
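For reference, a manually created proxy could look roughly like the sketch below, which uses only the Python standard library and handles GET requests. The `ALIASES` table, the `NodeProxyHandler` and `serve` names, the listening port, and the reuse of the `/proxy/v1/` path layout are all assumptions for illustration; a real deployment would also need TLS and authentication.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical alias table; in practice it would mirror hosts.xml.
ALIASES = {"content-0": "vespa-content-0.vespa.cluster.local"}
PREFIX = "/proxy/v1/"  # same path layout as in the proposal above


class NodeProxyHandler(BaseHTTPRequestHandler):
    """Forward GET /proxy/v1/<alias>:<port>/<endpoint> to the named node."""

    def do_GET(self):
        if not self.path.startswith(PREFIX):
            self.send_error(404, "unknown path")
            return
        authority, _, endpoint = self.path[len(PREFIX):].partition("/")
        alias, _, port = authority.partition(":")
        # Only forward to known aliases so arbitrary hosts cannot be reached.
        if alias not in ALIASES or not port.isdigit():
            self.send_error(400, "unknown alias or bad port")
            return
        target = f"http://{ALIASES[alias]}:{port}/{endpoint}"
        try:
            with urllib.request.urlopen(target, timeout=10) as upstream:
                status = upstream.status
                body = upstream.read()
                ctype = upstream.headers.get("Content-Type", "application/octet-stream")
        except urllib.error.HTTPError as err:
            # Pass error responses from the node through unchanged.
            status, body = err.code, err.read()
            ctype = err.headers.get("Content-Type", "text/plain")
        except OSError:
            # Covers connection failures and timeouts.
            self.send_error(502, "upstream unreachable")
            return
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; real code would log requests


def serve(port: int = 8080) -> None:
    """Run the proxy until interrupted (the port is an arbitrary choice)."""
    ThreadingHTTPServer(("", port), NodeProxyHandler).serve_forever()
```

Every cluster operator would have to deploy, secure, and keep such a service in sync with hosts.xml themselves, which is the overhead a built-in proxy would remove.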
Additional context
This would be incredibly useful for Vispana, which is a web UI to view the status of a Vespa cluster. With this proxy in place, it would be easy to add support for Vespa clusters deployed on an internal network. The problem now is that when discovering hosts on an internal network from the Vespa configuration, only the internal hostnames are returned. If Vispana is running outside of the network, these hostnames cannot be resolved and there is no easy way to query individual nodes. If the proxy is implemented, Vispana can use it to query, for example, the custom component state API and expose a simple exploratory view of the data.