graph-node Removed alphanumeric enforcement for node IDs

This removes the node ID check requiring that a node identifier be solely composed of A through Z, 0 through 0, or an underscore (_) character. My reasoning for removing this restriction is that it enables easier node provisioning by enabling the use of generated Kubernetes host names to name index and query nodes. This also removes confusion when a new indexer deploys the indexer stack.

Jun 23 '22 22:06 kaiwetlesen

CC: @chriswessels

Jun 23 '22 22:06 kaiwetlesen

My reasoning for removing this restriction is that it enables easier node provisioning by enabling the use of generated Kubernetes host names to name index and query nodes. This also removes confusion when a new indexer deploys the indexer stack.

Can you explain in more detail the problems this causes? Can that be avoided with better error messages? Since the change also renames nodes (from graph-node's perspective) we'll need some strategy how to avoid breaking people's existing setups. For example, this change as-is would make everything in the hosted service not index anything.

Jun 24 '22 00:06 lutter

Sure thing. Typically a new indexer will assign the hostname to the node ID parameter for simplicity and clarity within the cluster. Existing indexers (myself included), will use the same hostname for identifying additional index nodes. The symbol transform from - to _ gets done transparently in the startup script (removed as seen in the diff) and without notice to the user. Furthermore, this transformation and character requirement is not documented anywhere, except for this removed code snippet. This causes confusion and significantly hinders usability.

To address your second point in conceiving a strategy to preserve legacy behaviour, it is certainly possible to add a flag to either enable opt in to the new behaviour disabling the silent transform, or opt in to enabling the legacy behaviour. What are your thoughts @lutter?

Jun 24 '22 01:06 kaiwetlesen

I'd like to add further reasoning why this case enforcement should be removed. When bootstrapping the node with the start script in the docker folder, it checks $node_id then sets $DISABLE_BLOCK_INGESTOR according to a comparison between $node_id and what value is set for $BLOCK_INGESTOR. If node ID is set to the host name provided by K8S, then the block ingestor is silently disabled. If one is to assume that the $BLOCK_INGESTOR follows the enforced naming convention, then this silent disablement will always occur. This degrades the user experience significantly.

Jun 27 '22 19:06 kaiwetlesen

Thank you for submitting @kaiwetlesen! Would very much like to see this merged. The existing silent behaviour doesn't play well with Kubernetes, and was a significant gotcha when I first started on the stack, and even once you know about it, the mismatch between names in infrastructure land (e.g. Kubernetes) and graph-node land is an unnecessary inconvenience. No doubt this is true for new Indexers too.

Jun 28 '22 08:06 chriswessels

Update: to address any potential backwards compatibility issues until user data can be updated, I've added an environment variable to opt into the new behavior.

Jun 28 '22 16:06 kaiwetlesen

@lutter any way we can work this into the next release?

Jul 30 '22 22:07 kaiwetlesen

Hey @kaiwetlesen - thanks for adding the environment variable as a config. Are there any other edge cases or considerations that we should be aware of here?

Aug 03 '22 12:08 azf20

Not really that I can think of @azf20. As a matter of fact there should be no issues once an indexer reassigns their deployments. Issues should just go away.

Aug 19 '22 01:08 kaiwetlesen

(rebased this branch onto current master)

Aug 19 '22 23:08 kaiwetlesen

graph-node graph-node copied to clipboard

Removed alphanumeric enforcement for node IDs

graph-node
graph-node copied to clipboard