[Feature] More context for "node not found" error

Open ChibangLW opened this issue 9 months ago • 4 comments

Use case

While digging through the logs I noticed a lot of ERR user msg: node not found code=404 log entries. With some different log settings (level: trace, database.debug: true, database.gorm.parameterized_queries: false, database.gorm.skip_err_record_not_found: false) I narrowed it down to a node that has a valid nodekey but is not in the database. Even so, the logs give no further information about which node it could be.

Description

It would be beneficial to have some more context in the error message.

Contribution

  • [ ] I can write the design doc for this feature
  • [x] I can contribute this feature

How can it be implemented?

If I'm not mistaken, the error originates in the noise handler. Depending on the log level, some more information could be put into the error message or into a separate trace entry, e.g. by dumping the HostInfo object, which should also be included in the MapRequest.

Is this something you would accept?

ChibangLW avatar Feb 10 '25 20:02 ChibangLW

I get the same error ERR user msg: node not found code=404 in the log. Headscale is installed in a Docker container. The error came up after upgrading from version 0.23 directly to version 0.25. How can I fix the cause of the logged error?

ffuhrnew avatar Feb 13 '25 20:02 ffuhrnew

> I get the same error ERR user msg: node not found code=404 in the log. Headscale is installed in a Docker container. The error came up after upgrading from version 0.23 directly to version 0.25. How can I fix the cause of the logged error?

It looks like a node is talking with headscale but is no longer part of the database.

This message can easily be triggered by running headscale nodes delete -i ID for a node that is currently connected.

nblock avatar Feb 14 '25 06:02 nblock

The code that produces the error message seems to be located here: https://github.com/juanfont/headscale/blob/b3fa16fbdaf47fe3854b4b306dc6fe7a3d7fbd10/hscontrol/noise.go#L208 From here on I get no further ...

ns.nodeKey = mapRequest.NodeKey

node, err := ns.headscale.db.GetNodeByNodeKey(mapRequest.NodeKey)
if err != nil {
	if errors.Is(err, gorm.ErrRecordNotFound) {
		httpError(writer, NewHTTPError(http.StatusNotFound, "node not found", nil))
		return
	}
	httpError(writer, err)
	return
}
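
For illustration only, here is a minimal sketch of what "more context" could look like in a handler like the one above. It is not headscale's actual code: mapRequest, lookupNode, notFoundContext, handle and newLogger are hypothetical stand-ins, errRecordNotFound stands in for gorm.ErrRecordNotFound, and the Hostname and OS fields are assumed to be available from the request's Hostinfo. It uses only the Go standard library (log/slog, Go 1.21+), whereas the real handler may use a different logger. The idea is to keep the HTTP response generic and put the identifying details into a lower-level log entry.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"os"
)

// errRecordNotFound is a stand-in for gorm.ErrRecordNotFound (assumption for this sketch).
var errRecordNotFound = errors.New("record not found")

// mapRequest is a hypothetical, trimmed-down stand-in for the real MapRequest.
type mapRequest struct {
	NodeKey  string
	Hostname string // assumed to be taken from Hostinfo
	OS       string // assumed to be taken from Hostinfo
}

// lookupNode is a hypothetical stand-in for the database lookup by node key.
func lookupNode(known map[string]string, nodeKey string) (string, error) {
	name, ok := known[nodeKey]
	if !ok {
		return "", errRecordNotFound
	}
	return name, nil
}

// notFoundContext builds a log message that identifies the unknown node.
func notFoundContext(req mapRequest) string {
	return fmt.Sprintf("node not found: node_key=%s hostname=%q os=%q",
		req.NodeKey, req.Hostname, req.OS)
}

// newLogger returns a text logger on stderr with debug output enabled.
func newLogger() *slog.Logger {
	return slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelDebug}))
}

// handle returns the HTTP status it would send. For an unknown node it logs
// the extra context at debug level and still answers with a plain 404.
func handle(logger *slog.Logger, known map[string]string, req mapRequest) int {
	if _, err := lookupNode(known, req.NodeKey); err != nil {
		if errors.Is(err, errRecordNotFound) {
			logger.Debug(notFoundContext(req))
			return 404
		}
		logger.Error("node lookup failed", "err", err)
		return 500
	}
	return 200
}

func main() {
	known := map[string]string{"nodekey:known": "registered-node"}
	req := mapRequest{NodeKey: "nodekey:abc", Hostname: "laptop", OS: "linux"}
	fmt.Println(handle(newLogger(), known, req))
}
```

Logging the context at debug level rather than as an error would also fit the view that this is closer to a failed login attempt than to a server fault.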

ffuhrnew avatar Feb 14 '25 06:02 ffuhrnew

I do not think you should read this as an error. It essentially just means that a node that isn't registered is trying to connect. It is equivalent to a user that is not in the database trying to log in (at least in Europe, where that is a 404).

Should we log it? Hard to say. It might be confusing as an error, but it is equivalent to failed login attempts, which might be interesting.

kradalby avatar Feb 21 '25 13:02 kradalby

This issue is stale because it has been open for 90 days with no activity.

github-actions[bot] avatar May 23 '25 02:05 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar May 31 '25 02:05 github-actions[bot]

I agree this should be more detailed.

paralin avatar Oct 14 '25 06:10 paralin

I noticed this issue while doing large-scale device tests (starting 600+ tailscale clients in one go) to determine whether tailscale is suitable for our use case.

It does make it difficult to debug what the actual problem is.

As an aside, I feel like Github Actions closing the issue as "Not planned" because it went stale is not ideal?

rittycat avatar Nov 20 '25 08:11 rittycat