neofs-node

404 errors are critical for big TTL values

Open carpawell opened this issue 11 months ago • 5 comments

The node does not cope well with https://github.com/nspcc-dev/neofs-sdk-go/pull/562: it starts forwarding requests and receives them back.

Expected Behavior

TTL describes the maximum number of times a request may be forwarded.

Current Behavior

TTL describes how many times you may try to search for an object. If the object does not exist at all, the request is forwarded 8 times. Even if you are the initiator of the forwarding, you receive the request back, because you are part of the container and the forwarder wants you to try.
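To make the amplification concrete, here is a toy model, not the actual NeoFS forwarding logic. It assumes that every node that fails to find the object forwards the request to all other container nodes with TTL-1 and stops once TTL reaches 1. The function name `countRequests` and the three-node container are illustrative assumptions.

```go
package main

import "fmt"

// countRequests returns how many requests are handled in total across the
// container for one lookup of a missing object.
// Toy assumption: a node that finds nothing forwards to every other
// container node with TTL-1, and a request with TTL 1 is never forwarded.
func countRequests(ttl, nodes int) int {
	if ttl <= 1 || nodes <= 1 {
		return 1 // handled locally only, no forwarding
	}
	return 1 + (nodes-1)*countRequests(ttl-1, nodes)
}

func main() {
	// Compare a small TTL with a big one for a 3-node container.
	for _, ttl := range []int{2, 8} {
		fmt.Printf("ttl=%d nodes=3 requests=%d\n", ttl, countRequests(ttl, 3))
	}
}
```

Under these assumptions the number of requests grows geometrically with TTL, which would be consistent with the request spam described in this issue. The real fan-out depends on how the node picks forwarding targets.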

Possible Solution

Not sure. Maybe turn forwarding off in the object services? Spawn only requests with TTL=2 manually? Track the forwarding chain and do not continue if you notice a cycle?
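The last option, tracking the forwarding chain, could look roughly like the sketch below. It is only an illustration: `Request`, `Path`, `nextHops`, and `forward` are hypothetical names, not NeoFS or SDK types, and the sketch assumes the request can carry the IDs of the nodes that have already handled it.

```go
package main

import "fmt"

// Request is a hypothetical stand-in, not the real NeoFS request type.
type Request struct {
	TTL  uint32
	Path []string // IDs of nodes that have already handled this request
}

// visited reports whether node already appears in the forwarding chain.
func visited(r Request, node string) bool {
	for _, id := range r.Path {
		if id == node {
			return true
		}
	}
	return false
}

// nextHops lists the container nodes that self may still forward to:
// none when TTL is exhausted, otherwise every node except self and
// those already in the chain.
func nextHops(r Request, self string, container []string) []string {
	if r.TTL <= 1 {
		return nil
	}
	var hops []string
	for _, n := range container {
		if n != self && !visited(r, n) {
			hops = append(hops, n)
		}
	}
	return hops
}

// forward builds the request that self sends on: TTL decremented and
// self appended to a copy of the chain. Callers must check nextHops first.
func forward(r Request, self string) Request {
	path := append(append([]string(nil), r.Path...), self)
	return Request{TTL: r.TTL - 1, Path: path}
}

func main() {
	container := []string{"A", "B", "C"}
	req := Request{TTL: 8} // A is the initiator

	fmt.Println(nextHops(req, "A", container)) // A may try B and C
	atB := forward(req, "A")
	fmt.Println(nextHops(atB, "B", container)) // B skips A
	atC := forward(atB, "B")
	fmt.Println(nextHops(atC, "C", container)) // nobody left to ask
}
```

With this check the initiator never receives its own request back, whatever the TTL. The chain itself is unauthenticated in this sketch, so a node could forge or truncate it.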

Steps to Reproduce (for bugs)

Update to the provided SDK version and try to get a non-existent object, or delete it. The logs are much bigger than you would expect.

Context

Regression

https://github.com/nspcc-dev/neofs-sdk-go/pull/562

carpawell, Mar 11 '24 21:03

It's easy to revert, although this just means that our TTLs don't work the way they were intended to.

roman-khimov, Mar 12 '24 06:03

Track forwardings chain and do not continue if you notice a cycle?

This defeats the purpose, although with additional signatures this can in fact substitute TTL.

roman-khimov, Mar 12 '24 07:03

That is S0 to me. The commit is already merged, and after updating, deleting big objects sometimes fails with a timeout error. The request spam prevents normal operation on my laptop even after a single multiplied request. It is also a discussion to me, because I am not sure how it should be solved.

carpawell, Mar 12 '24 07:03

Let's do https://github.com/nspcc-dev/neofs-sdk-go/pull/567 and then think about the associated problems; they can't be fixed quickly.

roman-khimov, Mar 12 '24 12:03

OK, can be done this way. Not S0 then (but still a discussion?). A few pointers for whoever solves the issue: TTL is decoded here and then sent to the next node here (and this repeats once more on the other node). Integration tests hit a lot of timeouts, and there are a lot of debug logs about the same object being searched for on every container node.

carpawell, Mar 12 '24 13:03