
localchans: recreate missing edge if not found

Open JssDWt opened this issue 1 year ago • 21 comments

Change Description

If a node has a channel but no corresponding edge in the graph database, updating the channel policy fails. This commit recreates the edge if the channel exists, so a node can recover from a missing edge in the graph database by calling updatechanpolicy.

Alternative to https://github.com/lightningnetwork/lnd/pull/8768, namely option 2 in https://github.com/lightningnetwork/lnd/pull/8768#issuecomment-2143799767. Partially fixes https://github.com/lightningnetwork/lnd/issues/7261 by allowing the edge to be recreated by calling updatechanpolicy.

Steps to Test

  • Create a node that has a channel with a missing edge
  • Calling getchaninfo on this channel will fail
  • Call updatechanpolicy on this channel
  • Calling getchaninfo on this channel should succeed

I'm not sure how to create an integration test where I can modify the graph database to delete an edge in order to test this. Please advise.

Pull Request Checklist

Testing

  • [x] Your PR passes all CI checks.
  • [x] Tests covering the positive and negative (error paths) are included.
  • [x] Bug fixes contain tests triggering the bug to prevent regressions.

Code Style and Documentation

📝 Please see our Contribution Guidelines for further guidance.

JssDWt avatar Jun 03 '24 10:06 JssDWt

[!IMPORTANT]

Review skipped

Auto reviews are limited to specific labels.

:label: Labels to auto review (1)
  • llm-review

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


coderabbitai[bot] avatar Jun 03 '24 10:06 coderabbitai[bot]

Release notes need to be moved to the v0.18.1 file.

guggero avatar Jun 20 '24 09:06 guggero

Maybe it's related to this?

yyforyongyu avatar Jun 27 '24 15:06 yyforyongyu

Maybe it's related to https://github.com/lightningnetwork/lnd/issues/8870#issuecomment-2192464477?

Hmm, I don't think so. Not having the edge available means that somehow our own ChanAnnouncement didn't go through? Not receiving the node announcement would just leave us with a NodeAnnouncement shell, which we add when adding the edge to the db.

ziggie1984 avatar Jul 01 '24 10:07 ziggie1984

Concept Ack from my side as well. I think it's the best way to fix broken channels suffering from the Edge not Found problem. We still have not traced down the origin of the problem, but this at least gives us a way to mitigate it.

ziggie1984 avatar Jul 01 '24 10:07 ziggie1984

Sorry for letting this sit for so long. Updated with your feedback now @yyforyongyu

JssDWt avatar Jul 02 '24 10:07 JssDWt

What about passing the graph db as a config to the localchan_manager, so that when recreating the edge we can just call AddChannelEdge and short-circuit all the other work done in the gossiper? Later, during the ChanUpdate (PropagateChanPolicyUpdate), all the necessary work is done (adding the update to the topology, etc.).

It's a mitigation for an unknown bug anyway, which we will find once we have improved logging in the gossiper.

ziggie1984 avatar Jul 02 '24 15:07 ziggie1984

Handled most comments. Still have to create a test.

JssDWt avatar Jul 02 '24 19:07 JssDWt

Added a test for createEdge

JssDWt avatar Jul 04 '24 12:07 JssDWt

Now also added a test to the UpdatePolicy function

JssDWt avatar Jul 04 '24 12:07 JssDWt

Handled Ziggie's comments.

If this gets a thumbs up, we'll try it out in production to see whether it helps with our edge not found issues.

JssDWt avatar Jul 04 '24 16:07 JssDWt

Awesome job @JssDWt 🎉, thanks for fixing this in LND. Just a little style nit.

I will approve this PR once you give us feedback on how your local testing went, but it's gtg from my side.

ziggie1984 avatar Jul 04 '24 21:07 ziggie1984

@JssDWt have you tested this already? We are planning to ship it in 0.18.3.

ziggie1984 avatar Jul 22 '24 14:07 ziggie1984

@ziggie1984 you can't ship it with 0.18.3. There was a crash. I'm afk for the next week, I'll look at it next week.

JssDWt avatar Jul 22 '24 14:07 JssDWt

@ziggie1984 you can't ship it with 0.18.3. There was a crash. I'm afk for the next week, I'll look at it next week.

Moved to 0.19

saubyk avatar Jul 22 '24 15:07 saubyk

Added a fix for a crash, because SchedulerOptions were nil.

JssDWt avatar Aug 02 '24 09:08 JssDWt

Rebased

JssDWt avatar Aug 02 '24 10:08 JssDWt

Just for general information this was the problem:

AddEdge: func(edge *models.ChannelEdgeInfo) error {
	return s.chanRouter.AddEdge(edge, nil)
},

By accident, we passed nil for the SchedulerOptions.

That is fixed now. Note that we don't test db interaction in the unit tests.

AddEdge: func(edge *models.ChannelEdgeInfo) error {
	return s.graphBuilder.AddEdge(edge)
},

However, because this happened, I think we should add a nil check here:

for _, f := range op {
	if f != nil {
		f(r)
	}
}

better safe than sorry.
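To illustrate why the guard matters: passing a bare nil to a variadic parameter of function type yields one nil element, and calling it panics. The sketch below uses hypothetical names (`scheduler`, `option`, `withLazy`, `applyOptions`) and is not the lnd code itself.

```go
package main

import "fmt"

// scheduler is a hypothetical stand-in for the object being configured.
type scheduler struct {
	lazy bool
}

// option is a hypothetical functional option, in the spirit of the
// SchedulerOptions mentioned above.
type option func(*scheduler)

// withLazy is a hypothetical option that flips a single flag.
func withLazy() option {
	return func(s *scheduler) {
		s.lazy = true
	}
}

// applyOptions skips nil entries, so a caller that passes nil for the
// options cannot trigger a nil function call.
func applyOptions(s *scheduler, opts ...option) {
	for _, f := range opts {
		if f != nil {
			f(s)
		}
	}
}

func main() {
	s := &scheduler{}

	// The bare nil becomes a single nil element of opts; without the
	// guard in applyOptions this call would panic.
	applyOptions(s, nil, withLazy())

	fmt.Println("lazy:", s.lazy)
}
```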

ziggie1984 avatar Aug 02 '24 10:08 ziggie1984

Noting that during the rebase I had to replace chanRouter with graphBuilder here.

JssDWt avatar Aug 02 '24 10:08 JssDWt

However, because this happened, I think we should add a nil check here:

for _, f := range op {
	if f != nil {
		f(r)
	}
}

better safe than sorry.

Added this in this PR

JssDWt avatar Aug 02 '24 12:08 JssDWt

Fixed a bug where the timestamp was not set, so the policy wouldn't be added.

And rebased.

JssDWt avatar Oct 29 '24 22:10 JssDWt

Added a commit that ensures the policy is also recreated if the edge exists but the policy does not.
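The resulting cases can be summarised in a small decision function. The names below (`repairAction`, `chooseRepair`) are hypothetical and only sketch the idea.

```go
package main

import "fmt"

// repairAction is a hypothetical enum describing what has to be recreated.
type repairAction int

const (
	noRepair repairAction = iota
	recreatePolicy
	recreateEdgeAndPolicy
)

// chooseRepair picks the action for a channel that is known to exist locally.
func chooseRepair(edgeExists, policyExists bool) repairAction {
	switch {
	case !edgeExists:
		// Without an edge there is nothing to attach a policy to, so both
		// have to be recreated.
		return recreateEdgeAndPolicy

	case !policyExists:
		return recreatePolicy

	default:
		return noRepair
	}
}

func main() {
	fmt.Println(chooseRepair(false, false) == recreateEdgeAndPolicy)
	fmt.Println(chooseRepair(true, false) == recreatePolicy)
	fmt.Println(chooseRepair(true, true) == noRepair)
}
```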

JssDWt avatar Oct 31 '24 12:10 JssDWt

Looking good. Waiting for your final test, and I hope we can get this merged then.

ziggie1984 avatar Oct 31 '24 13:10 ziggie1984

Will approve as soon as @JssDWt reports back that this actually fixes the problem of missing edges. It is only on hold because we cannot easily reproduce a missing edge info in a db in order to test it in an itest.

ziggie1984 avatar Nov 05 '24 07:11 ziggie1984

Will approve as soon as @JssDWt reports back that this actually fixes the problem of missing edges. It is only on hold because we cannot easily reproduce a missing edge info in a db in order to test it in an itest.

Ah, sorry. Wasn't sure if the latest push contained those (probably should've checked :sweat_smile:).

guggero avatar Nov 05 '24 08:11 guggero

@jssdwt, remember to re-request review from reviewers when ready

lightninglabs-deploy avatar Nov 12 '24 09:11 lightninglabs-deploy

Added comments that the ForAllOutgoingChannels function may invoke the callback with a nil channel policy. Also rebased.
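A callback written against that contract has to check the policy before dereferencing it. The sketch below uses a hypothetical `forAllOutgoingChannels` over an in-memory map with made-up `edgeInfo` and `edgePolicy` types; the real function's signature differs.

```go
package main

import (
	"fmt"
	"sort"
)

// edgeInfo and edgePolicy are hypothetical, stripped-down types.
type edgeInfo struct {
	ChannelID uint64
}

type edgePolicy struct {
	FeeRatePPM uint32
}

// forAllOutgoingChannels is a hypothetical iterator. Like the function
// discussed above, it may invoke the callback with a nil policy.
func forAllOutgoingChannels(chans map[uint64]*edgePolicy,
	cb func(*edgeInfo, *edgePolicy) error) error {

	for id, p := range chans {
		if err := cb(&edgeInfo{ChannelID: id}, p); err != nil {
			return err
		}
	}

	return nil
}

// missingPolicies returns the sorted IDs of channels that have no policy.
func missingPolicies(chans map[uint64]*edgePolicy) ([]uint64, error) {
	var missing []uint64
	err := forAllOutgoingChannels(chans,
		func(info *edgeInfo, p *edgePolicy) error {
			// Guard before touching any field of p.
			if p == nil {
				missing = append(missing, info.ChannelID)
			}
			return nil
		},
	)
	sort.Slice(missing, func(i, j int) bool {
		return missing[i] < missing[j]
	})

	return missing, err
}

func main() {
	chans := map[uint64]*edgePolicy{
		1: &edgePolicy{FeeRatePPM: 10},
		2: nil,
	}
	missing, err := missingPolicies(chans)
	fmt.Println(missing, err)
}
```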

JssDWt avatar Nov 13 '24 11:11 JssDWt

OK, let's not change too much right now. Let's do the following:

  1. We cannot just log a warning in the edge == nil case. Combined with createMissingEdge, we would in the worst case attempt to re-add an edge, so let's return an error here, because that should never happen.

  2. In updateEdge, use a non-pointer for the *models.ChannelEdgePolicy.

  3. Remove this from updateEdge:

     if edge == nil {
         _, edge, err = r.createEdge(channel, time.Now())
         if err != nil {
             return nil, err
         }
     }

We now check the policy so I don't think there is a need for the above.

Then we should be good to go

ziggie1984 avatar Nov 18 '24 15:11 ziggie1984

I undid the tryout commit.

  2. In updateEdge, use a non-pointer for the *models.ChannelEdgePolicy.

This we can't do, because that wouldn't update the edge outside the updateEdge function.

  1. We cannot just log a warning in the edge == nil case. Combined with createMissingEdge, we would in the worst case attempt to re-add an edge, so let's return an error here, because that should never happen.
  3. Remove this from updateEdge: if edge == nil { _, edge, err = r.createEdge(channel, time.Now()) if err != nil { return nil, err } }

So this basically means reverting commit 7d9d100e94cab9fc5adce139fe2a9a8b3dd77538, correct?

That would also address this comment: https://github.com/lightningnetwork/lnd/pull/8805#discussion_r1845605281

JssDWt avatar Nov 18 '24 17:11 JssDWt

This we can't do, because that wouldn't update the edge outside the updateEdge function.

I'm not sure I understand. I am talking about the func (r *Manager) updateEdge(tx kvdb.RTx, chanPoint wire.OutPoint, function, which is only used twice and also returns the specified edge, so we can use the return value, as we currently do anyway?
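The trade-off being debated is ordinary Go value semantics: a function that takes a struct by value changes only its own copy, so the caller must use the return value, while a pointer parameter is updated in place. A minimal sketch with hypothetical names (`edgePolicy`, `updateByValue`, `updateByPointer`), unrelated to the actual updateEdge signature:

```go
package main

import "fmt"

// edgePolicy is a hypothetical, stripped-down policy.
type edgePolicy struct {
	FeeRatePPM uint32
}

// updateByValue works on a copy. The caller only sees the change if it uses
// the returned value.
func updateByValue(p edgePolicy, fee uint32) edgePolicy {
	p.FeeRatePPM = fee
	return p
}

// updateByPointer changes the caller's policy in place.
func updateByPointer(p *edgePolicy, fee uint32) {
	p.FeeRatePPM = fee
}

func main() {
	original := edgePolicy{FeeRatePPM: 1}

	updated := updateByValue(original, 100)
	fmt.Println(original.FeeRatePPM, updated.FeeRatePPM)

	updateByPointer(&original, 200)
	fmt.Println(original.FeeRatePPM)
}
```

Either approach works as long as every call site consistently uses the returned value or the pointer.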

ziggie1984 avatar Nov 18 '24 20:11 ziggie1984