Derek Su

1074 comments by Derek Su

According to the analysis by @davidcheng0922, the issue is caused by a flood of messages from detached volumes with `unknown` robustness. It is resolved in https://github.com/longhorn/longhorn/issues/10302.

Fixed in https://github.com/longhorn/longhorn-manager/commit/fdb4ee8d1b54db3365fc4b99d1f4ccdb424f3a1d.

> Had a discussion with @mantissahz and @derekbit, here is the summary:
>
> 1. The chunk (checksum) table can be a general solution:
>
>    1. It can be...

@mantissahz Remember to update the misleading titles of the PR and commit.

```
func="controller.(*Controller).monitoring" file="control.go:1283"
2025-01-22T04:00:12.199095670Z [backup-vol-e-0] time="2025-01-22T04:00:12Z" level=error msg="Error reading from wire 10.42.5.29:10011" func="dataconn.(*Client).read.func1" file="client.go:287" error=EOF
2025-01-22T04:00:12.201235113Z [backup-vol-e-0] time="2025-01-22T04:00:12Z" level=error msg="Error reading from wire 10.42.5.29:10011" func="dataconn.(*Client).read.func1" file="client.go:287" error=EOF
2025-01-22T04:00:12.326574413Z [backup-vol-e-0] time="2025-01-22T04:00:12Z"...
```

Should it be salvaged? We need to check what the test case does first. Is it easy to reproduce?

> > Is it easy to reproduce?
>
> I tried to reproduce and rerun it 16 times this morning but couldn't reproduce the issue.
>
> Reproduced in [ci.longhorn.io/job/public/job/v1.8.x/job/v1.8.x-longhorn-upgrade-tests-sles-arm64/44](https://ci.longhorn.io/job/public/job/v1.8.x/job/v1.8.x-longhorn-upgrade-tests-sles-arm64/44/)...

@mantissahz Can you help investigate the issue? Thanks.

> Do we allow taking a snapshot when upgrading the engine?

I think yes. Any concerns?

It is a rare issue, so let's handle it in v1.9.0 instead.