ECONNRESET error from Kubernetes watch after a few minutes.

jimjaeger opened this issue 2 years ago · 7 comments

**Describe the bug** If I use the Kubernetes watch to listen for resource changes, I get an ECONNRESET error and the watch stops. Is there any chance that the watch could handle underlying connection errors and restart on its own?

**Client Version** 0.20.0

**To Reproduce** Steps to reproduce the behavior:

  1. Start a watch and wait longer than the `setTimeout` or `setKeepAlive` setting in the Watch config.

**Expected behavior** A watch runs without connection issues.

**Example Code**

```typescript
import { KubeConfig, V1Pod, Watch } from '@kubernetes/client-node';

// `Context` is an application type from the surrounding project (not shown here).
function waitForPodCompletion(log: Context['log'], k8sConfig: KubeConfig, podNamespace: string, resourceVersion?: string, jobName?: string): Promise<V1Pod> {
  let lastResourceVersion = resourceVersion;
  return new Promise<V1Pod>((resolve, reject) => {
    const watch = new Watch(k8sConfig);
    const queryParams: { labelSelector: string, resourceVersion?: string } = { labelSelector: `job-name=${jobName}` };
    if (resourceVersion) {
      queryParams.resourceVersion = resourceVersion;
    }

    watch.watch(`/api/v1/namespaces/${podNamespace}/pods`, queryParams, (eventType, pod: V1Pod) => {
      // Track the latest resourceVersion so a restarted watch can resume from it.
      lastResourceVersion = pod.metadata?.resourceVersion;
      if (eventType === 'ADDED' && pod.metadata?.name) {
        log.info(`Job pod ${pod.metadata.name} ${pod.metadata?.resourceVersion} added.`);
      }
      if (eventType === 'MODIFIED' && pod.metadata?.name) {
        log.info(`Job pod ${pod.metadata.name} status: ${pod.status?.phase}, resourceVersion: ${pod.metadata?.resourceVersion}.`);
        if (pod.status?.phase === 'Succeeded') {
          resolve(pod);
        } else if (pod.status?.phase === 'Failed') {
          reject(new Error(`Job failed. Pod ${pod.metadata.name} status: ${pod.status.phase} startTime: ${pod.status.startTime}.`));
        }
      }
    }, (error: { code: string, message: string, stack: string }) => {
      // Strange: this "done" callback is also invoked with null shortly after the ECONNRESET.
      if (error) {
        reject(error);
      }
    });
  }).catch((reason) => {
    // On a connection reset, restart the watch from the last seen resourceVersion.
    if (reason && reason.code === 'ECONNRESET') {
      log.info(`Restart Watch with ${lastResourceVersion}.`);
      return waitForPodCompletion(log, k8sConfig, podNamespace, lastResourceVersion, jobName);
    } else {
      throw reason;
    }
  });
}
```

**Environment (please complete the following information):**

  • OS: Windows
  • NodeJS Version: v20.10.0
  • Cloud runtime: Red Hat OpenShift

jimjaeger · Jan 02 '24 10:01

A watch is tied to a single TCP stream, so when that stream breaks you need to start a new watch (and you also need to re-list, in case you missed something).

The informer class encapsulates this logic and is probably what you are looking for: https://github.com/kubernetes-client/javascript/blob/master/src/informer.ts

(fwiw, wrt the "informer" name, I think it's confusing, but it got established as the standard name within the go client library, so we use it here too for consistency.)
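
For reference, here is a minimal sketch of the informer pattern, assuming the 0.20.x API of @kubernetes/client-node; the namespace and the 5-second restart delay are illustrative, not prescribed:

```typescript
import * as k8s from '@kubernetes/client-node';

const kc = new k8s.KubeConfig();
kc.loadFromDefault();

const coreApi = kc.makeApiClient(k8s.CoreV1Api);

// The informer re-lists on every start, so events missed while
// the connection was down are reconciled automatically.
const listFn = () => coreApi.listNamespacedPod('default');
const informer = k8s.makeInformer(kc, '/api/v1/namespaces/default/pods', listFn);

informer.on('add', (pod: k8s.V1Pod) => console.log(`added: ${pod.metadata?.name}`));
informer.on('update', (pod: k8s.V1Pod) => console.log(`updated: ${pod.metadata?.name}`));
informer.on('delete', (pod: k8s.V1Pod) => console.log(`deleted: ${pod.metadata?.name}`));
informer.on('error', (err: k8s.V1Pod) => {
  console.error(err);
  // When the underlying watch breaks, restart the informer after a delay.
  setTimeout(() => informer.start(), 5000);
});

informer.start();
```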

brendandburns · Jan 02 '24 17:01

Thanks for the information, but the informer class has the same problem: it also throws the underlying connection errors.

jimjaeger · Jan 02 '24 17:01

Same issue here with the informer. I tried the workaround of periodically restarting the informer, as suggested in https://github.com/kubernetes-client/javascript/issues/596. Nonetheless, that hit a new issue (see https://github.com/kubernetes-client/javascript/issues/1598).
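
A rough sketch of that periodic-restart workaround, assuming an `informer` constructed as in the earlier example (the interval length is arbitrary; `start()` and `stop()` both return Promises in 0.20.x):

```typescript
// Bounce the informer on a timer so a silently dead watch never
// outlives the interval. Stopping before starting avoids stacking
// duplicate watch connections.
const RESTART_INTERVAL_MS = 5 * 60 * 1000;

setInterval(async () => {
  await informer.stop();
  await informer.start();
}, RESTART_INTERVAL_MS);
```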

jobcespedes · Mar 06 '24 04:03

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot · Jun 04 '24 04:06

/remove-lifecycle stale

jimjaeger · Jun 04 '24 16:06

/lifecycle stale

k8s-triage-robot · Sep 02 '24 16:09

/remove-lifecycle stale

jimjaeger · Sep 06 '24 16:09

/lifecycle stale

k8s-triage-robot · Dec 05 '24 17:12

/lifecycle rotten

k8s-triage-robot · Jan 04 '25 18:01

/remove-lifecycle rotten

jimjaeger · Jan 04 '25 21:01

/lifecycle stale

k8s-triage-robot · Apr 04 '25 22:04

/lifecycle rotten

k8s-triage-robot · May 04 '25 22:05

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot · Jun 03 '25 23:06

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot · Jun 03 '25 23:06