
Where to host cs.k8s.io

Open ameukam opened this issue 4 years ago • 50 comments

https://cs.k8s.io is running on a bare-metal server provided by Equinix Metal (formerly Packet) under the CNCF budget, and has been operated until now by @dims.

The question was asked whether we should host CodeSearch on the aaa cluster.

Ref: https://kubernetes.slack.com/archives/CCK68P2Q2/p1615204807111900?thread_ts=1615189697.108500&cid=CCK68P2Q2

This issue tracks the discussion and consensus on this question.

ameukam avatar Mar 09 '21 14:03 ameukam

@dims where is the original source code for cs.k8s.io? :eyes:

nikhita avatar Mar 09 '21 14:03 nikhita

/wg k8s-infra

nikhita avatar Mar 09 '21 14:03 nikhita

/sig contributor-experience
/priority backlog

/assign @spiffxp

cc @mrbobbytables @alisondy @cblecker @munnerz

ameukam avatar Mar 09 '21 14:03 ameukam

> @dims where is the original source code for cs.k8s.io? :eyes:

@nikhita You can find the config here https://github.com/dims/k8s-code.appspot.com/

ameukam avatar Mar 09 '21 14:03 ameukam

What's the argument against hosting it on AAA?

BenTheElder avatar Mar 11 '21 06:03 BenTheElder

@BenTheElder nothing, other than someone has to do it :) also, i don't know how to wire up the ingress/dns stuff

i tried a long time ago :) https://github.com/kubernetes/k8s.io/pull/96

dims avatar Mar 11 '21 12:03 dims

> What's the argument against hosting it on AAA?

I would say the lack of an artifact destined for aaa (i.e., no up-to-date container image for Hound). We could host the image in k8s-staging-infra-tools.

ameukam avatar Mar 11 '21 13:03 ameukam

@ameukam should this issue be migrated to the k/k8s.io repo?

nikhita avatar Mar 24 '21 04:03 nikhita

@nikhita I'm not sure about the right place for this issue. I just wanted to put it on the radar of the SIG ContribEx TLs and chairs.

ameukam avatar Mar 24 '21 07:03 ameukam

it should be under k/k8s.io imho. I think we should host it on AAA fwiw.

BenTheElder avatar Apr 22 '21 20:04 BenTheElder

Moving to the k8s.io repo. Slack discussion: https://kubernetes.slack.com/archives/CCK68P2Q2/p1623300972130500

nikhita avatar Jun 10 '21 05:06 nikhita

/sig contributor-experience
/wg k8s-infra

spiffxp avatar Aug 09 '21 21:08 spiffxp

I took a stab at onboarding codesearch; @spiffxp, could I get your input? I want to make sure I didn't miss anything. The plan is to stage all the infra and get it deployed via Prow first; then we can follow up with another PR to cut over DNS when we're ready.

https://github.com/kubernetes/k8s.io/pull/2513 https://github.com/kubernetes/test-infra/pull/23201

I could also work on adding the Docker build logic afterwards, but I haven't worked in that repo yet, so I'll have to do some digging.

cc @dims

jimdaga avatar Aug 11 '21 03:08 jimdaga

/priority important-soon
/milestone v1.23

spiffxp avatar Aug 17 '21 21:08 spiffxp

What about using https://sourcegraph.com/kubernetes to minimize the maintenance burden here? This is something I suggested to @dims in the past, but I didn't have the bandwidth to pursue at the time.

justaugustus avatar Aug 18 '21 22:08 justaugustus

choices are:

  1. leave things where they are
  2. move to k8s wg infra
  3. redirect cs.k8s.io to sourcegraph

  • i have been taking care of 1 already for a while with minimal downtime, so i am ok with continuing to do so
  • if someone wants to do 2, i am happy to help, show how things are set up, and then we can shut down the equinix vm
  • i personally don't like option 3, since i love the hound UX, but if the consensus is we should go with 3, that is fine with me. i am happy to run a personal instance on a custom domain for myself (community is welcome to use it)

if i missed any other options, please feel free to chime in.

dims avatar Aug 19 '21 02:08 dims

/unassign

spiffxp avatar Sep 17 '21 21:09 spiffxp

FYI: If choice 2 is picked, my two PRs are pretty much ready to stage codesearch in the aaa cluster. There are a few small things that need to happen after the merge, but that's documented in my PRs.

  • https://github.com/kubernetes/k8s.io/pull/2513
  • https://github.com/kubernetes/test-infra/pull/23201

jimdaga avatar Sep 18 '21 15:09 jimdaga

thanks @jimdaga

+1 to give #2 a shot. will let Aaron and Arnaud review and merge all 3 PRs

dims avatar Sep 18 '21 21:09 dims

/milestone v1.24

ameukam avatar Dec 06 '21 17:12 ameukam

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
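The timeline above is effectively a small state machine. As an unofficial sketch (assuming continuous inactivity, so the thresholds accumulate to 90, 120, and 150 days), the label that applies after a given stretch of inactivity could be computed like this:

```python
def lifecycle_label(days_inactive: int) -> str:
    """Unofficial sketch of the triage bot's rules: 90d of inactivity
    applies stale, +30d applies rotten, +30d more closes the issue."""
    if days_inactive >= 150:      # 90 + 30 + 30
        return "closed"
    if days_inactive >= 120:      # 90 + 30
        return "lifecycle/rotten"
    if days_inactive >= 90:
        return "lifecycle/stale"
    return "active"

print(lifecycle_label(100))  # lifecycle/stale
```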

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 06 '22 18:03 k8s-triage-robot

/remove-lifecycle stale

ameukam avatar Mar 07 '22 06:03 ameukam

@ameukam what is remaining here?

nikhita avatar Mar 07 '22 07:03 nikhita

> @ameukam what is remaining here?

Deploy a canary instance from https://github.com/kubernetes/k8s.io/pull/2513. Once we have confidence in that instance, we can roll out a prod instance.

ameukam avatar Mar 07 '22 07:03 ameukam

/assign

@nikhita, I'm interested in helping with setting up a canary instance.

Priyankasaggu11929 avatar Mar 07 '22 07:03 Priyankasaggu11929

Post-merge checklist items from PR https://github.com/kubernetes/k8s.io/pull/2513 that still need work:

  • [ ] Publish cs-fetch-repo docker image (Open PR: https://github.com/kubernetes/test-infra/pull/25576)
  • [ ] Update deployment to use deployed docker image (using a temp image for now)
  • [X] After testing is completed, cutover DNS to new K8s hosted IP. (Done by https://github.com/kubernetes/k8s.io/pull/3416)

@pmgk07, once https://github.com/kubernetes/test-infra/pull/25576 is merged to add the cs-fetch-repos image under k8s infra, the next step is updating codesearch/deployment.yaml#L27 to use the newly hosted image.
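For context, the cut-over described above amounts to a one-line change in the deployment manifest. A hypothetical sketch, with the container name, registry path, and tag being illustrative rather than taken from the actual PR:

```yaml
# codesearch/deployment.yaml (excerpt; names and tag are illustrative)
spec:
  template:
    spec:
      initContainers:
        - name: cs-fetch-repos
          # before: a temporary, personally hosted image
          # image: jdagostino2/codesearch-fetch:0.1.7
          # after: the image published under k8s infra
          image: gcr.io/k8s-staging-infra-tools/cs-fetch-repos:v20220310
```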

Priyankasaggu11929 avatar Mar 10 '22 14:03 Priyankasaggu11929

> Update deployment to use deployed docker image (using a temp image for now)

@Priyankasaggu11929 Let's give @jimdaga the final call on this. There are possible changes that need to be added to the Docker image.

ameukam avatar Mar 10 '22 22:03 ameukam

Now that https://github.com/kubernetes/k8s.io/pull/3492 is merged, I see codesearch is deployed in the cluster!

However, it looks like the init containers are crashing:

```
kubectl get pods -n codesearch
NAME                         READY   STATUS                  RESTARTS   AGE
codesearch-5b975d449-lgm9b   0/1     Init:CrashLoopBackOff   8          19m
codesearch-5b975d449-zzqkl   0/1     Init:CrashLoopBackOff   8          19m
```

I'm out of the office right now, so I can't do a full debug. But it does seem like something needs fixing :( (I also don't have access to view pod logs, so not sure how to get that)

jimdaga avatar Mar 10 '22 22:03 jimdaga

> Let's give @jimdaga the final call on this. There are possible changes that need to be added to the Docker image.

+1. Yes 🙂

There's also an ingress-decoding error in the build logs of the post-k8sio-deploy-app-codesearch job.

I've raised a minor patch fix: https://github.com/kubernetes/k8s.io/pull/3502

Priyankasaggu11929 avatar Mar 11 '22 05:03 Priyankasaggu11929

> Now that #3492 is merged, I see codesearch is deployed in the cluster!
>
> However, it looks like the init containers are crashing:
>
> ```
> kubectl get pods -n codesearch
> NAME                         READY   STATUS                  RESTARTS   AGE
> codesearch-5b975d449-lgm9b   0/1     Init:CrashLoopBackOff   8          19m
> codesearch-5b975d449-zzqkl   0/1     Init:CrashLoopBackOff   8          19m
> ```
>
> I'm out of the office right now, so I can't do a full debug. But it does seem like something needs fixing :( (I also don't have access to view pod logs, so not sure how to get that)

You can use the GCP Logging console for the logs: https://console.cloud.google.com/logs/query;query=resource.type%3D%22k8s_container%22%0Aresource.labels.namespace_name%3D%22codesearch%22;cursorTimestamp=2022-03-11T06:20:53.646489047Z?project=kubernetes-public.

I did some quick research based on the logs, and it suggests the issue may be related to the architecture of the Docker image:

```
skopeo inspect docker://jdagostino2/codesearch-fetch:0.1.7 | jq .Architecture
"arm64"
```

The image seems to have been built on an arm64 machine, but the GKE nodes are amd64. We should try switching to gcr.io/k8s-staging-infra-tools and see what happens.
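Mismatches like this are easy to hit because different tools spell architectures differently: `uname -m` reports `aarch64` or `x86_64`, while image manifests (and skopeo, as above) use the OCI-style names `arm64` and `amd64`. A small hypothetical sketch of the normalize-and-compare check one might script before pushing an image; the function names are ours, not part of any tool:

```python
import platform
from typing import Optional

# Map each tool-specific spelling onto one canonical (OCI-style) name,
# so a build-host architecture can be compared with a manifest's value.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}

def normalize_arch(name: str) -> str:
    """Return the canonical OCI-style name for an architecture string."""
    return ARCH_ALIASES.get(name.lower(), name.lower())

def image_matches_host(image_arch: str, host_arch: Optional[str] = None) -> bool:
    """True if an image built for image_arch runs natively on host_arch
    (defaults to the current machine, via platform.machine())."""
    if host_arch is None:
        host_arch = platform.machine()
    return normalize_arch(image_arch) == normalize_arch(host_arch)

# The failing case from this issue: an arm64 image on amd64 GKE nodes.
print(image_matches_host("arm64", host_arch="x86_64"))  # False
```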

ameukam avatar Mar 11 '22 06:03 ameukam