[INFRA-2754] Realign repo.jenkins-ci.org mission

Open jenkins-infra-bot opened this issue 5 years ago • 12 comments
More context can be found here

Folks at JFrog are investigating how to reduce repo.jenkins-ci.org costs.

They are still interested in sponsoring us but want to be sure that the repository is used only for Jenkins stuff and not as a proxy cache for other purposes.


Originally reported by olblak, imported from: Realign repo.jenkins-ci.org mission
  • assignee: danielbeck
  • status: In Progress
  • priority: Minor
  • resolution: Unresolved
  • imported: 2022/01/10

jenkins-infra-bot avatar Oct 07 '20 07:10 jenkins-infra-bot

danielbeck:

Regarding bandwidth usage, INFRA-2772 was a pretty straightforward discovery.

Storage is more difficult. Right now my plan for the first step is to remove any artifacts that haven't been accessed in a while, exist in the upstream repo, and have the same checksum upstream, starting with the largest ones.
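
The selection rule described above (not accessed recently, present upstream, identical checksum upstream, largest first) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `CachedArtifact` record, its field names, and the one-year idle threshold are hypothetical and do not come from Artifactory or from the actual cleanup tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CachedArtifact:
    # Hypothetical record; the field names are assumptions for illustration only.
    path: str
    size_bytes: int
    sha1: str
    last_accessed: datetime
    upstream_sha1: Optional[str]  # None when the artifact does not exist upstream


def removal_candidates(
    artifacts: List[CachedArtifact],
    now: datetime,
    max_idle: timedelta = timedelta(days=365),  # assumed threshold for "a while"
) -> List[CachedArtifact]:
    """Return idle artifacts that exist upstream with the same checksum, largest first."""
    stale = [
        a
        for a in artifacts
        if now - a.last_accessed > max_idle      # not accessed in a while
        and a.upstream_sha1 is not None          # exists in the upstream repo
        and a.upstream_sha1 == a.sha1            # same checksum upstream
    ]
    # Start with the largest artifacts, since they free the most storage.
    return sorted(stale, key=lambda a: a.size_bytes, reverse=True)
```

Requiring the checksum match means only artifacts that can be fetched again unchanged from upstream are ever proposed for removal.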

jenkins-infra-bot avatar Nov 13 '20 15:11 jenkins-infra-bot

[Originally related to: INFRA-2772]

jenkins-infra-bot avatar Jan 11 '22 07:01 jenkins-infra-bot

[Originally related to: INFRA-2812]

jenkins-infra-bot avatar Jan 11 '22 07:01 jenkins-infra-bot

Issues in this epic:

  • https://github.com/jenkins-infra/helpdesk/issues/2340
  • https://github.com/jenkins-infra/helpdesk/issues/2377
  • https://github.com/jenkins-infra/helpdesk/issues/2385
  • https://github.com/jenkins-infra/helpdesk/issues/2386

lemeurherve avatar Mar 07 '22 09:03 lemeurherve

Updating this issue after we had recent outages on JFrog (#2864 #2949 and eventually #2904).

Problem: between 20% and 30% of requests to repo.jenkins-ci.org return HTTP 404, which is causing bandwidth and performance issues as per JFrog's message in https://groups.google.com/g/jenkins-infra/c/ZdyYIhlNJQY/m/QCdT5OZIAAAJ .

We might want to check #2385 as a first step. Ping @daniel-beck: we need your help, as we don't know how to identify these "rogue" requests (or whether we can without the help of JFrog).

dduportal avatar Apr 29 '22 16:04 dduportal

These are unrelated.

We proxy repo1, and #2385 is about artifacts proxied from there but not used for a Jenkins purpose. Some folks probably just point their Maven at our Artifactory and do some ML bullshit, for which artifacts often exceed 1GB.

404 is when they set up Maven to query us for artifacts and they don't exist in our repos, or Maven repo1, typically internal private-source stuff. There's tons of log spam related to this, so we know the paths, but since access is anonymous, we don't know who does that, so we cannot tell them to knock it off (we could infer some by artifact path, but 🤷 ). Since we do not have a reverse proxy, we also cannot patch the responses and serve them "please go away" responses.
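
Since the paths are known from the logs, one way to get a rough picture of who is asking for what is to group the 404s by the leading segments of the artifact path. The sketch below assumes the log has already been reduced to `(status, path)` pairs; that input shape, the function names, and the `depth` parameter are all hypothetical and not an Artifactory log format or API.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Request = Tuple[int, str]  # (HTTP status, requested path); an assumed, simplified shape


def not_found_share(requests: Iterable[Request]) -> float:
    """Fraction of requests answered with HTTP 404 (0.0 for an empty input)."""
    total = 0
    misses = 0
    for status, _path in requests:
        total += 1
        if status == 404:
            misses += 1
    return misses / total if total else 0.0


def top_missing_prefixes(
    requests: Iterable[Request], depth: int = 2, limit: int = 10
) -> List[Tuple[str, int]]:
    """Count 404s per path prefix, using the first `depth` path segments as the key."""
    counts: Counter = Counter()
    for status, path in requests:
        if status != 404:
            continue
        segments = [s for s in path.split("/") if s]
        counts["/".join(segments[:depth])] += 1
    return counts.most_common(limit)
```

With Maven-style paths, a prefix such as `com/acme` roughly corresponds to a groupId, which is the "infer some by artifact path" idea; it still says nothing about which client sent the request.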

daniel-beck avatar Apr 29 '22 18:04 daniel-beck

Knowledge sharing from #3101 :

  • https://github.com/jenkins-infra/helpdesk/issues/3101#issuecomment-1220826616
  • https://github.com/jenkins-infra/helpdesk/issues/3101#issuecomment-1229883900

dduportal avatar Sep 08 '22 07:09 dduportal

Summary of the recent meeting with JFrog:

  • We'll be able to get more metrics (access, size, bandwidth, etc.) once repo.jenkins-ci.org is migrated to their new platform. They'll contact us with a proposed timeline (expecting a ~1 hour outage).
  • repo.jenkins-ci.org is consuming 40 to 50 TB of data per month. We have to stay under the 10 TB per month limit. That will be a topic for the upcoming 2022 Contributor Summit and should result in a JEP: https://github.com/jenkinsci/jep/pull/393
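
To put the figures above in perspective, here is a small worked calculation of how much the monthly transfer would have to shrink to fit under the limit. The helper name `required_reduction` is made up for this illustration; the inputs are the 40, 50, and 10 figures quoted in the summary.

```python
def required_reduction(current: float, limit: float) -> float:
    """Fraction by which `current` usage must drop to reach `limit` (0.0 if already under)."""
    return max(0.0, 1 - limit / current)


# Monthly usage of 40 to 50 against a limit of 10, in the same unit.
low = required_reduction(40, 10)
high = required_reduction(50, 10)
print(f"Needed reduction: {low:.0%} to {high:.0%}")
```

In other words, traffic would need to drop by roughly three quarters to four fifths.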

dduportal avatar Sep 19 '22 16:09 dduportal

We've filed a public abuse report for the IP address 39.107.36.205. Attempts to stop that abuse through private channels have failed. We'll continue reporting that abuse to the public location until the abuse stops or we find ways to block the IP address.

MarkEWaite avatar Mar 21 '23 14:03 MarkEWaite

can we just block it at artifactory / get jfrog to block it?

timja avatar Mar 21 '23 15:03 timja

> can we just block it at artifactory / get jfrog to block it?

Nope, we cannot on our own as it is a JFrog-managed platform. JFrog is studying this.

The real question is: will it be sufficient? (I mean: the abuser(s) could switch IPs and start again.)

dduportal avatar Mar 21 '23 16:03 dduportal

JFrog is investigating and hopes to be able to block the IP address. We'll certainly keep people informed as we learn more from them.

MarkEWaite avatar Mar 21 '23 18:03 MarkEWaite

As far as I can tell, this issue is resolved. The Jenkins artifact repository no longer caches Maven Central. The artifact caching proxy provides artifact caching for agents connected to ci.jenkins.io.

MarkEWaite avatar Aug 19 '24 03:08 MarkEWaite