
Undocumented "lazy" option + "lazy" not respected for "get_job"

Open ringerc opened this issue 6 years ago • 4 comments

  • Bug Report
Jenkinsapi VERSION

0.3.9

Jenkins VERSION

Any

SUMMARY

Instantiating jenkinsapi.jenkins.Jenkins is extremely slow for Jenkins instances with nontrivial numbers of jobs and/or high network latency. Enabling debug logging to see the urllib3 output shows why:

import logging
logging.basicConfig(level=logging.DEBUG)
DEBUG:remote.py:Connecting to "https://xxx/jenkins/" as user "xxx@yyy"
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx:443
DEBUG:urllib3.connectionpool:https://ci.qa.2ndquadrant.com:443 "GET /jenkins/api/python?tree=jobs%5Bname%2Ccolor%2Curl%5D HTTP/1.1" 200 823
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx:443
DEBUG:urllib3.connectionpool:https://xxx:443 "GET /jenkins/job/zzz/api/python?tree=jobs%5Bname%2Ccolor%5D HTTP/1.1" 200 127
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): xxx:443
...

It's fetching information on every single job, synchronously, on instantiation.

I argue that's a defect. It simply shouldn't do that. If instantiation should do any network activity at all (which is itself dubious) it should at most validate that it can talk to Jenkins.

With that said, the undocumented lazy option already exists to fix this:

jenkins = Jenkins(my_url, username=my_user, password=my_api_key, lazy=True)
print(jenkins.version())

See related issues #688, #525

EXPECTED RESULTS

No network round-trips until the version() call, or at most a simple handshake at instantiation.

A single API call for version().
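To make the expectation concrete, here is a minimal sketch of the behaviour I have in mind. It is a toy stand-in, not jenkinsapi's actual implementation: LazyClient, fake_fetch and the "version" path are all hypothetical names, and the only thing borrowed from the real log above is the tree query string.

```python
class LazyClient:
    """Toy stand-in for an API client; NOT jenkinsapi's real code."""

    def __init__(self, fetch, lazy=True):
        # `fetch` is a hypothetical callable(path) -> dict doing one HTTP GET.
        self._fetch = fetch
        self._jobs = None
        if not lazy:
            self._load_jobs()  # eager mode: pay the job-listing cost up front

    def _load_jobs(self):
        # Same tree query as seen in the debug log above.
        self._jobs = self._fetch("api/python?tree=jobs[name,color,url]")["jobs"]

    def version(self):
        # One request, independent of how many jobs exist.
        # The "version" path is an assumption made for this sketch.
        return self._fetch("version")["version"]

    @property
    def jobs(self):
        # The job list is fetched on first access and cached afterwards.
        if self._jobs is None:
            self._load_jobs()
        return self._jobs


calls = []  # records every simulated request


def fake_fetch(path):
    """Hypothetical fetcher that records requests instead of using the network."""
    calls.append(path)
    if path == "version":
        return {"version": "2.0"}
    return {"jobs": [{"name": "baz"}]}


client = LazyClient(fake_fetch, lazy=True)
print(len(calls))        # no requests yet
print(client.version())  # exactly one request
print(len(calls))
```

With lazy=True, constructing the object records no requests, and version() records exactly one. The job listing is only fetched when .jobs is first read.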

ACTUAL RESULTS

jenkinsapi walked the whole job tree and inspected every job when the Jenkins object was instantiated. This took 125 seconds, vs the 3 seconds that lazy instantiation + a version call took.

ringerc avatar Jun 07 '19 05:06 ringerc

Hm, seems a worse issue than I thought. If I attempt to fetch a job by name, the API also enumerates all jobs rather than requesting just the desired job by subpath.

jenkins = Jenkins(my_url, username=my_user, password=my_api_key, lazy=True)
print(jenkins.version())
# ok so far but now:
job = jenkins.get_job('somejob')
# ... that took 3 minutes as it walked all the jobs in all the folders, slooowly.

Expected behaviour: given a JENKINS_URL of https://foo.bar/jenkins, get_job('baz') should fetch https://foo.bar/jenkins/job/baz/api/python?....

Actual behaviour: deep tree walk of all jobs.

get_job_by_url seems to require that the caller supply the job name, and doesn't build a URL based on the base URL, so you can't just write:

# this doesn't work
jenkins = Jenkins(my_url, username=my_user, password=my_api_key, lazy=True)
job = jenkins.get_job_by_url('myfolder/myjob', 'jobname')

or when retaining jenkins folder URL structure:

# this doesn't work
job = jenkins.get_job_by_url('job/myfolder/job/myjob', 'myfolder/jobname')

Instead the caller must build a URL?!

There's no get_jobs_url or get_job_url. So you seem to have to do something very counterintuitive (I must be wrong?) like, with crude/possibly incorrect URL joining, this hack:

jenkins = Jenkins(my_url, username=my_user, password=my_api_key, lazy=True)
job_url = jenkins.base_server_url() + "/job/bdr3/job/bdr3"
job = jenkins.get_job_by_url(job_url, 'bdr3/bdr3')
print(job.get_last_build_or_none())
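A slightly less crude version of the URL joining above could look like the sketch below. job_url is a hypothetical helper, not part of jenkinsapi. It assumes the job's full name is a slash-separated folder path such as 'bdr3/bdr3', and that each path segment maps to a "/job/<segment>" component, as in the URLs shown in this issue.

```python
from urllib.parse import quote


def job_url(base_url, full_name):
    """Build a job URL from a slash-separated folder path (hypothetical helper).

    Each non-empty segment of `full_name` becomes a '/job/<segment>' component.
    """
    parts = [p for p in full_name.split("/") if p]
    if not parts:
        raise ValueError("empty job name")
    # Percent-encode each segment so names containing spaces etc. survive.
    path = "".join("/job/" + quote(p, safe="") for p in parts)
    # Strip a trailing slash from the base to avoid a doubled '//'.
    return base_url.rstrip("/") + path


print(job_url("https://foo.bar/jenkins/", "bdr3/bdr3"))
```

The result could then be passed to the calls used in the hack above, e.g. jenkins.get_job_by_url(job_url(jenkins.base_server_url(), 'bdr3/bdr3'), 'bdr3/bdr3'). I have not verified that this is how get_job_by_url is intended to be used.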

ringerc avatar Jun 07 '19 05:06 ringerc

If only the creators of the Folders plugin had made folders actual folders and not jobs within jobs... But they did not, and the only way to find out whether something is a folder or a job is to traverse the whole hierarchy and check whether it has a "color" (the status of a job is called "color", for historical reasons).

Also, jobs in different folders can have the same name, so a direct link to https://foo.bar/jenkins/job/baz/api/python? will return the first job named "baz", no matter how many jobs share that name.
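As a rough illustration of the folder-versus-job check and the name clash, here is a simplified sketch. It is not the library's code: walk_jobs is a hypothetical function, and the nested dict is an assumed shape modelled on the tree=jobs[name,color,url] query. In reality each folder's children need a separate request, which is where the slowness comes from.

```python
def walk_jobs(node, prefix=""):
    """Yield (full_name, url) for every real job under `node` (hypothetical)."""
    for child in node.get("jobs", []):
        full_name = prefix + child["name"]
        if "color" in child:
            # Has a "color" (status): treat it as a real job.
            yield full_name, child.get("url")
        else:
            # No "color": treat it as a folder and descend into it.
            yield from walk_jobs(child, full_name + "/")


# Assumed example data: two jobs named "baz", one at the top level
# and one inside the folder "zzz".
tree = {
    "jobs": [
        {"name": "baz", "color": "blue", "url": "https://foo.bar/jenkins/job/baz/"},
        {
            "name": "zzz",
            "jobs": [
                {
                    "name": "baz",
                    "color": "red",
                    "url": "https://foo.bar/jenkins/job/zzz/job/baz/",
                },
            ],
        },
    ]
}

for name, url in walk_jobs(tree):
    print(name, url)
```

Only the folder-qualified names ("baz" and "zzz/baz") tell the two jobs apart. The short name alone does not.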

The easiest approach was to let the caller build a URL and get the job; this eliminates traversals through the folder mess and returns the required job as fast as possible.

I am happy to accept a pull request that propagates laziness to get_job.

lechat avatar Jun 07 '19 15:06 lechat

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Oct 18 '19 17:10 stale[bot]

I second the suggestion to document the lazy parameter: https://jenkinsapi.readthedocs.io/en/latest/search.html?q=lazy

Lucas-C avatar Oct 21 '19 18:10 Lucas-C