
change buildbot directory structure

Open lynxis opened this issue 9 years ago • 13 comments

stable/, unstable/ (only weekly), snapshots/ (contains the PR builds and separate branch builds).

What do you think?

lynxis avatar Aug 09 '16 01:08 lynxis

I'd rather use "release" than "stable": 1) That's what it is, 2) users won't have to guess where releases are (because who knows - other projects bury their releases in snapshots).

Similarly, for the others I'd rather say "development". I'd be happy with "development/branch-branchname/date-buildnum-hash/".

We don't need weekly builds of the master branch as master does not change without commits.

sarumpaet avatar Aug 11 '16 17:08 sarumpaet

+1 for sarumpaet's flat structure. But I'm not sure we need to keep such a detailed structure in the "development" tree. Maybe only "development/buildnum-branchname-hash"; this would even ease the automatic cleanup of old builds.

SvenRoederer avatar Aug 15 '16 09:08 SvenRoederer

Cleanup can be done with find -ctime or some ls -t|tail magic; we don't need the buildnum for that. Actually I tend to leave that out completely, as it's a buildbot artifact that doesn't contribute anything that the date doesn't already tell, and at worst it's confusing (people tend to assume "higher buildnum = more features"). This is also why I'd prefer directories for branches, as mixing builds from all branches in one directory will confuse people and make giving out links ("just get the most recent build from http://buildbot.../builds/development/branch-master/") way more difficult.
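A minimal sketch of the age-based cleanup mentioned above. The function name remove_old_builds, the two-level layout BASE_DIR/<branch>/<build>/ and the 14-day example are assumptions for illustration, not something the thread specifies. It also uses -mtime rather than the -ctime suggested above, only because modification times are easier to set when trying it out; -mindepth and -maxdepth are GNU/BSD find extensions.

```shell
#!/bin/sh
# Hypothetical helper: remove build directories older than MAX_DAYS days.
# Assumed layout: BASE_DIR/<branch>/<build>/ (not confirmed by the thread).
remove_old_builds() {
  base_dir=$1
  max_days=$2
  # -mindepth/-maxdepth restrict matches to directories exactly two levels
  # below base_dir, so branch directories themselves are never removed.
  # -mtime +N matches entries last modified more than N days ago.
  find "$base_dir" -mindepth 2 -maxdepth 2 -type d -mtime +"$max_days" \
    -exec rm -r {} +
}
```

A cron job could then call, for example, remove_old_builds /path/to/development 14. No build number is needed, since the directory timestamp alone decides what goes.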

sarumpaet avatar Aug 23 '16 15:08 sarumpaet

release
release/0.1.2
release/0.1.2/imagebuilder/ar71xx
release/0.1.2/packages/ar71xx
release/0.1.2/firmware/<router model>/default

release/latest -> 0.1.2

development/branch-name/ar71xx/
development/branch-name/datetime-hash/imagebuilder/ar71xx
                                     /packages/ar71xx
                                     /firmware/tl-wdr3500/default
                                     /firmware/tl-wdr3500/backbone
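As a rough sketch of how the release part of this layout and the "latest" link could be produced: the function name publish_release and its arguments are hypothetical, and the real upload logic lives in the buildbot configuration, which this does not attempt to reproduce.

```shell
#!/bin/sh
# Hypothetical helper: create a release tree and repoint "latest" at it.
publish_release() {
  root=$1      # directory that contains release/ (assumed web root)
  version=$2   # e.g. 0.1.2
  target=$3    # e.g. ar71xx
  mkdir -p "$root/release/$version/imagebuilder/$target" \
           "$root/release/$version/packages/$target" \
           "$root/release/$version/firmware"
  # A relative link target keeps the tree relocatable. -n replaces an existing
  # "latest" link instead of creating a new link inside the old release.
  ln -sfn "$version" "$root/release/latest"
}
```

Calling publish_release /path/to/webroot 0.1.2 ar71xx gives release/latest -> 0.1.2; calling it again with a newer version moves the link. The same pattern would apply to a per-branch "latest" under development/.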

booo avatar Nov 02 '16 22:11 booo

Maybe we should "expose" the current master build a bit more, but I'm not really sure if this is a good idea.

SvenRoederer avatar Nov 14 '16 00:11 SvenRoederer

@SvenRoederer With that scheme, the current development master will always be /development/branch-master/latest. We can symlink that as /development/latest-mostly-stable or something. ;)

As a note, we'll have to patch upload_directory and repo_url() in https://github.com/freifunk-berlin/buildbot/blob/master/masters/master/master.cfg as well as the firmware Makefile for all this.

sarumpaet avatar Nov 15 '16 17:11 sarumpaet

I hacked https://util.berlin.freifunk.net/hardware?name=tl-wr841-v9 which just looks at the files available in the buildbot and creates a comprehensible list (it doesn't use any internal mapping). We can also link to it easily in the mails that config.berlin sends (instead of linking to .bin files directly) and from the download page.

sarumpaet avatar Nov 22 '16 16:11 sarumpaet

There are currently a lot of builds going on. With the new 1.1.0-rc1, the images disappeared after about 48 hours. I feel it makes sense to keep these in a similar way to how the releases are kept.

I have been working on an extension to buildbot which creates yet another directory for release-candidates. Please take a look and give me some feedback.

https://github.com/freifunk-berlin/buildbot/tree/release-candidate

There would need to be similar changes to util.berlin.freifunk.net. I will try to work on it and post here if I make any progress.

pmelange avatar Jun 16 '19 14:06 pmelange

@sarumpaet I have been looking at https://github.com/freifunk-berlin/util.berlin.freifunk.net/blob/master/www/hardware.php and realized how poor I am at PHP. I feel it would take me more than a day to figure out how to add the release-candidate directory.

Would it be a lot of work for you to add this? I imagined that release candidates would only be shown when $complete=true.

pmelange avatar Jun 16 '19 15:06 pmelange

There are currently a lot of builds going on. With the new 1.1.0-rc1, the images disappeared after about 48 hours. I feel it makes sense to keep these in a similar way to how the releases are kept.

I don't understand. The build images get deleted because disk space is running out. Keeping RCs will occupy even more disk space, thus it's no solution to the broad problem at all.

buildbot master has to be upgraded anyways; its root/OS partition is running out of space, too. We might as well add more space for builds then.

sarumpaet avatar Jun 18 '19 12:06 sarumpaet

The Buildbot may be running out of space and needs to be upgraded, and that is something we should also deal with.

But the problem that I am describing is that only the 10 most recent "unstable" builds are kept. If, for example, someone were to have a failed build, and press rebuild 10 times, then all "unstable" builds would be removed from the system.

For example, in https://buildbot.berlin.freifunk.net/buildbot/unstable/x86-generic/ build number 1335 failed, and now there are only 9 builds remaining. Now, go to https://buildbot.berlin.freifunk.net/builders/x86-generic/builds/1335 and click on rebuild. The build will fail again, and there will be only 8 remaining.

I don't think it is too much to want a release candidate to be around for over 48 hours.

pmelange avatar Jun 18 '19 12:06 pmelange

But the problem that I am describing is that only the 10 most recent "unstable" builds are kept. If, for example, someone were to have a failed build, and press rebuild 10 times, then all "unstable" builds would be removed from the system.

Wasn't it once the case that the most recent directories were kept, regardless of whether the build was successful or failed? If it's as @pmelange describes, it seems to be a failure of the buildbot configuration.

SvenRoederer avatar Jun 18 '19 18:06 SvenRoederer

@pmelange That's not the case? I.e., I hit the rebuild button for 1335, which failed again (1343), but we still have 9 (complete) builds available, just as before I hit the button. Try it yourself.

The delete logic is this: https://github.com/freifunk-berlin/puppet/blob/70424dbcc44f9afb45e02851aa63ab667211bb66/puppet/manifests/site.pp#L220 https://github.com/freifunk-berlin/puppet-files/blob/d07aa6402b657b5b8911684315feeff95321b272/files/buildbot-remove-old-builds.sh#L1 (Why are these two repos?) ...and actually, on the server (which is out of sync with Puppet), it's

BASE_DIR="/usr/local/src/www/htdocs/buildbot/unstable"
TARGETS=$(ls ${BASE_DIR})
for target in ${TARGETS}; do
  for build in $(ls -t "${BASE_DIR}/${target}" | tail -n +10); do
    rm -r "${BASE_DIR}/${target}/${build}"
  done
done

...so the last 10 non-failed builds should be kept. I don't know why we only have 9 at the moment; possibly the cronjob got confused by temp files or something?
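One detail in the snippet above is worth checking: tail -n +K starts its output at line K, so tail -n +10 skips only the first nine entries and everything from the tenth onward is deleted. That would be consistent with nine builds remaining, although the thread does not confirm that this is the cause. A small sketch to illustrate, where entries_to_delete is a hypothetical helper and seq stands in for the output of ls -t:

```shell
#!/bin/sh
# Print the entries that a "keep the newest KEEP" rule would delete.
# Reads names (newest first) on stdin, e.g. from `ls -t`.
entries_to_delete() {
  keep=$1
  # tail -n +K starts output AT line K, so skipping KEEP lines needs K = KEEP + 1.
  tail -n +"$((keep + 1))"
}

# Twelve fake build names, newest first.
seq 12 | tail -n +10 | wc -l            # as on the server: 3 deleted, 9 kept
seq 12 | entries_to_delete 10 | wc -l   # 2 deleted, 10 kept
```

Under that reading, replacing tail -n +10 with tail -n +11 in the loop would keep ten builds per target.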

But yes, the logic should be better, e.g., delete builds only if space runs low. Possibly if a completed build is uploading? That's probably complicated to implement though. Anyways, resize partitions and up the number of builds first, as keeping RCs right now will not solve anything really (I bet if we implement keeping RCs before resizing partitions, the buildbot will break quite soon).
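A minimal sketch of the "delete only if space runs low" idea, under stated assumptions: the function names usage_percent and prune_when_low, the threshold, and the BASE_DIR/<target>/<build>/ layout are illustrative, and build names are assumed to contain no whitespace (the existing script makes the same assumption by parsing ls). It does not address builds that are still uploading.

```shell
#!/bin/sh
# Hypothetical helper: print the used-space percentage of the filesystem holding $1.
usage_percent() {
  # df -P prints one POSIX-format line per filesystem; column 5 is the
  # capacity in percent (e.g. 87%), from which the % sign is stripped.
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# For each BASE_DIR/<target>/, delete the oldest build while disk usage is
# above LIMIT percent, but never go below KEEP builds per target.
prune_when_low() {
  base_dir=$1; limit=$2; keep=$3
  for target_dir in "$base_dir"/*/; do
    [ -d "$target_dir" ] || continue
    while [ "$(usage_percent "$base_dir")" -gt "$limit" ]; do
      count=$(ls "$target_dir" | wc -l)
      [ "$count" -gt "$keep" ] || break
      # ls -t lists newest first, so the last line is the oldest build.
      oldest=$(ls -t "$target_dir" | tail -n 1)
      rm -r "${target_dir}${oldest}" || break
    done
  done
}
```

With this shape, prune_when_low "$BASE_DIR" 90 10 would leave everything in place while the partition is below 90% and only then start trimming each target down to ten builds.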

We need to update quite a bit of infrastructure. Buildbot is still on Ubuntu 16, config.berlin is still on Ubuntu 14...

sarumpaet avatar Jun 22 '19 23:06 sarumpaet