metacello icon indicating copy to clipboard operation
metacello copied to clipboard

[squeak] [rfc] New Bootstrapping Process

Open tom95 opened this issue 3 years ago • 12 comments

This is a request-for-comments and mostly concerns Squeak (but could also be used for other systems that do not ship Metacello by default). I will also be sharing this issue on the squeak-dev list.

Problem

Currently, Metacello bootstraps in Squeak using an older copy of itself that is hosted in parts on gemsource, smalltalkhub and squeaksource. As smalltalkhub is reaching its end of life and has been down repeatedly in the last few days, we investigated a different bootstrapping process.

Suggestion

We propose to bundle Metacello with all its dependencies in one .sar file and host it on e.g. files.squeak.org. This way, we both speed up the installation procedure and reduce points of failure. The .sar can be generated using the script below [1]. Duplicating Metacello and all its dependencies on e.g. squeaksource would also be an option. However, a single bootstrap-metacello.sar makes it more explicit that this is just a cache of this repo.

The Installer ensureRecentMetacello in Squeak could then be simplified to the version seen below [2].

Further Steps

I personally would want to go one step further than the above proposal: instead of keeping an irregularly updated version of Metacello in the bootstrap-metacello.sar, we could update the .sar file using Metacello's CI. This has the advantage that the extra step of downloading Metacello from this repo becomes redundant, as the most recent version will already have been installed via the .sar file.

Automatic updates to the .sar could either happen whenever the tests for the master branch pass, or through an explicit push to a release branch. The current method of installation does neither, it just pulls master independent of whether the tests pass.

Does this suggestion seem feasible? Do you see any blockers or potential to further improve this suggestion?


[1]: When run in an image that currently has metacello installed, produce a metacello.sar that can be dropped onto a clean installation of the same Squeak version.

| preamble zip |
zip := ZipArchive new.
preamble := String streamContents: [:preambleStream |
    preambleStream
        nextPutAll: '| loader |
((Smalltalk globals includesKey: #MetacelloStub) and: [(Smalltalk at: #Metacello ifAbsent: [nil]) = (Smalltalk at: #MetacelloStub)]) ifTrue: [
Smalltalk globals removeKey: #Metacello].
loader := MCVersionLoader new.';
        cr.
    (((BaselineOfMetacello project version: 'baseline') allPackagesForSpecNamed: #default) collect: #name) do: [:name | | stream version |
        stream := RWBinaryOrTextStream on: (String new: 10000).
        version := MCVersion
            package: (MCPackage named: name)
            info: (MCPackage named: name) workingCopy ancestry ancestors first.
        version fileOutOn: stream.
        (zip addString: stream contents as: name, '.mcz') desiredCompressionLevel: 0.
        preambleStream
            nextPutAll: 'loader addVersion: (MCMczReader versionFromStream: (self memberNamed: ''';
            nextPutAll: name;
            nextPutAll: '.mcz'') contentStream).';
            cr].
    preambleStream nextPutAll: 'loader load.'].
zip addString: preamble as: 'install/preamble'.
zip writeToFileNamed: 'metacello.sar'

[2]: Script that pulls the .sar file from a server and installs it, then updates Metacello:

SARInstaller new fileInFrom: (WebClient httpGet: 'http://localhost:8000/metacello.sar') content asByteArray readStream.

(Smalltalk at: #Metacello) new
  baseline: 'Metacello';
  repository: 'github://Metacello/metacello:master/repository';
  get; load.

tom95 avatar Aug 04 '20 17:08 tom95

@tom95 That sounds like an excellent plan ... the original Metacello bootstrap was predicated on the fact, that Metacello was not "part of the target smalltalk image" and I made the decision that Metacello itself should be bootstrapped "on demand", i.e., there was code in each ConfigurationOf that would cause Metacello to be bootstrapped so that it could be loaded without the developer having to "install something first" ... at this point in time it makes perfect sense for a developer to make a conscious decision to use Metacello and having a clean way of installing Metacello into Squeak is the right thing to do ... and making sure that as part of that install process the latest version of Metacello is loaded is also the right thing to do ...

dalehenrich avatar Aug 04 '20 20:08 dalehenrich

As far as I understand this problem, the proposed solution sounds great to me. +1 for creating the SAR via Metacello, we do not need to host any SAR file on the Squeak servers, we can just manually include it into the next release on this repository.

[2]: Script that pulls the .sar file from a server and installs it, then updates Metacello:

SARInstaller new fileInFrom: (WebClient httpGet: 'http://localhost:8000/metacello.sar') content asByteArray readStream.

(Smalltalk at: #Metacello) new
  baseline: 'Metacello';
  repository: 'github://Metacello/metacello:master/repository';
  get; load.

Is this already meant to be a final version? If yes, doesn't this version of #ensureRecentMetacello skip the installation of help contents?

LinqLover avatar Sep 30 '20 17:09 LinqLover

RE hosting on files.squeak.org: I think it'd make more sense to keep deployment artifacts close to a project, for example as part of GitHub releases. GitHub is behind a CDN, so potentially better connected than our file server. Plus we don't have to distribute any additional deployment keys for our file server (it works fine for trunk and the website, but it's no fun to set up). Let me know if you need any help.

fniephaus avatar Oct 08 '20 07:10 fniephaus

This is a one-off thing, I'd say. The bootstrap changes rarely; I vote agains gh-hosted SARs and for manually putting them on our infra.

krono avatar Oct 08 '20 07:10 krono

If the bootstrap changes rarely, we could also manually create a release and upload the SAR there. This would ensure Metacello devs have full control, and don't have to contact the webteam (I know the two groups overlap, but you know what I mean).

fniephaus avatar Oct 08 '20 08:10 fniephaus

Does the bootstrap change rarely? I think it would be interesting to build the latest SAR for Metacello on this repository on a regular basis via CI. This would speed up the initial installation of Metacello in your image because no updates need to be fetched just after installing the archive file - or am I missing something?

LinqLover avatar Oct 08 '20 18:10 LinqLover

Any updates on this? I need to set up a fresh Trunk image very often, and it is quite tedious to have to install Metacello from a SAR file every time manually ... :-)

LinqLover avatar Nov 07 '20 13:11 LinqLover

I think there is an issue with your @tom95​s SAR generation script when using the generated SAR file for a #19536 image (which smalltalkCI is using for Trunk builds currently). Because the projects are stored in an arbitrary order #allPackagesForSpecNamed:, Metacello-MC is loaded before Metacello-Core. Due to 78b2659bc5fd310a6db752cae00e175ea44bf08d, this leads to the removal of the String and Array extension method #setLoadsInMetacelloProject:. How should we fix this?

Option 1: This is a bug in SARInstaller (however, as far I can see this it is rather a limitation of the SAR concept by design). Option 2: We could sort the SAR packages manually in your script. Option 3: We could revise MetacelloMCVersion >> #allPackagesForSpecNamed: to order the packages according to their mutual dependencies. (Or, could we? At least this would go beyond my current MC skills.) By the way, I was already searching for exactly this functionality in https://github.com/hpi-swa/Squot/issues/290#issuecomment-704500770.

LinqLover avatar Nov 08 '20 18:11 LinqLover

Hey @LinqLover I'll hopefully get to investigate options for hosting the SAR tomorrow.

Concerning the dependency problem: I was not able to reproduce this on the most recent trunk image [1] with a SAR generated from my home image and just-installed Metacello updates. I also do not quite understand yet how the error could come to be - since Metacello-Core is not a prefix of Metacello-MC it should not touch anything installed by it. Could you retry with the most recent trunk image (or tell me which step I could have done wrong for reproducing the error myself :))?

[1] http://files.squeak.org/trunk/Squeak6.0alpha-20071-64bit/

tom95 avatar Nov 08 '20 18:11 tom95

@tom95 Great to hear that you are on it! :-)

The dependency problem does not occur when starting from a fresh trunk image but only when starting from #19536 which is used for smalltalkCI builds for Squeak Trunk [1, 2]. I tried to install the SAR file in a preLoading script in an SCI configuration (because I need to access MCFetchGithubRepository sitePassword before applying the load spec (I cannot simply use #useLatestMetacello because I need the update to be performed during the preLoading script already)) which broke the build process because the #setLoadsInMetacelloProject: implementors were removed.

Given the fact that in this particular situation, an older version of Metacello is already installed so there is no actual need to bootstrap, this should not be a critical point in the context of this issue. Still, I would like to find and eliminate the underlying cause of this weird and confusing behavior ... See below 🔽 :-)

[1] https://github.com/hpi-swa/smalltalkCI/releases/tag/v2.9.2 [2] https://github.com/hpi-swa/smalltalkCI/blob/b4d9b945f2d20d136ac21fe3ca129a5822ff0125/squeak/run.sh#L88-L105


Update on the dependency problem: Unfortunately, options 2 and 3 as proposed above will not work. The problem appears to be that SARInstaller collects all patches to be installed into one single MCVersionLoader instance. MCPackageLoader, however, always installs all additions before the removals (see #basicLoad). Let's discuss this on the mailing list, but I suppose that we should remove all operations from the removals list in this method once they were referenced in any additions operation.

LinqLover avatar Nov 08 '20 19:11 LinqLover

Can confirm that a change such as this one fixes the problem:

MCPackageLoader>>forgetSuperfluousMethodRemovals
	|  removedClasses |
	removedClasses := (removals select: #isClassDefinition) collect: #actualClass.
	removedClasses addAll: (removedClasses collect: #class).
	removals := removals reject: [:e | e isMethodDefinition and: [(removedClasses includes: e actualClass)
		" that's the new part "
		or: [additions anySatisfy: [:a | a isMethodDefinition and: [a selector = e selector and: [a className = e className]]]]]]

I'm very surprised this hasn't turned up earlier though. The problem should in theory occur each time a method is moved from one package to another, so I feel like there is some required preprocessing step that we're missing here. So I agree that this is something the mailing list will likely have an opinion on. Could you create a post @LinqLover ?

tom95 avatar Nov 08 '20 20:11 tom95

Thanks for your implementation proposal, and so sorry for the long delay! Please see The Inbox: Monticello-ct.732.mcz on the mailing list. I'm pretty sure that this an error or an unjustified limitation in MCPackageLoader that did not show up earlier just because usually, versions for different packages are not loaded using the same MCPackageLoader instance. Still, I'm not absolutely sure about this, so let's start the discussion on the mailing list! :-)

LinqLover avatar Nov 16 '20 15:11 LinqLover