
Reorganizing Dao modules!

Open daokoder opened this issue 11 years ago • 80 comments

As I have mentioned in another thread, I have been considering reorganizing the modules in the following way:

  • Dao(fossil)/dao(git) will only include core modules which provide important functionality but no user-accessible types and methods (namely, they are not used directly in user code);
  • DaoModules/dao-modules will only include standard modules without external dependency. Namely, they can only use standard C and system libraries;
  • DaoTools/dao-tools will only include standard tools without external dependency.

This means only the following modules will stay with Dao: auxlib (it will be changed to include only auxiliary C interface functions), debugger, help (it does offer methods for accessing the help entries; this is the only exception I am going to make), macro (maybe) and profiler. The other modules currently with Dao will be moved to DaoModules. The following modules will be moved out and probably become individual projects/repos: cblas, clinker, DaoCXX, DaoJIT.

The modules with external dependencies will be managed by a standard packaging tool. The reason for this change is that it is currently really inconvenient to deliver some useful modules and tools to users because of dependency issues.

This packaging tool will be able to handle the dependency issues of each module, and can download, configure and build the dependent libraries and the modules. There will be an archive from which the dependent libraries and the modules can be downloaded.

The basic components of this packaging tool are already available in the standard modules (os.fs, web.http, zip and pkgtools). The use of these components is shown in https://github.com/daokoder/dao-tools/blob/master/daopkg/daopkg.dao, where daopkg is intended as the packaging tool.

Anyone like to volunteer for developing this packaging tool daopkg?

daokoder avatar Sep 12 '14 10:09 daokoder

DaoModules/dao-modules will only include standard modules without external dependency. Namely, they can only use standard C and system libraries;

You didn't mention zip and web.http, so I suppose including third-party library source into the module doesn't count as external dependency?

Anyone like to volunteer for developing this packaging tool daopkg?

You do know that 'anyone' basically means me? :) I am actually glad to have an opportunity to write something in Dao, so I'm in :) Packages, however, require more than just source files. A description is needed which would minimally specify the dependencies and some auxiliary information.

Night-walker avatar Sep 12 '14 11:09 Night-walker

You didn't mention zip and web.http, so I suppose including third-party library source into the module doesn't count as external dependency?

Yes, their source is small, and they have to be included for the packaging tool, which must not have external dependency.

You do know that 'anyone' basically means me? :) I am actually glad to have an opportunity to write something in Dao, so I'm in :)

Right, though I also had @dumblob in mind, but he has not been very active recently, so it is basically you :). Really great that you took it up so promptly.

Packages, however, require more than just source files. A description is needed which would minimally specify the dependencies and some auxiliary information.

Yes, we may need a module description format for this. This is where I stopped after adding web.http, zip and pkgtools etc. Or we can do it the way Homebrew does it for the Mac, where a base class is provided for all packages and each package is managed by a script that extends the base class.
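
To make that Homebrew-style idea concrete, here is a very rough sketch assuming Dao syntax; all class, field and routine names below are made up for illustration and are not an existing API:

    class Package
    {
        var name    = ""
        var version = ""
        var depends: list<string> = {}

        routine Fetch(){ }    # download the sources
        routine Build(){ }    # build them, e.g. via DaoMake or configure/make
        routine Install(){ }  # copy the results into place
    }

    # one small script per package, extending the base class:
    class Clinker : Package
    {
        routine Clinker(){
            name    = "clinker"
            version = "0.5.0"
            depends = { "libffi-3.0.13" }
        }
        routine Build(){
            # module-specific build steps would go here
        }
    }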

daokoder avatar Sep 12 '14 11:09 daokoder

Hi guys, let me apologize for another week without any activity. I moved (not yet permanently, but one never knows :)) to a foreign country two weeks ago, and I have so much work to do (you might have noticed that I was active mainly at the weekends, but not throughout the workweeks).

Anyway, this issue caught my attention because I've done quite a lot of packaging in the past. I'd recommend not reinventing the wheel and doing things as simply as possible, to allow other feature-rich tools specifically tailored for packaging (OS/distribution-specific) to just parse/grab the description and automatically transform it to their formats without user intervention (imagine fpm). I'd start by looking at the lists "Package source types" and "Target package types" at https://github.com/jordansissel/fpm/wiki .

Tomorrow, I'll try to do something for Dao again (I'm missing it so much :(), but now I have to manage some other, slightly more urgent, stuff :(.

dumblob avatar Sep 12 '14 12:09 dumblob

Hi guys, let me apologize for another week without any activity. I moved (not yet permanently, but one never knows :)) to a foreign country two weeks ago, and I have so much work to do (you might have noticed that I was active mainly at the weekends, but not throughout the workweeks).

Really no need to apologize, it is quite understandable that you, and in fact every one of us, have other things to tend to. And there is no obligation or anything like that. You guys have already contributed so much, I really appreciate that.

I'd recommend not reinventing the wheel and doing things as simply as possible, to allow other feature-rich tools specifically tailored for packaging (OS/distribution-specific) to just parse/grab the description and automatically transform it to their formats without user intervention (imagine fpm). I'd start by looking at the lists "Package source types" and "Target package types" at https://github.com/jordansissel/fpm/wiki .

I agree we should do things as simply as possible. But regarding reinventing the wheel, it really depends on the situation. I remember it was once (or maybe a few times) suggested to me not to develop a new VM and to use the Parrot VM (the VM for Perl 6) instead. If I had done so, this project would probably already be dead.

We have already reinvented a make tool, which works quite well. Without it, managing the current modules, supporting single-file deployment and supporting compilation to JavaScript would have been very difficult and tedious tasks. So sometimes you need to do things your own way (which might mean reinventing the wheel) in order to make things easier in the long run.

For the packaging tool I have mentioned, it should preferably be simple and have no external dependencies other than Dao and the standard modules (of course, other small dependencies that can be solved by including the source may also be acceptable; fpm has way too many dependencies). The parts for handling files, directories, networking, compression, decompression and archiving are already ready, so it would not be reinventing the wheel from scratch.

From what I understood, @Night-walker wants to do it in Dao. That's really great :). (And that's how I intended to do it myself eventually if he and no one else volunteered.)

daokoder avatar Sep 12 '14 14:09 daokoder

There are a couple of things regarding the hypothetical package manager that require clarification.

  1. Package index. All the packages should be registered somehow, as otherwise how is the tool supposed to work with modules in different, independently updated repositories?
  2. Package content. What content may a package contain, and in what form? Package format, possible included files, their interpretation?
  3. Installation. What steps is the tool expected to perform during installation of a package? What about building, external dependencies, scripts?

Night-walker avatar Sep 12 '14 15:09 Night-walker

For the packaging tool I have mentioned, it should preferably be simple and have no external dependencies other than Dao and the standard modules (of course, other small dependencies that can be solved by including the source may also be acceptable; fpm has way too many dependencies). The parts for handling files, directories, networking, compression, decompression and archiving are already ready, so it would not be reinventing the wheel from scratch.

Of course. This wasn't what I meant. I didn't mean not to write a new tool. I meant not to come up with new formats, new specifications, new behavior, new types of indexes etc. In other words, I meant to use existing infrastructure (if any) and existing standards/specifications for describing packages/dependencies, i.e. exactly what @Night-walker is asking for right now in https://github.com/daokoder/dao/issues/251#issuecomment-55416193 .

Look at the formats fpm supports, choose the smallest functionality these formats support that is sufficient for our modules/packages, and implement it in Dao. We can mimic/get inspired by CPAN, PyPI, Cabal, npm, LuaRocks and many others. But as I said - if we make it very similar to some of them or even absolutely the same (from the API point of view), we'll have a big advantage as the existing tools won't have to be adapted (especially their internal logic) to yet another language-specific package repository.

dumblob avatar Sep 12 '14 15:09 dumblob

Btw imho the best packaging tool I've ever seen is GNU Guix - it's worth looking at its core principles to get an impression of how complicated packaging can eventually get and how to solve these problems in an elegant, readable and efficient way.

dumblob avatar Sep 12 '14 15:09 dumblob

Package index. All the packages should be registered somehow, as otherwise how is the tool supposed to work with modules in different, independently updated repositories?

For package index, how about dao-category-name-version, where category could be mod for modules, tool for tools and dep or ext for external dependency libraries or tools. For example, the clinker module could be indexed by dao-mod-clinker-0.5.0, and its external library libffi by dao-ext-libffi-3.0.13.

All the packages, external libraries and tools will be archived in a central place for download. For continuously developed modules, only snapshots or certain versions will be archived there.

All the archived packages, libraries and tools will be stored in a simple customized archive format with compression. The pkgtools/archive.dao module can do archiving and extraction, and the zip module can do compression and decompression, as shown in tools/daopkg/daopkg.dao.

Package content. What content may a package contain, and in what form? Package format, possible included files, their interpretation?

A package can be a source package or a binary package (mostly for Windows and Mac, I assume). A source package should contain the source files, makefiles or configuration files etc. For an external library, if it requires a certain build tool to build, that tool could also become a dependency and be archived in the central place. Separately or together, a package description file should also be available. This description file should specify the dependencies of the package and the steps required to build it.

Installation. What steps is the tool expected to perform during installation of a package? What about building, external dependencies, scripts?

The tool should know what packages have been installed and what packages are available for installation. During installation, it should check whether the package to be installed is already installed or already downloaded. It can then download the package if necessary (maybe checking first whether a corresponding binary package is available) and build it.
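
In rough Dao pseudocode the flow could look something like this; every helper routine used here (IsInstalled, Dependencies, IsDownloaded, Download, Build, Register) is a hypothetical placeholder, not an existing API:

    routine Install( pkg: string )
    {
        if( IsInstalled( pkg ) ){ return }

        # install the dependencies listed in the package description first:
        for( dep in Dependencies( pkg ) ){ Install( dep ) }

        if( IsDownloaded( pkg ) == 0 ){ Download( pkg ) }  # maybe prefer a binary package

        Build( pkg )     # e.g. run the build tool named in the description
        Register( pkg )  # record the package as installed
    }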

The dependencies and commands for building (and other information) should be specified in the package description file which should be created for each package. @dumblob probably knows better what should be included in such a package description file.

I meant not to come up with new formats, new specifications, new behavior, new types of indexes etc.

For a moment, I thought that was what you meant, but I was not sure.

Look at the formats fpm supports, choose the smallest possible functionality these formats support and is sufficient for our modules/packages and implement it in Dao. We can mimic/get inspired by CPAN, PyPI, Cabal, npm, LuaRocks and many others.

Indeed, we can learn from their format to make a simple and adequate one if necessary.

But as I said - if we make it very similar to some of them or even absolutely the same (from the API point of view), we'll have a big advantage as the existing tools won't have to be accommodated (especially their logic inside) to yet-another-lang-specific-package-repository.

I don't know those formats well; if there is one that is really simple and well defined, we could indeed simply adopt it.

Or we can simply use Dao data structures (code) to store package information, a bit like JSON; then it only needs to be evaluated instead of parsed in order to extract the information :)
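
Purely as an illustration (the field names and values below are invented), such a description could be nothing more than a Dao map literal, which the tool would evaluate to get the data:

    {
        "name"         => "dao-mod-clinker",
        "version"      => "0.5.0",
        "description"  => "C library linking for Dao",
        "sources"      => { "http://example.org/archive/dao-mod-clinker-0.5.0" },
        "dependencies" => { "dao-ext-libffi-3.0.13" }
    }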

daokoder avatar Sep 12 '14 16:09 daokoder

Taking into account the current state of development of Dao and its modules, maintaining package snapshots is tedious. API changes and bug fixes will inevitably require frequent package updates, so it seems better at the moment to link right to the active sources of packages in repositories. That means that there should be a package registry (a list, to put it simply) which would specify each package's name, description, dependency list and, finally, the URL of its location.

It should be the easiest way to manage packages; the package tool would just be a wrapper on top of git/fossil, and a lot of essential functionality will then be available out-of-the-box.

Or we can simply use Dao data structures (code) to store package information, a bit like JSON; then it only needs to be evaluated instead of parsed in order to extract the information :)

Executing arbitrary code is probably not a good idea. Even without security issues, it's a rather questionable way to handle package description files.

Night-walker avatar Sep 12 '14 17:09 Night-walker

Well, I've just tried to quickly come up with something feasible yet KISS, and I've ended up with a system similar to (but still simpler than) npm. I.e.:

  • network of hubs (representing different package namespaces - e.g. some company will need their own hub and they'll therefore need to distinguish their packages from others)
  • enforced package naming (@daokoder's scheme was OK, but missing the namespace - it should rather be dao-namespace-category-name-version)
  • description file with 6 mandatory fragments (a sketch follows after this list):
    1. name satisfying a regex
    2. version - with a dumb lexicographical comparison run for each component separately; components separated by '.'
    3. license - a logical expression consisting of logical operators and Short Names from https://fedoraproject.org/wiki/Licensing:Main?rd=Licensing#Software_License_List
    4. sources - array of URIs (at least one element)
    5. description - an arbitrary text with limited length
    6. dependencies - array of URIs (might contain no elements) to other hub packages (yes, it's not a mistake - e.g. name+version is not sufficient, we need full URIs); note that we don't need any comparison operators if we use the lexicographical comparison - we can omit e.g. .3.96 from the full version 5.3.96 and immediately it'll match the "highest" version found on the hub
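
A sketch of such a description, with all six fragments and purely invented values, might look roughly like this:

    name:         dao-mycompany-mod-example
    version:      2.18.3
    license:      BSD or MIT
    sources:
        http://hub.example.org/dao-mycompany-mod-example-2.18.3
    description:  An example module that does nothing useful.
    dependencies:
        http://hub.example.org/dao-officialnamespace-mod-clinker-0.5

(The dependency above omits its last version component, so per point 6 it would match the highest 0.5.x found on the hub.)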

Nothing more for the beginning (generating templates, auto-checking etc. is only a matter of time, nothing complex). Just keep in mind that we need to support different versions of the same package/library on one system (i.e. I'll have two different versions of the Dao VM installed simultaneously and also the corresponding packages for each of them). The need for this is rapidly increasing and almost no packaging software supports it. Also, each dll or so module from the Dao repository must be versioned (regardless of whether it's only internal or not).

dumblob avatar Sep 12 '14 18:09 dumblob

A balanced design, I must admit. Just one question: how do you propose to organize versioning? It's pretty simple to identify a package changeset by its hash, but I don't see a straightforward way to attribute custom version strings to it.

Night-walker avatar Sep 12 '14 19:09 Night-walker

Taking into account the current state of development of Dao and its modules, maintaining package snapshots is tedious. API changes and bug fixes will inevitably require frequent package updates, so it seems better at the moment to link right to the active sources of packages in repositories. That means that there should be a package registry (a list, to put it simply) which would specify each package's name, description, dependency list and, finally, the URL of its location.

I originally considered only releases, no snapshots. I said it since you mentioned continuously developed modules. Maybe we should not consider this, for simplicity, and supporting it is not very useful anyway.

It should be the easiest way to manage packages; the package tool would just be a wrapper on top of git/fossil, and a lot of essential functionality will then be available out-of-the-box.

I have also considered fossil (git is maybe too big and has too many dependencies), which has nearly no extra dependencies other than sqlite3. However, the packaging tool should handle not only Dao modules, but also the external libraries, which is hardly appropriate for fossil.

So I suggest we keep things simple and make the tool handle only releases. The tool should also be able to update the package information automatically (or at least semi-automatically) for Dao modules.

Executing arbitrary code is probably not a good idea. Even without security issues, it's a rather questionable way to handle package description files.

Executing arbitrary code is bad; I was considering evaluating such code after wrapping it in a pair of curly brackets, or anything else that makes a load statement invalid. Without loading, any code is harmless (infinite loops would be the worst thing, but they can be interrupted). We can also inspect the code (bytecode) for calls to ensure nothing can be called; this would prevent reading and writing files.

daokoder avatar Sep 13 '14 01:09 daokoder

how do you propose to organize versioning? It's pretty simple to identify a package changeset by its hash, but I don't see a straightforward way to attribute custom version strings to it.

If I understood you correctly, a plain echo "type.$(git rev-list --count HEAD).$(git rev-parse --short HEAD)" should suffice for the lexicographical comparison. E.g. on Linux, .so libraries support the characters [A-Za-z0-9_.] (and maybe a few others which I forgot) in the version string, even if it's not the usual way to do things.

The type component should designate where this package belongs (always one of bleeding-edge, testing, stable - we might introduce others if needed, but for the beginning just these; btw I'm not sure about proper short keywords for these types - any ideas?). And yes, we need this information directly in the version and not only on the packaging level (otherwise conflicts arise when installing multiple identical versions of different types on one system). I'm not sure, though, about DLLs on Windows, nor about Mac OS X and other systems which Dao supports (like Haiku). This needs investigation. @Night-walker, do you know what everything WinAPI allows? @daokoder, what about Mac OS X and FreeBSD?

In the case of bleeding-edge (usually VCS), having the second component (the one before the second '.') be a monotonically increasing number should be robust enough. Each VCS should provide some monotonically increasing index for a given commit, so generating such a description can be fully automated.

Either way, we really can't avoid a proper versioning scheme for the non-bleeding-edge types. So we need to introduce at least major + minor, where majors will always be compatible and minors will not.
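
Just to illustrate the per-component comparison in rough Dao code (the routine names are made up, and the version components are assumed to be already split on '.'):

    routine VersionLess( a: list<string>, b: list<string> )
    {
        n = a.size()
        if( b.size() < n ){ n = b.size() }
        for( i = 0 : n-1 ){
            if( a[i] != b[i] ){ return a[i] < b[i] }  # dumb per-component comparison
        }
        return a.size() < b.size()  # a shorter version sorts before a longer one
    }

    routine VersionMatches( wanted: list<string>, found: list<string> )
    {
        # omitted trailing components (e.g. "5" vs "5.3.96") match anything;
        # the "highest" matching version can then be picked with VersionLess()
        if( wanted.size() > found.size() ){ return 0 }
        for( i = 0 : wanted.size()-1 ){
            if( wanted[i] != found[i] ){ return 0 }
        }
        return 1
    }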

Maybe we should not consider this, for simplicity, and supporting it is not very useful anyway.

I must argue with this. There are plenty of use cases where one wants the simplicity of packages (especially installing them, with all its advantages) but bleeding-edge software (i.e. directly from the repository). In the end it's not so difficult to support (see above).

Executing arbitrary code is bad; I was considering evaluating such code after wrapping it in a pair of curly brackets, or anything else that makes a load statement invalid. Without loading, any code is harmless (infinite loops would be the worst thing, but they can be interrupted). We can also inspect the code (bytecode) for calls to ensure nothing can be called; this would prevent reading and writing files.

At the beginning, we would support only Dao modules with makefile.dao files. In the future, we can add something more, but it may be a good idea to stick to the packaging design we came up with and provide the missing makefile.dao for external projects via a second source URI. This will keep things KISS and easily maintainable => very well automatable. In other words, we shouldn't support Turing-complete code in the description files (again, maybe in the future, but I doubt it's a good idea, as enforced uniformity and declarativeness is always a much easier way to go :)).

dumblob avatar Sep 13 '14 06:09 dumblob

I originally considered only releases, no snapshots. I said it since you mentioned continuously developed modules. Maybe we should not consider this, for simplicity, and supporting it is not very useful anyway.

It would be much simpler and more convenient to work with sources.

First, I can't think of any releases right now. Dao is not Debian stable; fixing packages at certain versions is a sure way to have a lot of problems keeping them up to date, and in the case of Dao the latter is more important. Releases only make sense when a large audience is involved and stability is the primary concern.

Second, working directly with sources eliminates the need to constantly upload new versions of packages somewhere. After registering a package, little is needed to maintain it. Only if the meta-information has changed would one have to update the corresponding entry, and that would be fairly trivial. The package registry can itself be just a repository.

Third, it's simple to fix a package at some version -- just don't update its entry at the registry. Then, regardless of the state of its source, the package manager will make use of only the specified version. It's much simpler that way, when you just have the source rather than a bunch of differently-versioned packages for the same module or tool.

However, the packaging tool should handle not only Dao modules, but also the external libraries, which is hardly appropriate for fossil.

And so what? I didn't say to just use plain fossil; I meant to integrate its abilities into the package manager, together with the other stuff. It would really save a lot of time and effort. After all, what is OSS good for if we have to develop everything from scratch, again and again?

So I suggest we keep thing simple, and make the tool only to handle releases. The tool should also be able to update the package information automatically (or semi-automatically at least) for Dao modules.

It would be around 5-10 times harder to implement, and considerably more cumbersome to maintain and use, that's my opinion.

Executing arbitrary code is bad; I was considering evaluating such code after wrapping it in a pair of curly brackets, or anything else that makes a load statement invalid. Without loading, any code is harmless (infinite loops would be the worst thing, but they can be interrupted). We can also inspect the code (bytecode) for calls to ensure nothing can be called; this would prevent reading and writing files.

There are config/serialization formats for this, no need to invent anything like that. If we don't need executable code in the package description, there is no reason to write it using a programming language at all.

Night-walker avatar Sep 13 '14 07:09 Night-walker

If I understood you correctly, a plain echo "type.$(git rev-list --count HEAD).$(git rev-parse --short HEAD)" should suffice for the lexicographical comparison. E.g. on Linux, .so libraries support the characters [A-Za-z0-9_.] (and maybe a few others which I forgot) in the version string, even if it's not the usual way to do things.

Yes, but identifying a package version by its SHA-1 hash is arguably not a particularly human-friendly way of distinguishing versions. However, it's possible to associate each version number in the package registry with the relevant changeset ID in the repository -- if, again, the package manager works with sources.

The type component should designate where this package belongs (always one of bleeding-edge, testing, stable - we might introduce others if needed, but for the beginning just these; btw I'm not sure about proper short keywords for these types - any ideas?). And yes, we need this information directly in the version and not only on the packaging level (otherwise conflicts arise when installing multiple identical versions of different types on one system).

And yet again, it would be fairly trivial to maintain several "types" of the same package in a single repository using branching. The "type" can thus be easily unified with the version number, becoming some kind of generic tag pointing to a particular source changeset.

I must argue with this. There are plenty of use cases where one wants the simplicity of packages (especially installing them, with all its advantages) but bleeding-edge software (i.e. directly from the repository). In the end it's not so difficult to support (see above).

Yes. It is, as I pointed out, actually much simpler.

At the beginning, we would support only Dao modules with makefile.dao files. In the future, we can add something more, but it may be a good idea to stick to the packaging design we came up with and provide the missing makefile.dao for external projects via a second source URI. This will keep things KISS and easily maintainable => very well automatable. In other words, we shouldn't support Turing-complete code in the description files (again, maybe in the future, but I doubt it's a good idea, as enforced uniformity and declarativeness is always a much easier way to go :)).

Again, a package description and a makefile are not one and the same thing. A package description is pure data, meta-information about the package, and a makefile is a service file within the package containing instructions on how to build it. They aren't related.

Night-walker avatar Sep 13 '14 08:09 Night-walker

And yet again, it would be fairly trivial to maintain several "types" of the same package in a single repository using branching. The "type" can thus be easily unified with the version number, becoming some kind of generic tag pointing to a particular source changeset.

We can't do the unification (we need to keep the "type" separate from the version itself; a "type" references a collection of old, current and also future releases/commits/whatever) if we want easy/automated maintenance of packages which depend on whatever version of a given package, but e.g. under the condition that it's a stable version.

Again, a package description and a makefile are not one and the same thing.

Of course, no doubts about this.

A package description is pure data, meta-information about the package, and a makefile is a service file within the package containing instructions on how to build it.

Sure, this is obvious.

They aren't related.

Surprisingly, they are :) Not necessarily explicitly - we need a point in the automated generation and building of packages where we switch from processing the package metadata to building the package (and/or vice versa). Basically we need two things for this: what to run to build it and which version we want to build.

dumblob avatar Sep 13 '14 08:09 dumblob

We can't do the unification (we need to keep the "type" separate from the version itself; a "type" references a collection of old, current and also future releases/commits/whatever) if we want easy/automated maintenance of packages which depend on whatever version of a given package, but e.g. under the condition that it's a stable version.

If a package is supposed to be obtained from a repository, only the "type" plus the version together can identify a changeset. Alone they don't make any sense unless there is something else associated with "type".

we need a point in the automated generation and building of packages where we switch from processing the package metadata to building the package (and/or vice versa). Basically we need two things for this: what to run to build it and which version we want to build.

That should be trivial -- checkout the specified revision and then run makefile.dao.

Night-walker avatar Sep 13 '14 10:09 Night-walker

Alone they don't make any sense unless there is something else associated with "type".

So far we have only 3 types (disjoint sets of versions). And it does make very much sense to specify only the "type". There are packages which don't care about the version; they just need something from the other package for some reason (e.g. just some additional stuff for the user, but not mandatory for the package itself), and if we don't want to introduce complicated dependencies, hard dependencies, build-time dependencies, soft dependencies and so on, and want to stick with the simplicity of lexicographical comparison, then we actually have no other option.

That should be trivial -- checkout the specified revision and then run makefile.dao.

Yes, and we need our package-build-tool to do that (implicitly as I called it).

Btw the only thing I'm worried about is the build-time dependencies. So far we don't need them (as they're the same as the resulting package dependencies), but I'm not sure about the future. Anyway, we can add them at any time (just one more array in the description file) without much burden of changing the package tools.

dumblob avatar Sep 13 '14 11:09 dumblob

This needs investigation. @Night-walker, do you know what everything WinAPI allows?

Forgot to answer that. No one really knows what everything WinAPI allows. My, I should try myself in poetry :)

So far we have only 3 types (disjoint sets of versions). And it does make very much sense to specify only the "type". There are packages which don't care about the version; they just need something from the other package for some reason (e.g. just some additional stuff for the user, but not mandatory for the package itself), and if we don't want to introduce complicated dependencies, hard dependencies, build-time dependencies, soft dependencies and so on, and want to stick with the simplicity of lexicographical comparison, then we actually have no other option.

I suppose anything related to the repository itself can be expressed as a certain revision ID. The latter can be associated with whatever is suitable for humans; it doesn't matter much from the technical point of view. External dependencies do require special treatment, but there is hardly any realistic way to handle all kinds of them in a fully automated mode.

Btw the only thing I'm worried about is the build-time dependencies. So far we don't need them (as they're the same as the resulting package dependencies), but I'm not sure about the future. Anyway, we can add them at any time (just one more array in the description file) without much burden of changing the package tools.

I think we shouldn't worry about too many possibilities and too much variety at the moment.

Night-walker avatar Sep 13 '14 11:09 Night-walker

Forgot to answer that. No one really knows what everything WinAPI allows. My, I should try myself in poetry :)

And what about knowing what everything WinAPI allows in the domain of DLL versioning? :)

I suppose anything related to the repository itself can be expressed as a certain revision ID. The latter can be associated with whatever is suitable for humans; it doesn't matter much from the technical point of view.

Technically we don't need any versions or revisions, just timestamps designating a state in the world's history. Packaging is unfortunately not about finding a minimal set from the technical/physics point of view, but rather about what humans think about a certain revision, build, snapshot etc. Hence the hierarchy. From my experience the "type" is useful and solves a not insignificant number of problems on big systems/clusters/mainframes with many instances of different software (packages and libraries) deployed.

Btw the statement "can be associated" would mean some mapping written somewhere - most probably again in the description file as some special keyword or being implicit (the worst case) somewhere else. I dislike both of these.

dumblob avatar Sep 13 '14 12:09 dumblob

I think we shouldn't worry about too many possibilities and too much variety at the moment.

Completely agree. I think some of our discussions have digressed quite a bit from the real topic.

Let's focus on one thing first, that is: how do we organize the packages? How do we index/reference them?

If we are to use fossil (I prefer fossil over git for its smallness and efficiency; I feel it is much faster than git), we can do something like this:

  • One fossil repository per package (Dao modules, external libraries etc.);
  • A tag is created for each installable revision of the package; the tag name is composed of type-version, where the type can be stable ('release'), unstable (devel), alpha or beta etc., and the version could be one to three numbers separated by dots;
  • A dependency file is added for each package; each line in this file consists of a package (fossil repo) name and a fossil tag name (see the sketch below);
  • A building instruction file is added for each package;
  • The packaging tool can check the repositories for such tags, and create one package description file for each tag, automatically, from the tag name, the dependency file and the building instruction file;
  • Such package description files will be archived in a central place or repository for automatic checking.
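
For example, the dependency file of the clinker module mentioned earlier could look roughly like this (the tag names are only illustrative):

    # each line: package (fossil repo) name and fossil tag
    dao              stable-2.0
    dao-ext-libffi   stable-3.0.13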

Please add anything I missed, and keep simplicity and focus in mind:)

daokoder avatar Sep 13 '14 14:09 daokoder

A dependency file is added for each package; each line in this file consists of a package (fossil repo) name and a fossil tag name

Why not directly use full URIs? Without them, it's not a unique identifier. Anyway, I doubt this file would be useful - we should use the package description file instead.

A building instruction file is added for each package

E.g. the current makefile.dao? If not, I hope you don't want to introduce a new one. Also the makefile.dao could contain targets for generating/updating the package description file if convenient.

The packaging tool can check the repositories for such tags, and create one package description file for each tag, automatically, from the tag name, the dependency file and the building instruction file

IMHO, the packaging tool should check the repositories for such tags, and if needed (e.g. due to a newer version) update the existing package description file and upload it to the central repository. The user can choose which tag types should be processed and uploaded (by selectively listing them or choosing an 'all' option).

Otherwise I'm comfortable with the proposed solution (btw I was surprised that an SQL DB is faster than git :)).

dumblob avatar Sep 13 '14 15:09 dumblob

Why not directly use full URIs? Without them, it's not a unique identifier. Anyway, I doubt this file would be useful - we should use the package description file instead.

Package name plus tag name is a unique identifier, if we define unique package names (this is a must) and unique tag names (this is preferable). If necessary, changeset id can also be included. I believe they are equivalent to URIs.

E.g. the current makefile.dao? If not, I hope you don't want to introduce a new one.

Not makefile.dao; each package should be built with its standard means of building. For Dao modules and tools, that is DaoMake with makefile.dao. But for external libraries and tools, it could be anything such as configure, cmake etc.; these build tools themselves could be packaged and become allowed dependencies for those external libs and tools. The building instruction would specify which build tool to use and what parameters to use. The packaging tool will invoke these build tools with the suggested parameters.
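
As an illustration only (the keywords are invented, not a fixed format), the building instruction file for an external library could be as small as:

    tool:   configure
    params: --prefix=$INSTALL_PREFIX --disable-shared

For a Dao module the tool entry would simply be DaoMake with its makefile.dao.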

Also the makefile.dao could contain targets for generating/updating the package description file if convenient.

It is better to let the packaging tool generate or update the package description files.

IMHO, the packaging tool should check the repositories for such tags, and if needed (e.g. due to a newer version) update the existing package description file and upload it to the central repository.

That's basically what I meant or implied. If we use fossil to archive the package description files, adding to a repo will simply mean adding or updating. Of course, this can also be handled by the packaging tool.

Otherwise I'm comfortable with the proposed solution (btw I was surprised that an SQL DB is faster than git :)).

Probably it is not because of the SQL DB that fossil is faster than git. I think it is because fossil stores the revision history much more efficiently, namely using much less space. I saw this difference with the Dao repos:

dao=>> ls -lthr Dao.fossil 
-rw-r--r--  1 user  staff   9.6M Sep 12 22:41 Dao.fossil
dao.git=>> du -h -d 0 .git
124M    .git

I also saw a large difference in the amount of data transferred when cloning a fossil repo and a git repo.

daokoder avatar Sep 13 '14 16:09 daokoder

I believe they are equivalent to URIs.

Yes, they are valid URIs, but what I meant was that if I want to specify a dependency, I have a specific provider/namespace, name and usually also a version in mind. I'm not sure whether, for official packages, the decision of which provider to download a particular dependency from should really be left to the end-user side. I'd prefer to point to our official Dao repositories directly (e.g. by including a namespace or even using a full URL like http://daovm.net/hub/dao-officialdaonamespace-cat00-mod00-stable.2.0).

It is better to let the packaging tool generate or update the package description files.

Yes. Still, there'll be a need to work with existing description files (hand-made or generated by something like fpm or make) and the packaging tools have to support it.

dumblob avatar Sep 13 '14 16:09 dumblob

Btw the size of the git repository is really big. Couldn't that be caused by the synchronization fossil->git?

dumblob avatar Sep 13 '14 16:09 dumblob

And what about knowing what everything WinAPI allows in the domain of DLL versioning? :)

The largest unique feature of Windows DLL management is described here :)

The packaging tool can check the repositories for such tags, and create one package description file for each tag, automatically, from the tag name, the dependency file and the building instruction file;

I think it's simpler to have a single package file which contains information on all tags and versions of the package, namely because there is no benefit in fragmenting the meta-data, which will likely be very simple and lightweight anyway. Basically, nothing but a revision ID is minimally required to be associated with each tag.

Overall, it seems like we've more or less formed the principles behind the package management. The only issue I see is reliance on fossil, which essentially excludes any chance for someone to host a repository at GitHub. But I suppose this problem can be attended to if/when the necessity arises.

Night-walker avatar Sep 13 '14 17:09 Night-walker

which essentially excludes any chance for someone to host a repository at GitHub.

I'm not convinced about this as long as we retain URIs with schema, i.e. with git: in front of each item in the sources array. The package tool can support any number of types of repositories - it's not difficult to implement (imagine a simple unified interface and then a class for each type of repository).
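
A rough Dao sketch of that idea, with made-up names (nothing here is an existing API):

    interface RepoDriver
    {
        routine Fetch( uri: string, target: string )
        routine Checkout( target: string, tag: string )
    }

    class GitRepo
    {
        routine Fetch( uri: string, target: string ){
            # e.g. run "git clone <uri> <target>" through the shell
        }
        routine Checkout( target: string, tag: string ){
            # e.g. run "git checkout <tag>" inside <target>
        }
    }

    # the rest of the tool only talks to the interface:
    routine Obtain( repo: RepoDriver, uri: string, target: string, tag: string )
    {
        repo.Fetch( uri, target )
        repo.Checkout( target, tag )
    }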

dumblob avatar Sep 13 '14 17:09 dumblob

Btw, nice article about another type of hell (hey, I'm getting more and more afraid to die :D) - it seems significantly more complicated than supporting 12-year-old Linux systems along with 0.5-year-old ones :)

dumblob avatar Sep 13 '14 17:09 dumblob

I'm not convinced about this as long as we retain URIs with schema, i.e. with git: in front of each item in the sources array. The package tool can support any number of types of repositories - it's not difficult to implement (imagine a simple unified interface and then a class for each type of repository).

If it's just a matter of running the proper shell command, then yes, it's indeed simple.

Night-walker avatar Sep 13 '14 17:09 Night-walker

If it's just a matter of running the proper shell command, then yes, it's indeed simple.

It's imho like that or very close to it.

dumblob avatar Sep 13 '14 18:09 dumblob