Automatically connect source code repository to package on upload operation

Open d1nch8g opened this issue 1 year ago • 17 comments

This PR automatically connects a package to its source code repository when the package is uploaded, provided the user or organization owns a repository whose name matches the package name.

Possible scenario: the user has a repository called "gitea". When the user uploads a package called "gitea" to the container registry, the package will be automatically connected to the repository with the source code.

image

d1nch8g avatar Sep 27 '23 23:09 d1nch8g

While this approach may work, I don't think we should use it. Most package types have a dedicated field in the metadata which can contain a URL to a repository. We should use that, and I think there is a (draft/closed?) PR already.

KN4CK3R avatar Oct 08 '23 19:10 KN4CK3R

@KN4CK3R

This approach should be more unified since different Gitea instances might have repositories with specific projects and related packages, which might not necessarily point to a repository on that concrete instance.

Some examples of Arch Linux packages:

~/iso main !3 > pack -Qi zlib
Name            : zlib
Version         : 1:1.3-1
Description     : Compression library implementing the deflate compression method found in gzip and PKZIP
Architecture    : x86_64
URL             : https://www.zlib.net/
~/iso main !3 > pack -Qi zsh 
Name            : zsh
Version         : 5.9-4
Description     : A very advanced and programmable command interpreter (shell) for UNIX
Architecture    : x86_64
URL             : https://www.zsh.org/
~/iso main !3 > pack -Qi gtk3              
Name            : gtk3
Version         : 1:3.24.38-1
Description     : GObject-based multi-platform GUI toolkit
Architecture    : x86_64
URL             : https://www.gtk.org/

The current approach should also apply to situations where different instances host copies of the same project and provide it via package registries.

ghost avatar Oct 08 '23 21:10 ghost

I agree with @KN4CK3R that most package types have a specific field for this, and that should be used instead of guessing by name.

lafriks avatar Oct 08 '23 21:10 lafriks

@lafriks

I wanted to clarify a couple of details.

  • The operation is executed only on new uploads.
  • If a git repository with a matched name is not found, it won't stop the upload operation.
  • Connected repository does not affect the 'Project website' link in the package description; it remains the same.
if created { // Only new package uploads
	if _, err := packages_model.InsertProperty(ctx, packages_model.PropertyTypePackage, p.ID, container_module.PropertyRepository, strings.ToLower(pi.Owner.LowerName+"/"+pi.Name)); err != nil {
		log.Error("Error setting package property: %v", err)
		return err
	}

	repository, err := repo_model.GetRepositoryByOwnerAndName(
		ctx, pi.Owner.Name, p.Name,
	)
	if err == nil { // If repository is not found, it won't cancel upload
		err = packages_model.SetRepositoryLink(ctx, p.ID, repository.ID)
		if err != nil {
			log.Error("Error linking source code repo to container: %v", err)
			return err
		}
	}	
}

Example gitea arch package:

Screenshot from 2023-10-08 19-49-11

Link to repository in gitea instance and project website might differ, but automatic repository connection on package upload operation shouldn't be a problem.

ghost avatar Oct 08 '23 22:10 ghost

@lafriks @KN4CK3R

I can rewrite it in such a way that the repository will be automatically connected only if the package metadata URL points to the git repository that exists in the gitea instance where the package is uploaded.

But that approach is not generic for all package types (since the URL field differs across packages). Some packages don't have a project URL or specify it differently. Also, this approach will not work for project forks on the same instance (when users upload packages to forked repositories), since the URL in the metadata will be the same and the repository won't be connected by package name.

Another solution might be to perform two checks (first by project URL from metadata and second by package name) for repository connection, but it still won't work with package uploads to forked repositories.
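
A minimal sketch of the two-check idea, assuming the instance base URL is known and that repository existence can be queried. The names `localRepoFromURL`, `resolveRepo`, `repoKey` and the `exists` callback are hypothetical and are not part of the Gitea codebase; instances served under a sub-path are not handled.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// repoKey identifies a repository by lower-cased owner and name (hypothetical type).
type repoKey struct{ Owner, Name string }

func newKey(owner, name string) repoKey {
	return repoKey{strings.ToLower(owner), strings.ToLower(name)}
}

// localRepoFromURL extracts owner and repository name from rawURL when it
// points at the same host as instanceURL. ok is false for external URLs.
func localRepoFromURL(instanceURL, rawURL string) (owner, name string, ok bool) {
	base, err := url.Parse(instanceURL)
	if err != nil || base.Host == "" {
		return "", "", false
	}
	u, err := url.Parse(rawURL)
	if err != nil || !strings.EqualFold(u.Host, base.Host) {
		return "", "", false
	}
	// Accept "/owner/repo" and "/owner/repo.git", nothing deeper.
	path := strings.TrimSuffix(strings.Trim(u.Path, "/"), ".git")
	parts := strings.Split(path, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", false
	}
	return parts[0], parts[1], true
}

// resolveRepo checks the metadata URL first and falls back to the package name.
func resolveRepo(instanceURL, metadataURL, pkgOwner, pkgName string, exists func(repoKey) bool) (repoKey, bool) {
	if owner, name, ok := localRepoFromURL(instanceURL, metadataURL); ok {
		if k := newKey(owner, name); exists(k) {
			return k, true
		}
	}
	if k := newKey(pkgOwner, pkgName); exists(k) {
		return k, true
	}
	return repoKey{}, false
}

func main() {
	known := map[repoKey]bool{{"d1nch8g", "gitea"}: true}
	exists := func(k repoKey) bool { return known[k] }
	const instance = "https://gitea.example.com"

	// Metadata URL points at this instance.
	fmt.Println(resolveRepo(instance, "https://gitea.example.com/d1nch8g/gitea.git", "d1nch8g", "gitea", exists))
	// External homepage and no repository named after the package.
	fmt.Println(resolveRepo(instance, "https://www.zlib.net/", "d1nch8g", "zlib", exists))
}
```

As noted above, neither check helps when the metadata URL names the upstream repository and the package is uploaded under a fork with a different name.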

It won't take much time to create a new draft. Which approach would be better?

ghost avatar Oct 09 '23 02:10 ghost

If it's an external package, it's right to not link to a repository in this Gitea instance.

lunny avatar Oct 09 '23 05:10 lunny

@lunny

Some cases will remain uncovered with a different approach.

  1. Automatically connect uploaded package to related forked repository

That might be handy to share modified versions of built software in packaged format.

  2. Automatically connect packages which don't specify project URL (containers for example)

Most of the time, Docker containers are built without labels for source, license, URL, or description. But it is always possible to connect them using image tags or package names.

  3. Automatically connect packages which specify project webpage instead of git URL

None of the packages that specify a webpage or documentation URL as the project homepage would be connected automatically. Also, github/gitlab packages won't be connected.


Users would always be able to connect the repository to the package manually in the UI. Automating this process will make it simpler.

ghost avatar Oct 09 '23 07:10 ghost

For different packages there will be different places to look for metadata:

  • container images will have org.opencontainers.image.source that will point to source code repository
  • npm package.json will have repository field etc

These should be used to automatically link to repository
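
A rough sketch of reading those two fields, assuming the image labels are already available as a map and the raw `package.json` bytes are at hand. The function names are hypothetical; the npm `repository` field may be either a string or an object with a `url` key, so both shapes are handled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ociSourceLabel is the label key mentioned above.
const ociSourceLabel = "org.opencontainers.image.source"

// sourceFromImageLabels returns the source URL label, or "" when it is absent.
func sourceFromImageLabels(labels map[string]string) string {
	return labels[ociSourceLabel]
}

// sourceFromPackageJSON returns the "repository" value of a package.json.
// It accepts both the shorthand string form and the {"type", "url"} object form.
func sourceFromPackageJSON(data []byte) (string, error) {
	var manifest struct {
		Repository json.RawMessage `json:"repository"`
	}
	if err := json.Unmarshal(data, &manifest); err != nil {
		return "", err
	}
	if len(manifest.Repository) == 0 {
		return "", nil // field not present
	}
	var short string
	if err := json.Unmarshal(manifest.Repository, &short); err == nil {
		return short, nil
	}
	var long struct {
		URL string `json:"url"`
	}
	if err := json.Unmarshal(manifest.Repository, &long); err != nil {
		return "", err
	}
	return long.URL, nil
}

func main() {
	labels := map[string]string{ociSourceLabel: "https://gitea.example.com/d1nch8g/gitea"}
	fmt.Println(sourceFromImageLabels(labels))

	pkg := []byte(`{"name":"demo","repository":{"type":"git","url":"https://gitea.example.com/d1nch8g/demo.git"}}`)
	src, err := sourceFromPackageJSON(pkg)
	fmt.Println(src, err)
}
```

The returned URL would still have to be matched against a repository on the instance before any link is created.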

lafriks avatar Oct 09 '23 20:10 lafriks

@lafriks

Packages might provide the following properties:

  1. Origin repository: always points to the git repository, the main repository for the project. (It might not be possible to create an SQL relation in Gitea's database, but a link can always be provided.)
  2. Project name or tag: might be related to a repository in Gitea, if one exists. (A project name or tag has a better chance of producing an SQL relation in the database within a specific user scope, since users can only bind packages to repositories they have access to.)
  3. Project homepage: points to the project website, might not be related to the git repository.

I would suggest the following solution. It should be possible to specify all options; it may look like this:

Screenshot from 2023-10-10 04-44-40

It would provide information about the source code origin, official project website and ability to create SQL relation between package and repository in Gitea instance automatically.

ghost avatar Oct 10 '23 08:10 ghost

@a1012112796

Is it accurate to say that putting up an automatic repository connection as an option with an extra flag in the user settings is the plan?

ghost avatar Nov 08 '23 00:11 ghost

The first step should definitely be this, as it is the most common case; then we can improve on it for edge cases.

For different packages there will be different places to look for metadata:

  • container images will have org.opencontainers.image.source that will point to source code repository
  • npm package.json will have repository field etc

These should be used to automatically link to repository

lafriks avatar Nov 08 '23 12:11 lafriks

@lafriks

I have two implementations. First, this PR should work for these cases. The second is in this branch: it requires modifying the package file creation parameters and methods, along with error handling at the top level (it depends on this PR; also, I am not sure which HTTP response code to return when the user does not have permission for repository packages). However, it should result in less code in the final version and will allow a transaction rollback if an error happens while connecting to the repository.

ghost avatar Nov 10 '23 15:11 ghost

I proposed a different solution to this that would not require the package and repo name to match: https://github.com/go-gitea/gitea/issues/28808

The API endpoint would simply contain the repo name: https://gitea.example.com/api/packages/{owner}/{repo}/maven

I'm not sure if this would be harder to implement since I'm unfamiliar with the gitea codebase atm.
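
For illustration only, the endpoint shape above could be split into its parts roughly like this. `parsePackagePath` is a hypothetical helper, not an existing Gitea function, and real routing would go through the router rather than string splitting.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePackagePath splits "/api/packages/{owner}/{repo}/{type}/..." into its
// first three segments. ok is false when the path does not have that shape.
func parsePackagePath(path string) (owner, repo, pkgType string, ok bool) {
	const prefix = "/api/packages/"
	if !strings.HasPrefix(path, prefix) {
		return "", "", "", false
	}
	parts := strings.Split(strings.Trim(strings.TrimPrefix(path, prefix), "/"), "/")
	if len(parts) < 3 {
		return "", "", "", false
	}
	for _, p := range parts[:3] {
		if p == "" {
			return "", "", "", false
		}
	}
	return parts[0], parts[1], parts[2], true
}

func main() {
	fmt.Println(parsePackagePath("/api/packages/renari/myrepo/maven"))
	fmt.Println(parsePackagePath("/api/packages/renari/maven"))
}
```

With this shape, the existing two-segment URLs (owner and package type only) would no longer match, which is the compatibility concern with changing the link format.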

Renari avatar Jan 15 '24 20:01 Renari

@Renari

I completely agree with the proposed approach, except for changing the link format.

It seems to me that to reduce the amount of logic, it might be better to make a unified solution that could work for all built-in registries. An approach with headers should allow this.

ghost avatar Jan 18 '24 10:01 ghost

https://github.com/go-gitea/gitea/pull/23851

That PR also adds the ability to link via an API endpoint. While that is not the same as having it link automatically, you would be able to create the links programmatically. However, I prefer this solution to making 2 API requests.

Renari avatar Jan 18 '24 13:01 Renari

@KN4CK3R @lafriks @lunny @Renari

I changed the implementation to use request headers for repository connections instead of metadata fields. This keeps the same repository connection approach for all package registries and doesn't break compatibility with existing API endpoints.
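
As a sketch of what the header-based variant might look like on the server side: the header name `X-Package-Repository` and both helper functions are assumptions made up for this example, since the actual header used by the PR is not stated here.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// repoHeader is a hypothetical header name used only for this example.
const repoHeader = "X-Package-Repository"

// parseRepoRef splits an "owner/name" reference. ok is false for anything else.
func parseRepoRef(value string) (owner, name string, ok bool) {
	o, n, found := strings.Cut(strings.TrimSpace(value), "/")
	if !found || o == "" || n == "" || strings.Contains(n, "/") {
		return "", "", false
	}
	return o, n, true
}

// repoFromRequest reads the repository reference from the upload request.
// A missing header simply yields ok == false, so uploads without it are unaffected.
func repoFromRequest(r *http.Request) (owner, name string, ok bool) {
	return parseRepoRef(r.Header.Get(repoHeader))
}

func main() {
	req, err := http.NewRequest(http.MethodPut, "https://gitea.example.com/api/packages/d1nch8g/generic/gitea/1.0.0/gitea.tar.gz", nil)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	req.Header.Set(repoHeader, "d1nch8g/gitea")
	fmt.Println(repoFromRequest(req))
}
```

Because the header is optional and the URL is untouched, every registry type can share the same lookup without changing its endpoint.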

ghost avatar Jan 22 '24 16:01 ghost

I don't quite get why the API would have to be changed for automatic mapping using package metadata?

lafriks avatar Jan 22 '24 18:01 lafriks