PkgTemplates.jl

check if name is similar to existing package?

Open ericphanson opened this issue 4 years ago • 6 comments

One can use

using RegistryCI
using RegistryCI.AutoMerge
function is_far_from_existing_names(name)
    path_to_registry = joinpath(DEPOT_PATH[1], "registries", "General")
    all_pkg_names = AutoMerge.get_all_non_jll_package_names(path_to_registry)
    return AutoMerge.meets_distance_check(name, all_pkg_names)
end

to check if a proposed package name will pass AutoMerge's check about package names (as documented in the General registry's README, https://github.com/JuliaRegistries/General#automatic-merging-of-pull-requests).

I thought it could be helpful if PkgTemplates (perhaps optionally) runs this check at package-creation time. It could also support multiple registries (although obviously AutoMerge for General only looks at General, it can still be good to know if there are similar names in another registry you are using), e.g.

using RegistryCI
using RegistryCI.AutoMerge
function is_far_from_existing_names(name)
    names = String[]
    for registry in readdir(joinpath(DEPOT_PATH[1], "registries"); join=true)
        append!(names, AutoMerge.get_all_non_jll_package_names(registry))
    end
    return AutoMerge.meets_distance_check(name, names)
end
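
One caveat when scanning the `registries` folder: on newer Julia versions a registry may be stored as a packed tarball next to a `.toml` file instead of as an unpacked directory, so `readdir` can return entries that are not registry directories. A minimal sketch, assuming only unpacked registries should be scanned; the helper name `unpacked_registry_dirs` is hypothetical and not part of RegistryCI or Pkg:

```julia
# Hypothetical helper (not part of RegistryCI or Pkg): collect unpacked
# registry directories from every depot, not only DEPOT_PATH[1].
function unpacked_registry_dirs(depots::AbstractVector{<:AbstractString}=DEPOT_PATH)
    dirs = String[]
    for depot in depots
        root = joinpath(depot, "registries")
        isdir(root) || continue
        for entry in readdir(root; join=true)
            # Keep only directories that contain a Registry.toml; this skips
            # packed registries such as `General.tar.gz` plus `General.toml`.
            if isdir(entry) && isfile(joinpath(entry, "Registry.toml"))
                push!(dirs, entry)
            end
        end
    end
    return dirs
end
```

The resulting paths could then be passed one by one to `AutoMerge.get_all_non_jll_package_names` in the loop above. Packed registries are simply skipped in this sketch, so names from them would not be checked.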

(Perhaps we should add a real API for this in RegistryCI though, if it's going to be used programmatically.)

ericphanson avatar May 09 '21 20:05 ericphanson

I think this would be a good addition! On first thought, I'd say this check should always run (assuming it's fast enough) but only be a warning.
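
A rough sketch of the "warning only" behaviour: the check runs, a warning is logged on failure, and package creation is never blocked. The name `warn_if_name_too_similar` is hypothetical, and the return shape of the check (a `Bool`, or a tuple whose first element is a `Bool` and whose second is a message) is an assumption, not a documented RegistryCI API:

```julia
# Hypothetical helper. `check` is any function of the proposed name. It is
# ASSUMED to return either a Bool, or a tuple whose first element is a Bool
# and whose second element is an explanatory message.
function warn_if_name_too_similar(check, name::AbstractString)
    result = check(name)
    ok = result isa Tuple ? first(result) : result
    if !ok
        detail = result isa Tuple && length(result) > 1 ? string(result[2]) : ""
        @warn "Package name \"$name\" may be too similar to an existing package name." detail
    end
    return ok  # the caller continues with package creation either way
end
```

With the function from the first comment, this could be called as `warn_if_name_too_similar(is_far_from_existing_names, "MyNewPkg")`.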

Also I'd prefer RegistryCI have an API for this, so we can guarantee to always do what that does.

nickrobinson251 avatar May 09 '21 20:05 nickrobinson251

I'll note that having RegistryCI as a dependency of this package is rather unappealing... but I do like the idea.

christopher-dG avatar May 10 '21 01:05 christopher-dG

RegistryCI is a very heavy package with a lot of dependencies.

@ericphanson Maybe we could refactor the name-checking functionality out into a small package, e.g. JuliaRegistries/NameSquattingChecker.jl. Then RegistryCI.jl and PkgTemplates.jl would both depend on NameSquattingChecker.jl.

DilumAluthge avatar May 10 '21 01:05 DilumAluthge

Good point about the dependency. The name checking code actually isn’t super light because it uses various not-totally-trivial distance measures. The code in RegistryCI uses VisualStringDistances (which uses UnbalancedOptimalTransport and StaticArrays, but those only have stdlib deps themselves) and StringDistances (which uses Distances and StatsAPI, and those just use each other and stdlibs). So even just the name checking code is 6 total dependencies + stdlibs. So I totally get it if that’s too many for PkgTemplates.
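
For a sense of what a dependency-free pre-check could look like, here is a minimal sketch using plain Levenshtein edit distance written against Base only. This is a swapped-in, simpler technique and not the AutoMerge check: it does not reproduce the measures from StringDistances or VisualStringDistances, and the function names and the `cutoff` value are illustrative assumptions:

```julia
# Plain Levenshtein edit distance using only Base (two-row dynamic programming).
# NOTE: this is NOT the check RegistryCI runs; it is a simplified stand-in.
function levenshtein(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    prev = collect(0:length(t))   # distances from the empty prefix of `s`
    curr = similar(prev)
    for i in eachindex(s)
        curr[1] = i
        for j in eachindex(t)
            cost = s[i] == t[j] ? 0 : 1
            curr[j + 1] = min(prev[j + 1] + 1,   # deletion
                              curr[j] + 1,       # insertion
                              prev[j] + cost)    # substitution
        end
        prev, curr = curr, prev
    end
    return prev[end]
end

# Hypothetical pre-check: existing names within `cutoff` edits of `name`,
# compared case-insensitively. The default cutoff is illustrative only.
function close_names(name, existing; cutoff::Integer=2)
    return [n for n in existing if levenshtein(lowercase(name), lowercase(n)) <= cutoff]
end
```

A name that passes a rough check like this could still fail the real AutoMerge check, so it would only be useful as a cheap early hint.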

Also, while it’s pretty fast, it’s not instant; it can take a couple of seconds, which is a noticeable lag for interactive stuff. That’s because of the visual distance. (Every new package registration involves solving 100s of optimization problems comparing pixelated grids representing the names, in pure Julia, in CI, since it was introduced like 6 months ago, with no issues so far 😎)
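
One possible way to hide that lag in an interactive setting is to start the check in a background task and only wait for the result at the end of package generation. A minimal sketch under stated assumptions: `slow_check` is a stand-in that simulates the delay with `sleep`, and `create_with_background_check` is a hypothetical name, not a PkgTemplates function:

```julia
# Stand-in for the real name check; the sleep simulates the delay and the
# length test is a placeholder result, not a real similarity rule.
slow_check(name) = (sleep(0.2); length(name) >= 5)

# Hypothetical wrapper: run the check concurrently with the other work.
function create_with_background_check(check, name::AbstractString)
    task = Threads.@spawn check(name)   # start the check without blocking
    # ... the rest of package generation would happen here ...
    steps_done = ["generate files for $name"]
    ok = fetch(task)                    # wait for the check only at the end
    ok || @warn "Name check flagged \"$name\" as possibly too similar to an existing name."
    return (steps_done=steps_done, ok=ok)
end
```

Since the real check is CPU-bound, it would only overlap with other work when Julia is started with more than one thread; with a single thread this mostly just moves the wait to the end.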

ericphanson avatar May 10 '21 02:05 ericphanson

You know what could be cool? What if JuliaHub had a feature in which you put in a prospective package name, and then JuliaHub runs the name checking code for you? So then we don't need to add any dependencies to PkgTemplates.

@StefanKarpinski @viralbshah

DilumAluthge avatar May 10 '21 02:05 DilumAluthge

This could go into #360?

gdalle avatar Jul 31 '23 21:07 gdalle