igraphdata

All data must be under the same license

Open krlmlr opened this issue 8 months ago • 9 comments

Step 1: Find out whether we can remove graphs without breaking downstream packages.

https://github.com/search?q=org%3Acran+%28repo%3Acran%2FclustAnalytics+OR+repo%3Acran%2Fdodgr+OR+repo%3Acran%2Fhydra+OR+repo%3Acran%2Fig.degree.betweenness+OR+repo%3Acran%2Figraph+OR+repo%3Acran%2Finterplex+OR+repo%3Acran%2Finvertiforms+OR+repo%3Acran%2Fmultinets+OR+repo%3Acran%2Fnetplot+OR+repo%3Acran%2FrSpectral+OR+repo%3Acran%2Fsand+OR+repo%3Acran%2FScorePlus+OR+repo%3Acran%2Fvsp%29+%2Fenron%7Ckarate%7Cyeast%7Cukfaculty%7Ckite%7Cmacaco%7Ckoenigsberg%7Cusairports%7Cdolphins%7Cimmuno%7Cfoodweb%7Cpolblogs%2F+&type=code

R code to construct the query:

# Load necessary package
library(tools)

# Define the datasets from igraphdata
igraph_datasets <- c("enron", "karate", "yeast", "ukfaculty", "kite",
                     "macaco", "koenigsberg", "usairports", "dolphins",
                     "immuno", "foodweb", "polblogs")

# Identify reverse dependencies of igraphdata
igraphdata_dependencies <- package_dependencies("igraphdata", reverse = TRUE, which = "all")
igraphdata_packages <- unique(unlist(igraphdata_dependencies))

# Construct the GitHub search query using regex search
dataset_query <- paste0("/", paste(igraph_datasets, collapse = "|"), "/")
repo_query <- paste(paste0("repo:cran/", igraphdata_packages), collapse = " OR ")

github_query <- paste0("org:cran (", repo_query, ") ", dataset_query)

# Print the query
cat("GitHub Search Query:\n", github_query, "\n")

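# Check against an assumed URL length limit (2048 is a guess, not a verified GitHub value).
# The limit concerns the URL, so measure the percent-encoded query, not the raw string.
max_query_length <- 2048
encoded_query <- utils::URLencode(github_query, reserved = TRUE)
if (nchar(encoded_query) > max_query_length) {
  cat("\nWARNING: Query exceeds the assumed length limit. Consider splitting it.\n")
}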
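
If the query does turn out to be too long, one option is to split the reverse dependencies into batches and search each batch separately. The following is only a sketch: `build_queries()` and `query_urls()` are hypothetical helpers that are not part of the script above, the 2048 default is the same unverified limit, and the search URL shape is copied from the link in Step 1.

```r
# Hypothetical helper: split the package list into batches so that each
# raw query string stays at or below `max_len` characters (assumed limit).
build_queries <- function(packages, datasets, max_len = 2048) {
  dataset_query <- paste0("/", paste(datasets, collapse = "|"), "/")
  make_query <- function(pkgs) {
    repo_query <- paste(paste0("repo:cran/", pkgs), collapse = " OR ")
    paste0("org:cran (", repo_query, ") ", dataset_query)
  }

  queries <- character(0)
  batch <- character(0)
  for (pkg in packages) {
    candidate <- c(batch, pkg)
    if (length(batch) > 0 && nchar(make_query(candidate)) > max_len) {
      # Adding this package would overflow: close the current batch.
      queries <- c(queries, make_query(batch))
      batch <- pkg
    } else {
      batch <- candidate
    }
  }
  if (length(batch) > 0) queries <- c(queries, make_query(batch))
  queries
}

# Hypothetical helper: percent-encode each query and build a code-search URL.
query_urls <- function(queries) {
  if (length(queries) == 0) return(character(0))
  encoded <- vapply(queries, utils::URLencode, character(1),
                    reserved = TRUE, USE.NAMES = FALSE)
  paste0("https://github.com/search?q=", encoded, "&type=code")
}

# Usage with a small subset and an artificially low limit to force a split.
example_queries <- build_queries(c("sand", "netplot", "vsp"),
                                 c("karate", "yeast"), max_len = 60)
example_urls <- query_urls(example_queries)
cat(example_urls, sep = "\n")
```

A single package whose query alone exceeds `max_len` still ends up in its own batch, so the limit is best effort. Note also that `URLencode()` writes spaces as `%20` rather than the `+` seen in the link above.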

Step 2: Evaluate next steps

TBD.

krlmlr, Mar 13 '25 10:03