Component: Package Manager

Open StachuDotNet opened this issue 3 years ago • 2 comments

This issue organizes work around our package manager, and how we interact with it.

This is in a somewhat rapidly-changing space, so external contribution may be difficult at this time.

  • [ ] Track/compute whether user functions are pure
    • Once we can determine purity, we can remove the need for toplevels/fns that use pure fns to have their play buttons clicked for the relevant evals.
    • [ ] pre-step: when adding fns, we need a way of gathering what builtins are being used
    • [ ] and what other package fns are being used
    • [ ] and to compute the purity based on those
    • [ ] that dependency graph should be generally represented and stored there
  • [ ] figure out hashing and IDs
    • [ ] content-addressable
    • [ ] we maintain a chain of the parents
    • [ ] sandbox and GC for (unused, unreferenced, old, ?)’d package items
    • [ ] figure out (UX of) branches and PRs as part of this
  • [ ] package caching, perf
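
The purity tracking in the first item could be computed by walking the dependency graph. The following is a minimal F# sketch of that idea, not the actual implementation: the `Purity`, `Dep`, `PackageFn` and `computePurity` names, the string-keyed maps, and the builtin names in the usage lines are all assumptions made for illustration. A fn is only as pure as the least pure thing it (transitively) uses.

```fsharp
// Hypothetical types for illustration; not Dark's actual definitions.
type Purity =
  | Pure
  | ImpurePreviewable
  | Impure

/// Something a package fn's body refers to.
type Dep =
  | Builtin of string
  | PackageFnRef of string

/// A package fn together with its direct dependencies (one node of the graph).
type PackageFn = { name : string; deps : List<Dep> }

/// The less pure of two values: Impure beats ImpurePreviewable, which beats Pure.
let combine (a : Purity) (b : Purity) : Purity =
  match a, b with
  | Impure, _
  | _, Impure -> Impure
  | ImpurePreviewable, _
  | _, ImpurePreviewable -> ImpurePreviewable
  | Pure, Pure -> Pure

/// Purity of a package fn, derived from everything it (transitively) uses.
/// Unknown builtins and unknown fns are conservatively treated as Impure.
let rec computePurity
  (builtins : Map<string, Purity>)
  (fns : Map<string, PackageFn>)
  (visiting : Set<string>)
  (fnName : string)
  : Purity =
  if Set.contains fnName visiting then
    Pure // recursive reference: adds nothing beyond the fn's other deps
  else
    match Map.tryFind fnName fns with
    | None -> Impure
    | Some fn ->
      let purityOfDep (dep : Dep) : Purity =
        match dep with
        | Builtin name -> builtins |> Map.tryFind name |> Option.defaultValue Impure
        | PackageFnRef name -> computePurity builtins fns (Set.add fnName visiting) name
      fn.deps |> List.map purityOfDep |> List.fold combine Pure

// Usage, with made-up builtin and fn names
let builtins = Map.ofList [ ("Int.add", Pure); ("Date.now", ImpurePreviewable) ]
let fns =
  Map.ofList
    [ ("double", { name = "double"; deps = [ Builtin "Int.add" ] })
      ("stamp", { name = "stamp"; deps = [ PackageFnRef "double"; Builtin "Date.now" ] }) ]

let doublePurity = computePurity builtins fns Set.empty "double" // Pure
let stampPurity = computePurity builtins fns Set.empty "stamp" // ImpurePreviewable
```

The sketch recomputes on every call; storing the result next to each package fn, as the list suggests for the dependency graph, would avoid repeated walks.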

Notes that need a bit more processing:

  • [ ] caching
    • [ ] cache in memory, LRU ~50MB
    • [ ] (not in the initial version) the CLI could cache in ~/.darklang
    • [ ] could we use sqlite instead of the FS? would that be easier or harder?
    • [ ] add in-memory caching
    • [ ] add filesystem caching
  • [ ] long life caching headers
  • [ ] see if we can put a CDN in front of it
    • [ ] maybe via new CDN Dark package (probably something vendor-specific)
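
For the in-memory LRU item above, a size-bounded cache might look roughly like this. It is a sketch under assumptions: the `LruCache` type, its member names, and the caller-supplied `sizeOf` function are hypothetical, and how Dark would actually measure the size of a cached package item is not stated in these notes.

```fsharp
open System.Collections.Generic

/// A small LRU cache bounded by the total (approximate) size of its values.
/// `sizeOf` is supplied by the caller; this is an assumption of the sketch.
type LruCache<'k, 'v when 'k : equality>(maxBytes : int64, sizeOf : 'v -> int64) =
  let order = LinkedList<'k * 'v>() // most recently used entry is first
  let index = Dictionary<'k, LinkedListNode<'k * 'v>>()
  let mutable usedBytes = 0L

  /// Removes one entry and gives its bytes back to the budget.
  let evict (node : LinkedListNode<'k * 'v>) : unit =
    let (key, value) = node.Value
    order.Remove(node)
    index.Remove(key) |> ignore
    usedBytes <- usedBytes - sizeOf value

  member _.UsedBytes : int64 = usedBytes

  member _.TryGet(key : 'k) : Option<'v> =
    match index.TryGetValue key with
    | true, node ->
      // A hit makes this entry the most recently used one.
      order.Remove(node)
      order.AddFirst(node)
      Some(snd node.Value)
    | false, _ -> None

  member _.Set(key : 'k, value : 'v) : unit =
    match index.TryGetValue key with
    | true, existing -> evict existing
    | false, _ -> ()
    let entry : 'k * 'v = (key, value)
    let node = order.AddFirst(entry)
    index.[key] <- node
    usedBytes <- usedBytes + sizeOf value
    // Evict least recently used entries until we are back within budget.
    while usedBytes > maxBytes && order.Count > 0 do
      evict order.Last

// Usage: a budget of roughly 50MB, measuring strings by their length
let packageCache =
  LruCache<string, string>(50L * 1024L * 1024L, (fun (s : string) -> int64 s.Length))
packageCache.Set("some-package-item", "serialized contents")
```

Note that a value larger than the whole budget is evicted immediately rather than retained. A filesystem or SQLite layer, if added later, could sit behind `TryGet` as a fallback on a miss.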

StachuDotNet, Jun 21 '22 14:06

Copying a comment from Paul on #4162 ("Untangle ImpurePreviewable"):

Dark has a concept of purity for built-in functions:

type Pure = 
  Pure
  | Impure
  | ImpurePreviewable

The intent of Pure functions is that we don't need to store their analysis results, as we can recalculate them on the fly. We can't do that for Impure functions, however, as we might get a different result.

ImpurePreviewable is a set of functions that do not have side-effects, but also don't return the same result every time. Date::now is a good example. This is a useful annotation, as it could allow us to automatically execute the function when a function is added. This is all good, but ImpurePreviewable isn't the greatest name for this; perhaps we can come up with something better.

There is another related category which should be factored into this: functions which are Pure or ImpurePreviewable, but do not have a client-side implementation available (usually due to something missing in Blazor). We should probably mark them separately from Pure/Impure/ImpurePreviewable, either with another boolean flag (probably) or another category. Then we should think about how to use that info to give a better experience to users using these functions (perhaps they could execute automatically using the server if they're pure or previewable?).
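
As a rough illustration of the "another boolean flag" option, the sketch below keeps the three-case type as-is and adds a separate flag. The `BuiltinFnInfo` record, the `hasClientImpl` field, the `ExecutionPlan` type and the `planFor` function are hypothetical names, and the decision table is only one possible reading of the idea of falling back to the server.

```fsharp
type Pure =
  | Pure
  | Impure
  | ImpurePreviewable

/// Hypothetical metadata for a builtin: purity plus a separate client-side flag.
type BuiltinFnInfo = { name : string; purity : Pure; hasClientImpl : bool }

/// What the editor could do when a call to the fn is added.
type ExecutionPlan =
  | RunOnClient // safe to run automatically, and a client-side impl exists
  | RunOnServer // safe to run automatically, but only the server can run it
  | WaitForUser // side-effects possible: keep the play button

let planFor (fn : BuiltinFnInfo) : ExecutionPlan =
  match fn.purity, fn.hasClientImpl with
  | Impure, _ -> WaitForUser
  | (Pure | ImpurePreviewable), true -> RunOnClient
  | (Pure | ImpurePreviewable), false -> RunOnServer

// Usage: Date::now has no side-effects, so it can run without a click
let dateNow = { name = "Date::now"; purity = ImpurePreviewable; hasClientImpl = false }
let plan = planFor dateNow // RunOnServer
```

Keeping the flag separate from the purity type means the purity cases stay about side-effects only, while availability on the client can change independently.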

StachuDotNet, Jan 14 '24 23:01

I've been focused on the PM lately, and wanted to gather some thoughts. This comment is a WIP.

The package manager needs some work, towards something that is stable, write-only, syncable, fast, ready for editing (paired with our language server), and offline-ready (for local installs).

Here's the desired end result, as I see it:

  • all name-resolution is done at parse-time
  • (no name-resolution happens after we're in ProgramTypes, incl in the Interpreter)
  • all references (in PT, RT) to package items are via identifiers to package items (for now, just tlids. later, hashes of the contents of those package items)
    • so, both PT and RT should be updated to reflect this
      • e.g., related to PT Package Types...
        • PT.FQTypeName.Package is updated
          • from
            • type Package = { owner : string; modules : List<string>; name : string }
            • | Package of Package
          • to | Package of tlid
        • the old PT.FQTypeName.Package structure is moved to the PT.PackageType module, as type Name
      • same for the other things in PT, RT, ST, etc.
  • instead of one, we have two separate Package Manager concepts:
    • the run-time package-manager ("RT PM") used by the Interpreter/Execution
      • takes an ID (and/or future, hash) and returns RT.PackageType, .PackageConst, or .PackageFn
      • this just needs to return the implementation - no name, or metadata stuff
      • can be really fast with a thin cache - it's not doing much
    • the program-time package-manager ("PT PM")
      • takes an ID (and/or future, hash) and returns PT.PackageType, .PackageConst, or .PackageFn
      • takes a name and returns PT.PackageType, .PackageConst, or .PackageFn
  • the NameResolver should live on-top of the dev-time (PT) PackageManager
  • when errors are encountered at run-time (incl if somehow we can't find an item), or if we otherwise need to display the name of some running/ran function, then we take the ID/hash that's been used, and fetch the name from the PT PM
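
To make the split above more concrete, here is a rough F# sketch covering package types only. Everything beyond the `{ owner; modules; name }` shape and the `Package of tlid` case is an assumption: the record and field names, the placeholder `definition` string, the `uint64` alias for `tlid`, and the in-memory list standing in for the DB are all hypothetical.

```fsharp
type tlid = uint64

/// The old FQTypeName.Package shape, which becomes just a name.
type PackageTypeName = { owner : string; modules : List<string>; name : string }

/// After the change, a reference to a package type is only an ID.
type FQTypeName =
  | Package of tlid

/// What the PT PM stores: name and metadata included.
type PTPackageType = { id : tlid; fullName : PackageTypeName; description : string }

/// What the RT PM returns: the implementation only (a placeholder string here).
type RTPackageType = { definition : string }

/// "RT PM": ID in, implementation out.
type RTPackageManager = { getType : tlid -> Option<RTPackageType> }

/// "PT PM": lookup by ID or by name.
type PTPackageManager =
  { getTypeByID : tlid -> Option<PTPackageType>
    findType : PackageTypeName -> Option<PTPackageType> }

let nameToString (n : PackageTypeName) : string =
  String.concat "." ([ n.owner ] @ n.modules @ [ n.name ])

/// For error messages: turn the ID used at run-time back into a name via the PT PM.
let displayName (ptPM : PTPackageManager) (FQTypeName.Package id) : string =
  match ptPM.getTypeByID id with
  | Some t -> nameToString t.fullName
  | None -> sprintf "<unknown package type %d>" id

/// Both PMs over one in-memory list; a stand-in for the DB and any caches.
let inMemoryPMs (types : List<PTPackageType * RTPackageType>) =
  let byID (id : tlid) = types |> List.tryFind (fun (p, _) -> p.id = id)
  let pt =
    { getTypeByID = (fun id -> byID id |> Option.map fst)
      findType =
        (fun name ->
          types |> List.tryFind (fun (p, _) -> p.fullName = name) |> Option.map fst) }
  let rt = { getType = (fun id -> byID id |> Option.map snd) }
  (pt, rt)

// Usage, with a made-up package type
let optionName = { owner = "Darklang"; modules = [ "Stdlib" ]; name = "Option" }
let (ptPM, rtPM) =
  inMemoryPMs
    [ ({ id = 1UL; fullName = optionName; description = "optional value" },
       { definition = "Some | None" }) ]
let shown = displayName ptPM (Package 1UL) // "Darklang.Stdlib.Option"
```

The Interpreter would only ever hold `rtPM`; anything that needs a human-readable name goes through `ptPM`, which matches the last bullet above.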

Phases of work to do:

  • [ ] tidy things; remove version #s; use Builtins instead of the "Entries" DB (PR #TODO)
  • [ ] create the PT PM
  • [ ] ? update the toStrings of PT and RT to be mutable, set in the CLI and Cloud runtimes (see the first bullet in the section below)
  • [ ] do name-resolution all at parse-time; refactor PT, RT (etc) to reference things accordingly
  • [ ] add hashes of things to the DB and elsewhere
  • [ ] use those hashes
  • [ ] embed SQLite in the CLI exe
  • [ ] set up CLI PM; cache in SQLite DB(s)
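
For the two hashing phases, one possible shape is sketched below. It assumes SHA-256 over a serialized form of the item and folds the parent's hash into the new hash (similar to how Git commits work), so that reverting to earlier contents still yields a new ID. Neither choice is specified in this issue, and the `PackageItemVersion` record and function names are hypothetical.

```fsharp
open System.Security.Cryptography
open System.Text

/// Hypothetical stored form of one version of a package item.
type PackageItemVersion = { hash : string; parent : Option<string>; contents : string }

/// Lowercase hex SHA-256 of a UTF-8 string.
let sha256Hex (input : string) : string =
  use sha = SHA256.Create()
  let bytes : byte[] = Encoding.UTF8.GetBytes(input)
  sha.ComputeHash(bytes)
  |> Array.map (fun (b : byte) -> b.ToString("x2"))
  |> String.concat ""

/// Creates a version whose ID is derived from its contents and its parent's ID.
let newVersion (parent : Option<PackageItemVersion>) (contents : string) : PackageItemVersion =
  let parentHash = parent |> Option.map (fun p -> p.hash)
  let hash = sha256Hex ((parentHash |> Option.defaultValue "") + "\n" + contents)
  { hash = hash; parent = parentHash; contents = contents }

/// Walks the chain of parents, newest first. Stops at a hash missing from the store.
let rec history (store : Map<string, PackageItemVersion>) (hash : string) : List<string> =
  match Map.tryFind hash store with
  | None -> []
  | Some v ->
    match v.parent with
    | Some parentHash -> v.hash :: history store parentHash
    | None -> [ v.hash ]

// Usage, with made-up contents
let v1 = newVersion None "let double x = x + x"
let v2 = newVersion (Some v1) "let double x = x * 2"
let store = Map.ofList [ (v1.hash, v1); (v2.hash, v2) ]
let chain = history store v2.hash // [ v2.hash; v1.hash ]
```

If items should instead share an ID whenever their contents match, regardless of history, the parent would be kept out of the hash input and tracked in a separate table.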

Some Questions, TODOs, and Hmms:

  • (biggest blocker) deal with: toStringing of package names, at run-time.
    • we could do this in a terrible way: mutable toString-ers at top of PT, RT (a few id -> strings)
    • a few options
      • do a lot of work to make RTEs populate the hashes, and do toString-ing always in Dark
      • magical mutable fns at the top of RT
      • pass in some id -> string stuff in executionState, which (secretly) uses the PT PM
      • hmm, the RT PM could have some getNameForType (id: tlid): string kinda fns, which (again, secretly) use the PT PM
  • naming things is hard...
    • so far, FQTypeName's name has made sense - it was either Builtin {name, version} or Package {owner, modules, name, version}.
    • but now, when it references a Package type, it'll be just an ID. That's not quite a name any more, is it?
    • should we rename it to something like FQTypeReference or CustomTypeReference?
  • regarding the NameResolver
    • right now we only return exact, full matches. (how) should we satisfy near-matches, to provide suggestions to the user?
    • right now we only return one match, if any. it feels useful to potentially return multiple matches, in the case that they exist (so devs can help 'clarify' in the editor)
  • we need to deal with bulk-fetching of packages (fetch the dependencies, not just the thing we asked for)
    • and then using some metrics to only fetch the dependencies we'll really need
    • maybe we have two sets of endpoints: type/by-name/:name and type/by-name/:name/with-dependencies (the latter returning some {types,consts,fns} collection)
  • should we have another set of tables to store the PT PM and the RT PM? (I think just expanding upon the existing package_type_v0 type tables, not adding new ones, but open to hearing otherwise)
  • should we enable/encourage access/exploration of the PM .sqlite DBs? I can't think of a need, but it fits the OSS model
  • does dark-packages eventually get renamed (back) to dark-editor?
    • I suspect the lines between the PM and other things will blend together a bit
    • in the shorter term, name-resolution should be available over HTTP (I think?), and that's not quite "PM"
  • the Language Server (incl in-browser) probably also needs some form of a local copy of the PM
  • we need caching in dark-packages (rather than always fetching from DB)
  • TODO: locally, offline, CLI installs need a copy of both (for runtime and dev-time)
    • ok the dev-time one is optional I guess
    • deal with syncing
  • let's say I have a NAS where I want to store private packages for my local network - how?
    • do this without allowing other folks to host central PMs
  • need to figure out how this all actually integrates with... actually editing/developing code?
  • do we really need package names in RT at all? similarly, some of the details of RT.PackageType etc could be thrown out -- that stuff only needs to exist in PT. OK maybe that's not true - deprecated is needed. but description could be gone!
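
On the NameResolver questions above (near-matches and multiple matches), one way the result type could be widened is sketched here. This is only an illustration: the `Resolution` type and the `resolve` function are hypothetical, "exact" is taken to mean that the dotted query matches the end of the full name, and "near" is taken to mean the final segment matches when case is ignored. The actual matching rules are an open question in this thread.

```fsharp
open System

type tlid = uint64
type PackageName = { owner : string; modules : List<string>; name : string }

type Resolution =
  | Exact of List<tlid> // one or more items match the query as written
  | Suggestions of List<PackageName> // nothing matches exactly; these come close
  | NotFound

let segments (n : PackageName) : List<string> = [ n.owner ] @ n.modules @ [ n.name ]

/// True if `suffix` matches the end of `full`.
let endsWith (suffix : List<string>) (full : List<string>) : bool =
  List.length suffix <= List.length full
  && List.skip (List.length full - List.length suffix) full = suffix

let resolve (known : List<PackageName * tlid>) (query : string) : Resolution =
  let querySegments = query.Split([| '.' |]) |> List.ofArray
  let exact = known |> List.filter (fun (n, _) -> endsWith querySegments (segments n))
  if not (List.isEmpty exact) then
    Exact(List.map snd exact)
  else
    let last = List.last querySegments
    let near =
      known
      |> List.filter (fun (n, _) ->
        String.Equals(n.name, last, StringComparison.OrdinalIgnoreCase))
      |> List.map fst
    if List.isEmpty near then NotFound else Suggestions near

// Usage, with made-up package fns
let listMap = { owner = "Darklang"; modules = [ "Stdlib"; "List" ]; name = "map" }
let optionMap = { owner = "Darklang"; modules = [ "Stdlib"; "Option" ]; name = "map" }
let known = [ (listMap, 1UL); (optionMap, 2UL) ]

let one = resolve known "Option.map" // Exact [ 2UL ]
let many = resolve known "map" // Exact [ 1UL; 2UL ]
let near = resolve known "Option.Map" // Suggestions [ listMap; optionMap ]
```

Returning every exact match lets the editor ask the dev to clarify, and the `Suggestions` case gives it something to show when nothing matches.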

StachuDotNet, May 16 '24 13:05