
epic: Better files & links

Open freelerobot opened this issue 1 year ago • 8 comments

Motivation

  • Users (especially on Windows) have distinct preferences for how to store and share assets
  • e.g. model binaries, model configs, and, in the future, assistant tar.gzs
  • Placing app files in the home directory messes up people's dev environments (#1358)

Community request

  • #1358

Specs

Users can change default Jan app data location

  • Give users control over how their own fs is used
  • Good UX around error handling, interrupted migrations

Users can restore to a clean build

  • If users mess up the /jan folder structure, it should be recoverable

Users can thoroughly delete Jan & purge data

  • No dangling files.
  • No running various sh scripts to clean up data.

Open Questions

  1. Where, what, and why are we currently storing data?

macOS

  • ~/jan : janroot
  • ~/Library/Application Support/jan : cookies and session data for the Electron app (Chromium)
  • ~/Library/Caches/jan-updater, ~/Library/Caches/jan.ai.app, ~/Library/Caches/jan.ai.app.ShipIt : cache

Windows

  • C:\Users\%USERPROFILE%\jan : janroot
  • C:\Users\%USERPROFILE%\AppData\Local\Programs\jan : app install folder, removed when the app is uninstalled
  • C:\Users\%USERPROFILE%\AppData\Roaming\jan : cache folder for the Jan app
  • C:\Users\%USERPROFILE%\AppData\Local\electron : cache folder generated by Electron (Chromium related)

Linux

  • ~/jan : janroot
  • ~/.config/jan : cache
  • ~/.npm/_cacache : cache folder generated on Linux when the TypeScript function getTempCache is called

  2. Where should this data go?
  • User data & Jan specific assets, e.g. the model/assistant/thread.jsons
  • User data & shared assets, e.g. model binaries, RAG files (which could be shared across other apps)
  • Application assets, e.g. cache, logs, and other app data

3. What should the reset & deletion behavior be for these files?

  • How does it vary across Systems? Mac, Windows, Linux.
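To make the reset and purge question concrete, the locations inventoried under question 1 can be written down as data. This is only a sketch: `janLocations`, `JanLocation`, and the `purpose` labels are hypothetical names, not part of Jan's codebase, and the paths are simply the ones reported in this thread. Note that ~/.npm/_cacache is npm's own cache directory, so a real purge should probably not delete it wholesale.

```typescript
// Hypothetical inventory of the folders reported in this thread, keyed by platform.
type Platform = "darwin" | "win32" | "linux";

interface JanLocation {
  path: string;
  purpose: string;
}

// `home` is the user's home directory (on Windows, the value of %USERPROFILE%).
function janLocations(platform: Platform, home: string): JanLocation[] {
  const sep = platform === "win32" ? "\\" : "/";
  const at = (...parts: string[]): string => [home, ...parts].join(sep);

  switch (platform) {
    case "darwin":
      return [
        { path: at("jan"), purpose: "janroot" },
        { path: at("Library", "Application Support", "jan"), purpose: "Electron cookies and session" },
        { path: at("Library", "Caches", "jan-updater"), purpose: "cache" },
        { path: at("Library", "Caches", "jan.ai.app"), purpose: "cache" },
        { path: at("Library", "Caches", "jan.ai.app.ShipIt"), purpose: "cache" },
      ];
    case "win32":
      return [
        { path: at("jan"), purpose: "janroot" },
        { path: at("AppData", "Local", "Programs", "jan"), purpose: "install folder" },
        { path: at("AppData", "Roaming", "jan"), purpose: "app cache" },
        { path: at("AppData", "Local", "electron"), purpose: "Electron/Chromium cache" },
      ];
    case "linux":
      return [
        { path: at("jan"), purpose: "janroot" },
        { path: at(".config", "jan"), purpose: "cache" },
        // Shared with npm itself; listed here only because the thread reports it.
        { path: at(".npm", "_cacache"), purpose: "cache created by getTempCache" },
      ];
  }
}
```

A purge or reset routine could iterate over this list and decide per entry whether to delete, keep, or ask the user, which is where the per-platform differences would show up.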

Tasklist

Design

  • [x] #1010
  • [x] #1052

Implementation

  • [ ] #1435
  • [x] #1618
  • [x] #1619
  • [ ] #1621
  • [ ] #1620

Not in Scope

  • Letting users change Jan specific subfolder paths for /models, /assistants, rag attachments, that contain the jsons
  • Exclude files, block Jan from accessing certain paths
  • A full Obsidian vault inspired UX: https://help.obsidian.md/Files+and+folders/Manage+vaults
  • Everything else. Let's scope this tightly pls.

Appendix

I'm inspired by Obsidian's philosophy of letting users manage how their filesystem is used. Sidenote (nonurgent): We should strive to evolve our SDK/fs wrapper towards this level of user ownership and flexibility.


freelerobot avatar Jan 10 '24 05:01 freelerobot

@hiento09 to list down the app structure

imtuyethan avatar Jan 15 '24 07:01 imtuyethan

I have scanned my Windows machine. Below are the folders generated by the Jan app on Windows:

  • C:\Users\%USERPROFILE%\jan : janroot
  • C:\Users\%USERPROFILE%\AppData\Local\Programs\jan : app install folder, removed when the app is uninstalled
  • C:\Users\%USERPROFILE%\AppData\Roaming\jan : cache folder for the Jan app
  • C:\Users\%USERPROFILE%\AppData\Local\electron : cache folder generated by Electron (Chromium related)

hiento09 avatar Jan 16 '24 04:01 hiento09

macOS:

  • ~/jan : janroot
  • ~/Library/Application Support/jan : cookies and session data for the Electron app (Chromium)
  • ~/Library/Caches/jan-updater, ~/Library/Caches/jan.ai.app, ~/Library/Caches/jan.ai.app.ShipIt : cache

hiento09 avatar Jan 16 '24 06:01 hiento09

Linux:

  • ~/jan : janroot
  • ~/.config/jan : cache
  • ~/.npm/_cacache : cache folder generated on Linux when the TypeScript function getTempCache is called

hiento09 avatar Jan 16 '24 06:01 hiento09

Linux:

  • ~/jan : janroot
  • ~/.config/jan : cache
  • ~/.npm/_cacache : cache folder generated on Linux when the TypeScript function getTempCache is called

Note for Linux (at least on Debian-based distributions): rather than ~/jan, apps usually store their default profile folder (i.e. "janroot") in ~/.config/jan. In that case the profile folder might be ~/.config/jan or ~/.config/jan/profile, and the cache ~/.config/jan/cache. A few apps instead use a hidden folder in the home directory, ~/.jan (the leading dot makes it hidden by default on Linux systems).

Non-hidden folders in the home folder are mostly user data folders (Documents, Pictures, Videos, etc.), so finding ~/jan there is not conventional. One could, however, treat the models folder as user data (I guess that only makes sense if the models can be shared among various applications) and split it from the jan folder like this: ~/IA Models, ~/.jan (or ~/.config/jan), ~/.config/jan/cache and ~/.npm/_cacache.

oleole39 avatar Jan 16 '24 14:01 oleole39

Passing along comments from Nicole.


Consider the scenarios:

A) 1542: Scanning an entire folder for multiple files

  1. User scans /random_path/models, which contains LlamaCorn.gguf (and other models)
  2. LlamaCorn (and other models) show up in the app, yay
  3. Correct me if I'm wrong: at this point, the corresponding model.json is autogenerated and placed in the Jan App Directory
  4. Users can edit this App Dir > model.json to change the default Jan configs for LlamaCorn

B) 1382: Configuring 1 specific file (which is actually a subscenario of A)

  1. Rather than scanning an entire folder, the user wants to explicitly add only 1 model in /random_path/model to Jan
  2. In this case, the user creates a model.json in the Jan App Directory and sets the correct source_url / reference / symlink
  3. LlamaCorn shows up in the app, yay

Note: the model file in A.3 is the same as in B.2.

So the 2 issues likely share implementation details, but they are very different use cases and user flows. We should allow this level of flexibility and freedom when it comes to folder management.
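A rough sketch of how scenarios A and B could share one code path. Everything here is hypothetical: the thread only names `source_url` as a model.json field, so the `id` field, the `ModelJson` type, and the function names are illustrative assumptions rather than Jan's actual schema or API.

```typescript
// Hypothetical model.json shape: only `source_url` is named in the thread.
interface ModelJson {
  id: string;
  source_url: string;
}

// Scenario B: build the model.json content for one explicitly chosen file.
function modelJsonFor(filePath: string): ModelJson {
  // Take the last path segment, accepting both "/" and "\" separators.
  const fileName = filePath.split(/[\\/]/).pop() ?? filePath;
  const id = fileName.replace(/\.gguf$/i, "");
  return { id, source_url: filePath };
}

// Scenario A: given a folder and its file listing, build one entry per .gguf file.
// Reading the directory is left to the caller so the sketch stays free of I/O.
function scanFolder(folder: string, fileNames: string[]): ModelJson[] {
  return fileNames
    .filter((name) => name.toLowerCase().endsWith(".gguf"))
    .map((name) => modelJsonFor(`${folder}/${name}`));
}
```

Under these assumptions, scanning a folder is just the single-file case applied to every matching file, which matches the note that A.3 and B.2 produce the same model file.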

imtuyethan avatar Jan 16 '24 14:01 imtuyethan

Jan should never create ~/jan, nor should it ever create ~/.jan, as this defeats the purpose of this discussion (satisfying user preferences and platform conventions).

Jan could test whether the environment variable $JAN_HOME or $XDG_DATA_HOME is set and, if not, fall back to ~/.local/share. Then it could check for janroot and create it if it does not exist. It could do the same with other platforms' environment variables. This could go in startup code that runs before anything else. If setting a custom location in the GUI is preferred, Jan could place a file or symlink in its config directory that points to where to look first. If that target folder does not exist, fall back to the environment variable methods.
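The lookup order described above could be sketched roughly as follows. This is an illustration, not Jan's implementation: `resolveJanRoot`, `ResolveOptions`, and `pointerTarget` are hypothetical names, treating $JAN_HOME as the janroot itself is one possible reading of the proposal, and the `jan` subfolder name under the data directory is an assumption. The environment and the existence check are passed in so the function stays pure and easy to test.

```typescript
type Env = Record<string, string | undefined>;

interface ResolveOptions {
  env: Env;                          // e.g. process.env in an Electron main process
  home: string;                      // the user's home directory
  pointerTarget?: string;            // location chosen in the GUI, read from Jan's config dir (hypothetical)
  exists: (path: string) => boolean; // e.g. fs.existsSync
}

function resolveJanRoot(opts: ResolveOptions): string {
  const { env, home, pointerTarget, exists } = opts;

  // 1. A GUI-configured location wins, but only if the target folder still exists.
  if (pointerTarget && exists(pointerTarget)) {
    return pointerTarget;
  }

  // 2. An explicit JAN_HOME is used as-is (assumption: it names the janroot directly).
  const janHome = env["JAN_HOME"];
  if (janHome) {
    return janHome;
  }

  // 3. Otherwise use $XDG_DATA_HOME, falling back to ~/.local/share.
  const dataHome = env["XDG_DATA_HOME"] || `${home}/.local/share`;
  return `${dataHome}/jan`;
}
```

Creating the folder when it is missing is left to the caller, which would run this once in startup code before anything else touches the filesystem.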

REALERvolker1 avatar Jan 16 '24 16:01 REALERvolker1

+1 on not using ~/jan or ~/.jan on Linux. Very, very, very bad practice.

JamesMowery avatar Jan 22 '24 20:01 JamesMowery

Everything is on main now & ready to be released tomorrow yayayay, good work everyone!!

imtuyethan avatar Jan 31 '24 07:01 imtuyethan