Outline of docs structure
Opening for discussion. We recently finished moving all the remote info from the command reference to distinct user guides. I think it's way more organized now, but I worry that it's too deeply nested inside user-guide->data-management->remote storage.
We know that this is some of the most frequently visited info, and so is user-guide->project structure->dvc.yaml files. Can we create a top-level reference section that includes remotes, project structure, CLI, and Python references? I think it would help make these pages easier to find and create a cleaner separation between more narrative guides and technical reference material.
Proposed structure would look like:
- User guide
  - Data management
  - Pipelines
  - Experiment management ...
- Reference
  - Project structure
  - Remote config
  - Commands
  - Python API
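To make the nesting concern concrete, here is a minimal sketch that models both layouts as nested dictionaries and reports how many levels deep a page sits. The dictionaries and the `depth_of` helper are hypothetical and only mirror the page names mentioned in this thread; they are not the actual sidebar configuration of the docs site.

```python
# Hypothetical model of the two layouts; page names follow this discussion,
# not the real sidebar definition of the docs site.
current = {
    "User guide": {
        "Data management": {"Remote storage": {}},
        "Project structure": {"dvc.yaml files": {}},
    },
}

proposed = {
    "User guide": {
        "Data management": {},
        "Pipelines": {},
        "Experiment management": {},
    },
    "Reference": {
        "Project structure": {},
        "Remote config": {},
        "Commands": {},
        "Python API": {},
    },
}


def depth_of(tree, target, level=1):
    """Return the 1-based nesting level of the first page named `target`, or None."""
    for name, children in tree.items():
        if name == target:
            return level
        found = depth_of(children, target, level + 1)
        if found is not None:
            return found
    return None


print(depth_of(current, "Remote storage"))   # 3
print(depth_of(proposed, "Remote config"))   # 2
```

Under these assumptions the remote pages move up one level, which is the gain the proposal is after.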
Sounds good to me, @dberenbaum !
Project structure should become more like a Project file or something? I don't like this name, but can't come up with something better ... dvc.yaml reference?
Remote config - I wonder if we should do one Config reference - include remotes there (w/o additional level, all things about remotes and config on the same level).
> Project structure should become more like a Project file or something? I don't like this name, but can't come up with something better ... dvc.yaml reference?
Agreed, I think we need to make it better than project structure, which sounds more like we are giving suggestions on how to organize your project. Right now, it includes more than dvc.yaml but also any other DVC-managed files.
> Remote config - I wonder if we should do one Config reference - include remotes there (w/o additional level, all things about remotes and config on the same level).
Makes sense. Would we want subpages for each remote like we now have in the guide? Then it becomes as nested as it is now, so now I'm second-guessing this.
I think we can keep it:
Reference
  Config
    SSH
    Google Drive
    ...
    Cache
    [Other config ....]
wdyt?
It could work, or it might be too busy.
Let's put this as a p2 for now. We have already moved this info around a lot. I'd rather focus on get started and new content focused on basic workflows than on moving around the existing content.
Another benefit would be to have somewhere to mention environment variables
Bumping the importance here but also want to consider other changes to consolidate the dvc/mlops docs (dvc, dvclive, studio, vs code)
Prioritizing Integrations
Integrations is a section with its own importance in the docs in B, something that has proven important when including a tool in a fast-track scenario. In DVC, I have to go to
DVCLive > ML Frameworks. CONCLUSION: [...]
Originally posted by @SoyGema in https://github.com/iterative/dvc.org/issues/4919#issuecomment-1762823346
Consider moving integrations to a top-level section if we do this reorg
Combining with the ideas in https://github.com/iterative/dvc.org/issues/5153, we can use this ticket to discuss the overall structure of the docs. Here's a proposal of an ideal state to work towards:
- Get started
  - Data pipelines
  - Experiment tracking
  - Model registry
- Use cases (still not sure where to put these; should they be outside docs? before get started?)
  - Versioning data and models
  - CI/CD for machine learning (generalize beyond CI/CD to automated ML workflows? update to reference studio cloud compute?)
  - Fast and secure data caching hub (analytics show it's unpopular; drop/rename/replace in the future with use case about using studio for secure data access?)
  - Experiment tracking
  - Model registry
  - Data registry
- User guide
  - Data
  - Pipelines
  - Experiments
  - Models
- Reference
  - Studio UI
  - Studio REST API
  - DVC Configuration
  - DVC Commands
  - DVC Python API
  - DVCLive Python API
  - GTO Commands