Outline of docs structure

Open dberenbaum opened this issue 2 years ago • 9 comments

Opening for discussion. We recently finished moving all the remote info from the command reference to distinct user guides. I think it's way more organized now, but I worry that it's too deeply nested inside user-guide->data-management->remote storage.

We know that this is some of the most frequently visited info, and so is user-guide->project structure->dvc.yaml files. Can we create a top-level reference section that includes remotes, project structure, CLI, and Python references? I think it would help make these pages easier to find and create a cleaner separation between more narrative guides and technical reference material.

Proposed structure would look like:

  • User guide
    • Data management
    • Pipelines
    • Experiment management ...
  • Reference
    • Project structure
    • Remote config
    • Commands
    • Python API

dberenbaum avatar Mar 11 '23 13:03 dberenbaum

Sounds good to me, @dberenbaum !

shcheklein avatar Mar 11 '23 18:03 shcheklein

Project structure should become more like a Project file or something? I don't like this name, but can't come up with something better ... dvc.yaml reference?

Remote config - I wonder if we should do one Config reference - include remotes there (w/o additional level, all things about remotes and config on the same level).

shcheklein avatar Mar 11 '23 18:03 shcheklein

Project structure should become more like a Project file or something? I don't like this name, but can't come up with something better ... dvc.yaml reference?

Agreed, I think we need to make it better than project structure, which sounds more like we are giving suggestions on how to organize your project. Right now, it covers not only dvc.yaml but also any other DVC-managed files.

Remote config - I wonder if we should do one Config reference - include remotes there (w/o additional level, all things about remotes and config on the same level).

Makes sense. Would we want subpages for each remote like we now have in the guide? Then it becomes as nested as it is now, so now I'm second-guessing this 😅.

dberenbaum avatar Mar 11 '23 19:03 dberenbaum

I think we can keep it:

Reference
   Config
       SSH
       Google Drive
       ...
       Cache 
       [Other config ....]

wdyt?

shcheklein avatar Mar 11 '23 20:03 shcheklein

It could work, or it might be too busy.

Let's put this as a p2 for now. We have already moved this info around a lot. I'd rather focus on get started and new content focused on basic workflows than on moving around the existing content.

dberenbaum avatar Mar 13 '23 15:03 dberenbaum

Another benefit would be to have somewhere to mention environment variables

dberenbaum avatar Apr 13 '23 18:04 dberenbaum

Bumping the importance here but also want to consider other changes to consolidate the dvc/mlops docs (dvc, dvclive, studio, vs code)

dberenbaum avatar Jul 18 '23 18:07 dberenbaum

Prioritizing Integrations

Integrations is a section with its own importance in the docs in B, something that has proven important when including a tool in a fast-track scenario. In DVC, I have to go to DVCLive > ML Frameworks.

π™²π™Ύπ™½π™²π™»πš„πš‚π™Έπ™Ύπ™½: πšπš‘πšŽ πšŒπš˜πš—πšπš›πš’πš‹πšžπšπš˜πš› πš›πšŽπšŒπš˜πš–πš–πšŽπš—πšπšœ 𝚝𝚘 πšπš‘πšŽ πš–πšŠπš’πš—πšπšŠπš’πš—πšŽπš› 𝚝𝚘 πš›πšŽπšŒπš˜πšπš—πš’πš£πšŽ πš’πš—πšπšŽπšπš›πšŠπšπš’πš˜πš—πšœ 𝚊𝚜 πš™πšŠπš›πš 𝚘𝚏 πšπš‘πšŽ 𝚏𝚊𝚜𝚝-πšπš›πšŠπšŒπš” πšœπšŒπšŽπš—πšŠπš›πš’πš˜, πšŠπš—πš πš™πš›πš’πš˜πš›πš’πšπš’πš£πšŽ 𝚊𝚌𝚌𝚎𝚜𝚜 πšŠπšŒπšŒπš˜πš›πšπš’πš—πšπš•πš’.

Originally posted by @SoyGema in https://github.com/iterative/dvc.org/issues/4919#issuecomment-1762823346

Consider moving integrations to a top-level section if we do this reorg

dberenbaum avatar Feb 09 '24 21:02 dberenbaum

Combining with the ideas in https://github.com/iterative/dvc.org/issues/5153, we can use this ticket to discuss the overall structure of the docs. Here's a proposal of an ideal state to work towards:

  • Get started
    • Data pipelines
    • Experiment tracking
    • Model registry
  • Use cases (still not sure where to put these; should they be outside docs? before get started?)
    • Versioning data and models
    • CI/CD for machine learning (generalize beyond CI/CD to automated ML workflows? update to reference studio cloud compute?)
    • Fast and secure data caching hub (analytics show it's unpopular; drop/rename/replace in the future with use case about using studio for secure data access?)
    • Experiment tracking
    • Model registry
    • Data registry
  • User guide
    • Data
    • Pipelines
    • Experiments
    • Models
  • Reference
    • Studio UI
    • Studio REST API
    • DVC Configuration
    • DVC Commands
    • DVC Python API
    • DVCLive Python API
    • GTO Commands

dberenbaum avatar Feb 23 '24 21:02 dberenbaum