What are the most important tutorials that should exist on our docs? Where should the rest of the tutorials go?
What is it?
Do some research and propose a list of data science tutorials using pyoso that cover some of the most common or sought-after data techniques. Focus on being concise, interesting, and exhaustive, targeting users who are new to data science and looking to get started.
Per today's conversation, we should think about refactoring the tutorials page. The main idea is to define the "core" or "most important" topics that will remain as the tutorials shown on the OSO docs page, and then direct everything else to the Colab community/insights repo.
I put together a PR where I refactored the tutorial page a little: https://github.com/opensource-observer/oso/pull/3888
When brainstorming what the "core" topics should be, I realized that it really depends on the target user, so I created 4 light personas and gave each 3 "core" topics. Let me know what you think @ccerv1 @ryscheng
These changes should work nicely with the GitHub Actions workflows I've pushed, as we move people towards adding to the Colab community and insights repo.
Some fun ideas for tutorials to make down the line (these will exist in the Colab community) - I'll just save them here for now:
- Clustering (I can use the work I did for EF)
- Repo categorization w/ LLM
- Survival Analysis of OSS Projects
- Agent built on your repo (as a knowledge base) that you can chat with
- RL-Based Grant Allocation Simulator
- Sentiment Analysis (Issue Discussions, Farcaster, X, etc.)
This is a good list!
Intro:
- Seeing how much funding a project has received from different sources
- Looking up projects that meet some set of heuristics (e.g., all projects with 1-10 devs, deployments on X chains, etc.)
Medium:
- Doing a basic synthetic control or statistical test for time-series metrics
- Market share calculations
- Extracting data from some of our staging models (e.g., more information about specific commits or dependency versions)
Advanced:
- Combining OSO data with AI jobs
- Creating a retro funding distribution algorithm
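To make the "Market share calculations" idea a bit more concrete, here's a rough sketch of what the core of that tutorial could look like. It is only an illustration: the rows are made-up sample values standing in for whatever a pyoso query would return, and `market_share` is a hypothetical helper name, not something from pyoso or OSO.

```python
from collections import defaultdict

# Hypothetical sample rows: (project, month, transaction_count).
# In a real tutorial these would come from an OSO query; the values are made up.
rows = [
    ("project_a", "2025-01", 600),
    ("project_b", "2025-01", 300),
    ("project_c", "2025-01", 100),
    ("project_a", "2025-02", 500),
    ("project_b", "2025-02", 500),
]


def market_share(rows):
    """Return {month: {project: share}}, where shares within a month sum to 1."""
    # First pass: total metric value per month across all projects.
    totals = defaultdict(float)
    for _, month, value in rows:
        totals[month] += value

    # Second pass: each project's value divided by its month's total.
    shares = defaultdict(dict)
    for project, month, value in rows:
        # Skip months with a zero total to avoid dividing by zero.
        if totals[month] > 0:
            previous = shares[month].get(project, 0.0)
            shares[month][project] = previous + value / totals[month]
    return {month: dict(by_project) for month, by_project in shares.items()}


shares = market_share(rows)
for month in sorted(shares):
    for project, share in sorted(shares[month].items()):
        print(f"{month} {project}: {share:.1%}")
```

The same shape should work for any additive metric (funding, active developers, gas fees), as long as the grouping key (here, month) defines the "market" being split.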