Add more developer documentation
What is the problem the feature request solves?
I would like to have more documentation around the project. I was planning to create Sphinx documentation as a starting point, since it's the same tool used in DataFusion (https://github.com/apache/arrow-datafusion/tree/main/docs/source). Later on, I would like to expand it along the lines of what RAPIDS provides (https://nvidia.github.io/spark-rapids/developer-overview/). Let me know your thoughts, and whether you are okay with me creating a skeleton documentation site to start off.
Describe the potential solution
No response
Additional context
No response
I am also interested in contributing to the developer documentation as I get up to speed on this project.
I don't have a strong preference for Sphinx or any other format, but I would be interested to see what others think.
I went ahead and created https://github.com/apache/datafusion-comet/pull/314 to set up a minimal site, using the same scripts and tools as DataFusion. It would be great to have some PRs to add content to this site.
Here are my thoughts on some good areas to contribute:
- What is Comet?
- How does it work?
- Benchmark Results
- Compatibility Guide
- Configuration Guide
- What is supported?
  - Data Types
  - Expressions
  - Operators
  - File Formats
- Roadmap
I would like to add a section such as "Installation" after "What is Comet?"
We should also explain how the shims work for different Spark versions.
Documentation has improved a lot over the past year, so I think we can close this issue now and open more specific issues for extra documentation that is desired.
Thanks for this, @andygrove. Comparing the time I raised the PR with now, I see that a lot of much-needed information has been added.