Package hierarchy: should benchmarks, end_to_end, pedagogical_examples be under MaxText?
I was looking at some recent commits by bvandermoon and got to thinking, what is the correct hierarchy for MaxText?
What is installable, and what should be installable?
E.g., everything with:
0. python3 -m pip install MaxText
1. python3 -m pip install MaxText-benchmarks for just benchmarks
(As an aside, PEP 8 @ https://peps.python.org/pep-0008/#package-and-module-names recommends short, all-lowercase package names, so MaxText should be renamed maxtext.)
Finally, I mention end_to_end as a candidate for a complete rewrite in Python. It is not difficult to do, and it would improve maintainability, portability, and extensibility. Where should end_to_end sit in the hierarchy? python3 -m pip install MaxText-end2end? Or should we hoist everything into MaxText?
Commentary welcome 🤗
@bvandermoon could you please take a look
Thanks for bringing this up @SamuelMarks. A couple questions/comments:
- For the end_to_end piece, what would be the benefit of a Python rewrite?
- Benchmarks could be interesting to split out. It handles scheduling MaxText to run on the right hardware, and the dependencies come from a Docker image, so it is burdensome to have to install all MaxText dependencies
- Honestly, the reuse, portability, and modularity advantages. And yes, you can do this all in shell, like I did in libscript, but you'd need to rewrite the shell scripts so heavily that a Python rewrite would be roughly the same amount of work
- Also, what I said a fortnight ago in #1606#discussion_r2056721174:
I would also like to remove all shell scripts and replace them with Python scripts. So having all parameterisation known in one location—well two for the test-specific one—and statically extendable would be especially useful for documentation; not to mention for fairly trivial: automated typed-CLI extrapolation;
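To make the "one location for all parameterisation, with a typed CLI derived from it" idea concrete, here is a minimal sketch. It assumes the parameters live in a single dataclass and that argparse is enough for the CLI; the class name, field names, and defaults are all hypothetical and are not MaxText's actual configuration.

```python
import argparse
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class EndToEndParams:
    """Hypothetical parameters for an end_to_end run (illustrative names only)."""

    model_name: str = field(default="example-model", metadata={"help": "model to run"})
    steps: int = field(default=10, metadata={"help": "number of steps"})
    learning_rate: float = field(default=1e-3, metadata={"help": "optimizer step size"})


def build_parser(params_cls):
    """Derive one typed --flag per dataclass field (str/int/float fields only)."""
    parser = argparse.ArgumentParser(description=params_cls.__doc__)
    for f in fields(params_cls):
        # A field without a plain default becomes a required flag.
        required = f.default is MISSING
        parser.add_argument(
            "--" + f.name.replace("_", "-"),
            type=f.type,  # the annotation doubles as the argparse converter
            default=None if required else f.default,
            required=required,
            help=f.metadata.get("help"),
        )
    return parser


def parse_params(params_cls, argv=None):
    """Parse argv (or sys.argv when None) into an instance of params_cls."""
    namespace = build_parser(params_cls).parse_args(argv)
    return params_cls(**vars(namespace))
```

For example, parse_params(EndToEndParams, ["--steps", "5"]) would give an instance with steps set to 5 and the other fields at their defaults. The same dataclass could then feed documentation generation, since the names, types, defaults, and help strings all sit in one place.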
- Great. Did you want to split it up but keep the monorepo structure and just have multiple setup.pys?
Happy to do either/both/none of these contributions^
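For reference, the "monorepo with multiple setup.pys" option could look roughly like the sketch below. It assumes each component directory (MaxText, benchmarks, end_to_end) is an importable package with its own setup.py inside it; the distribution names, the version, and the empty dependency lists are hypothetical placeholders, not a proposal for the real metadata.

```python
# Hypothetical mapping: distribution name -> package directory in the monorepo.
DISTRIBUTIONS = {
    "maxtext": "MaxText",
    "maxtext-benchmarks": "benchmarks",
    "maxtext-end2end": "end_to_end",
}


def setup_kwargs(dist_name, version="0.0.0"):
    """Build the keyword arguments one component's setup.py would pass to setuptools.setup()."""
    package = DISTRIBUTIONS[dist_name]
    return {
        "name": dist_name,
        "version": version,  # placeholder version
        "packages": [package],
        # Assumption: setup.py sits inside the package directory itself.
        "package_dir": {package: "."},
        # Placeholder: each distribution would list only what it needs, so
        # installing benchmarks would not pull in every MaxText dependency.
        "install_requires": [],
    }


# In, e.g., benchmarks/setup.py one would then write:
#     from setuptools import setup
#     setup(**setup_kwargs("maxtext-benchmarks"))
```

Keeping the shared metadata in one helper is only one way to do it; separate, fully independent setup.py files (or pyproject.toml files) per component would work just as well.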