SageMaker EleutherAI Evaluation Harness example?
Hi, would it make sense to create a SageMaker batch processing example that demonstrates running evaluations with the EleutherAI Language Model Evaluation Harness? I haven't seen any SageMaker examples for LLM evaluation (though I could have just missed them).
I see that the Open LLM Leaderboard allows submitting models for evaluation on its GPU cluster, but many users want to evaluate models that they can't upload publicly.