API to return list of supported models
Describe the solution you'd like
An API that returns the most up-to-date list of models that have been validated on Automodel, including the timestamp of the most recent validation. Ensure that all Nemotron models are in the list of models to be validated.
Proposed coverage API/list
- Once a week, we generate the list of the 2000 trending models, then run the tests on these models
- Coverage list containing models supported and ranked by their popularity
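A minimal sketch of how such a ranked coverage list could be assembled. The popularity score is a stand-in (the actual popularity/trending algorithm is still to be defined), and the function name `build_coverage_list`, the field names, and the model ids are hypothetical:

```python
from datetime import datetime, timezone

def build_coverage_list(popularity, validated, top_n=2000):
    """Rank models by popularity and keep the validated ones.

    popularity: dict mapping model id -> score (higher means more popular).
    validated: set of model ids that passed the weekly test run.
    """
    # Sort by descending score; break ties by model id for a stable order.
    ranked = sorted(popularity, key=lambda m: (-popularity[m], m))[:top_n]
    return {
        # Timestamp of this validation run, in UTC.
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "models": [
            {"id": m, "rank": rank}
            for rank, m in enumerate(ranked, start=1)
            if m in validated
        ],
    }

# Hypothetical inputs, for illustration only.
scores = {"org/model-a": 120, "org/model-b": 300, "org/model-c": 50}
passed = {"org/model-a", "org/model-b"}
coverage = build_coverage_list(scores, passed)
print([m["id"] for m in coverage["models"]])
```

The rank is computed over all trending models before filtering, so a gap in the ranks shows that a popular model is not yet covered.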
Concerns about hosting the API in the fw repo:
- creates a lot of history
- could live on a GitHub-hosted website or somewhere else instead
Action items (AI):
- define popularity/trending algorithm
- figure out the location of the supported model API/list
@akoumpa
@HuiyingLi,
Initially I was expecting this file to be several megabytes, but it looks like that's not the case; instead it's on the order of KB. Therefore, I think we can proceed with the PR that you had previously.
Regarding contents, I'm proposing the following format:
{
  "commit": "<sha>",
  "models": [
    {
      "id": "01-ai/Yi-1.5-34B",
      "tasks": {
        "sft": ["FSDP2", "TP", "PP", "CP"],
        "peft": ["FSDP2", "TP", "PP", "CP"]
      }
    },
    # ... (next model)
  ]
}
Compared to the previous format, the proposed format includes:
- The commit that was used for testing
- A tasks field with "sft" and "peft" keys
- For each task, a list of the supported parallelism configs. If a parallelism config is not supported, it should be missing from the list.
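As a rough illustration of how a consumer might query this format, here is a minimal Python sketch. It assumes "tasks" is a JSON object keyed by task name; the helper name `supports` and the inline sample (including the shortened "peft" list) are hypothetical and not part of Automodel:

```python
import json

# Hypothetical sample in the proposed format; "<sha>" is a placeholder.
SAMPLE = """
{
  "commit": "<sha>",
  "models": [
    {
      "id": "01-ai/Yi-1.5-34B",
      "tasks": {
        "sft": ["FSDP2", "TP", "PP", "CP"],
        "peft": ["FSDP2", "TP"]
      }
    }
  ]
}
"""

def supports(coverage, model_id, task, parallelism):
    """Return True if model_id lists the parallelism config under the task."""
    for model in coverage["models"]:
        if model["id"] == model_id:
            # A missing task or a missing config both mean "not supported".
            return parallelism in model.get("tasks", {}).get(task, [])
    # Unknown models are treated as unsupported.
    return False

coverage = json.loads(SAMPLE)
print(supports(coverage, "01-ai/Yi-1.5-34B", "sft", "PP"))   # True
print(supports(coverage, "01-ai/Yi-1.5-34B", "peft", "PP"))  # False
```

Because unsupported configs are simply absent, a membership check is all a consumer needs; there is no separate "unsupported" marker to parse.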
RL would also benefit from this, since when we update our automodel submodule we can query a file to figure out everything automodel supports. It could be JSON or Python. Example in Python:
import importlib.util

# Load the module directly from its file path
path = "/opt/foobar/foobar/supported_models.py"
spec = importlib.util.spec_from_file_location("supported_models", path)
supported_models = importlib.util.module_from_spec(spec)
spec.loader.exec_module(supported_models)

# Print the two model lists exposed by the module
print(supported_models.VERIFIED)
print(supported_models.SUPPORTED_BUT_NOT_VERIFIED)
JSON is good too b/c it's not tied to the automodel library, so we don't have to install it just to print this information. Ideally RL would use this in conf.py when we build our Sphinx docs to dynamically build a table.
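A minimal sketch of the docs idea, assuming the JSON has a top-level "models" list where each entry has an "id" and a "tasks" object mapping a task name to its supported parallelism configs. It only builds the reStructuredText for a `list-table`; the function name `coverage_to_rst`, the column set, and the sample data are hypothetical, and how conf.py writes or includes the result is left open:

```python
# Column order is an assumption for this sketch.
PARALLELISMS = ["FSDP2", "TP", "PP", "CP"]

def coverage_to_rst(coverage, task):
    """Render one task's support matrix as an RST list-table string."""
    lines = [
        f".. list-table:: Supported models ({task})",
        "   :header-rows: 1",
        "",
        "   * - Model",
    ]
    lines += [f"     - {p}" for p in PARALLELISMS]
    for model in coverage["models"]:
        supported = model.get("tasks", {}).get(task, [])
        lines.append(f"   * - {model['id']}")
        # One cell per parallelism config; absent from the list means "no".
        lines += [f"     - {'yes' if p in supported else 'no'}" for p in PARALLELISMS]
    return "\n".join(lines) + "\n"

# Hypothetical data; in practice this would come from json.load() on the file.
sample = {
    "commit": "<sha>",
    "models": [
        {"id": "01-ai/Yi-1.5-34B", "tasks": {"sft": ["FSDP2", "TP"]}},
    ],
}
print(coverage_to_rst(sample, "sft"))
```

Since this needs only the standard library and the JSON file, the docs build would not have to import automodel at all.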