Sam Stoelinga
Currently GCS and gcsfuse are used to store models and datasets. For local development, or potentially for higher performance, MinIO with GKE local SSDs should be considered as an alternative.
Configure the data storage to use an S3-compatible backend, since Substratus requires object storage anyway.
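To make the S3-compatible idea concrete, here is a minimal sketch of what such a storage setting could look like. `StorageConfig`, `parse_storage_url`, and the `endpoint_url` field are hypothetical names chosen for illustration, not part of Substratus; the assumption is that a single optional endpoint is enough to switch between a hosted S3 service and a MinIO deployment.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class StorageConfig:
    """Hypothetical settings for an S3-compatible artifact store."""

    bucket: str
    prefix: str = ""
    # None means "use the provider's default endpoint";
    # a MinIO deployment would set its own URL here.
    endpoint_url: Optional[str] = None


def parse_storage_url(url: str, endpoint_url: Optional[str] = None) -> StorageConfig:
    """Parse an s3://bucket/prefix URL into a StorageConfig."""
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URL, got {url!r}")
    if not parsed.netloc:
        raise ValueError(f"missing bucket name in {url!r}")
    return StorageConfig(
        bucket=parsed.netloc,
        prefix=parsed.path.strip("/"),
        endpoint_url=endpoint_url,
    )
```

With this shape, the same bucket URL works against either backend and only the endpoint changes, for example `parse_storage_url("s3://models/llama", endpoint_url="http://minio.local:9000")` for a local MinIO instance (the host name is a placeholder).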
Right now the container image registry is not configurable. This prevents it from being used in environments like AWS and also prevents companies that want to use their own registry...
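A configurable registry could be as simple as a prefix applied when image references are built. The sketch below assumes a hypothetical helper, `image_ref`, and placeholder registry and repository names; it is an illustration of the idea, not how Substratus constructs image names.

```python
def image_ref(repository: str, tag: str = "latest", registry: str = "") -> str:
    """Build a container image reference, optionally under a custom registry.

    An empty registry leaves the repository unqualified, so the container
    runtime falls back to its default registry.
    """
    if not repository:
        raise ValueError("repository must not be empty")
    # Tolerate a trailing slash in the configured registry value.
    name = f"{registry.rstrip('/')}/{repository}" if registry else repository
    return f"{name}:{tag}"
```

For example, `image_ref("example/model-loader", "v1", registry="registry.example.com/team")` yields a reference under the company registry, while omitting `registry` keeps the default behaviour.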
```
Handling connection for 8888
Handling connection for 8888
Port forward error: an error occurred forwarding 8888 -> 8888: error forwarding port 8888 to pod 02eaee1eb6f2c4d135aaaf49657252826b6e252961d9fe926103c2f5ea9c5a51, uid : failed...
```
https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k Base models are often fine-tuned on this dataset to produce an instruct/chat-focused model. Note that the format is quite different as well; it seems some datasets are more trained
Currently the training output of the Hugging Face trainer is an adapter layer; the full model is not saved. I think Basaran right now is only taking the original model...
I created a separate repo, https://github.com/substratusai/verba-docker, so I can use a public Docker image here in the Helm charts. However, ideally Verba publishes an official container image and tags it...
This is helpful in cases where there is variable step time and looking at the logs would quickly allow you to identify such cases.
do not merge yet, still working on it