Knut Jägersberg
Related to your project, since you started out with chain-of-thought fine-tuning: researchers fine-tuned Galactica on Alpaca data to get Galpaca, which seems to reason better in science and technology domains than LLaMA:...
This field is moving so insanely fast that I get confused. https://github.com/databrickslabs/dolly/tree/master/data
Author description (not mine): "CAMEL datasets: Physics, Chemistry and Biology. Each dataset contains 20K problem-solution pairs, consisting of 25 topics, 25 subtopics and 32 problems for each "topic, subtopic" pair generated and...
https://github.com/DreamerGPT/DreamerGPT/tree/main/data
Also check out https://github.com/FreedomIntelligence/InstructionZoo
Would be great to have!