What is inside llama-2-70b consolidated.00.pth file and how do I read it?
I tried to print out the contents of the consolidated.00.pth file using the lines of code below.
import torch

# Load the checkpoint onto the CPU and print its contents.
model = torch.load('consolidated.00.pth', map_location='cpu')
print(model)
What struck me was the printed content: the exported text is only around 317 KB, whereas the consolidated.00.pth file itself is close to 17.25 GB. Are there other contents in the file? How do I see them? I've attached the exported output as weights.txt for reference.
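(Note on the size gap: print() only shows a truncated summary of each large tensor, which is why the text dump is tiny. Below is a minimal sketch to list everything actually stored, assuming the checkpoint is a flat dict mapping parameter names to tensors, which is how Meta's consolidated.*.pth shards are laid out.)

import torch

# Load the checkpoint onto the CPU and tally the raw tensor bytes.
state_dict = torch.load('consolidated.00.pth', map_location='cpu')

total_bytes = 0
for name, tensor in state_dict.items():
    # Skip any non-tensor entries, just in case the dict is not flat.
    if not torch.is_tensor(tensor):
        continue
    size = tensor.numel() * tensor.element_size()
    total_bytes += size
    print(f'{name}: shape={tuple(tensor.shape)} dtype={tensor.dtype} '
          f'({size / 1e6:.1f} MB)')

print(f'total: {total_bytes / 1e9:.2f} GB')

Summing numel() * element_size() over all tensors should land close to the 17.25 GB on disk, since the file stores the full raw fp16/bf16 weight data, not the truncated summaries that print() shows.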
My thoughts before asking this question:
"Input + function gives an output" is the conventional method of computation, whereas AI/ML/DL fundamentally computes the function from inputs and known outputs.
Ideally, a fully (100%) trained AI should result in a function, but the community seems to be chasing consciousness, singularity, or self-aware systems: something that was attempted before with conventional programs and is being tried again with modern programs, which are a flipped version of the previous paradigm.
In other words, mathematicians tricked human eyes with theories full of symbols and meanings that only they get right; in my words, job security for mathematicians.
Good idea. Inspired by your thoughts, I think the next paradigm for DL may be training a parameter-adjustable model that fits different functions.
@XitaoLi That will result in Mandelbrot-like recursion. I mean, whenever recursion is involved, it's a good idea to step back from that design and apply some other technique.
How can I load and use the model ('consolidated.00.pth') and test a prompt? I can run on either GPU or CPU, but I have no idea how to run the whole thing.
The .pth files are basically zip files. They need to be converted with a convert.py script. See this thread from llama.cpp:
https://github.com/ggerganov/llama.cpp/issues/707
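As a quick check that the container really is a zip archive, the standard zipfile module can list what's inside. In PyTorch's zip-based serialization format, that is typically a pickled metadata record (data.pkl) plus one raw storage blob per tensor:

import zipfile

# List the entries of the .pth container to confirm its zip structure.
with zipfile.ZipFile('consolidated.00.pth') as zf:
    for info in zf.infolist():
        print(f'{info.filename}: {info.file_size} bytes')

The llama.cpp conversion script reads those raw weights and rewrites them into llama.cpp's own binary format; the exact script name and flags have changed across versions, so follow the linked thread for the invocation that matches your checkout.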
Would love to know the difference between these model formats now, such as bin, pth, pt, onnx, hf, etc., and how to convert and inspect them. In the process of using them, I found that many models used with llama.cpp cannot be used in the llama2.c project.
Same here. The LLM revolution brings a set of new ways and technologies to store and use AI models that I didn't necessarily know before. If someone has some resources to read, please share!